【Affiliation】
College of Computer, National University of Defense Technology, Changsha 410073, China
【Abstract】
Deep learning models have achieved state-of-the-art performance in named entity recognition (NER); this performance, however, relies heavily on substantial amounts of labeled data. In specialized domains such as medicine, finance, and the military, labeled data is very scarce, while unlabeled data is readily available. Previous studies have used unlabeled data to enrich word representations, but they neglect a large amount of entity information in unlabeled data that may benefit the NER task. In this study, we propose a semi-supervised method for NER tasks, which learns to create high-quality labeled data by applying a pre-trained module to filter out erroneous pseudo labels. Pseudo labels are automatically generated for unlabeled data and used as if they were true labels. Our semi-supervised framework includes three steps: constructing an optimal single neural model for a specific NER task, learning a module that evaluates pseudo labels, and creating new labeled data and improving the NER model iteratively. Experimental results on two English NER tasks and one Chinese clinical NER task demonstrate that our method further improves the performance of the best single neural model. Even when we use only pre-trained static word embeddings and do not rely on any external knowledge, our method achieves performance comparable to that of state-of-the-art models on the CoNLL-2003 and OntoNotes 5.0 English NER tasks.
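The three steps of the framework can be sketched as a generic self-training loop with a pseudo-label filter. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `self_train`, `train`, `predict`, `score`, `threshold`, and `rounds` are hypothetical, and the toy lexicon tagger and toy evaluator below merely stand in for the neural NER model and the pre-trained module that evaluates pseudo labels.

```python
from typing import Callable, Dict, List, Tuple

Sentence = List[str]
Tags = List[str]
Example = Tuple[Sentence, Tags]


def self_train(
    labeled: List[Example],
    unlabeled: List[Sentence],
    train: Callable[[List[Example]], Dict[str, str]],
    predict: Callable[[Dict[str, str], Sentence], Tags],
    score: Callable[[Sentence, Tags], float],
    threshold: float = 0.9,
    rounds: int = 3,
):
    """Grow the labeled set with pseudo labels that pass a quality filter."""
    data = list(labeled)
    pool = list(unlabeled)
    model = train(data)  # step 1: fit the base model on the gold data
    for _ in range(rounds):
        kept, rest = [], []
        for sent in pool:
            tags = predict(model, sent)  # generate pseudo labels
            # step 2: the evaluator decides whether to trust them
            if score(sent, tags) >= threshold:
                kept.append((sent, tags))
            else:
                rest.append(sent)
        if not kept:  # nothing new was accepted, so stop early
            break
        data.extend(kept)  # step 3: add accepted pseudo-labeled data
        pool = rest
        model = train(data)  # retrain on the enlarged set
    return model, data


# Toy stand-ins (assumptions for illustration only).
def toy_train(data: List[Example]) -> Dict[str, str]:
    # "Model" = a token-to-tag lexicon built from the training data.
    return {tok: tag for sent, tags in data for tok, tag in zip(sent, tags)}


def toy_predict(model: Dict[str, str], sent: Sentence) -> Tags:
    # Unknown tokens are tagged as outside any entity.
    return [model.get(tok, "O") for tok in sent]


def toy_score(sent: Sentence, tags: Tags) -> float:
    # Hypothetical evaluator: accept only sentences with at least one entity.
    return 1.0 if any(t != "O" for t in tags) else 0.0


gold = [(["Alice", "visited", "Paris"], ["PER", "O", "LOC"])]
raw = [["Paris", "is", "large"], ["Bob", "slept"]]
final_model, final_data = self_train(gold, raw, toy_train, toy_predict, toy_score)
```

In this toy run the sentence containing a known entity is accepted into the training set, while the sentence with no recognized entity is rejected by the evaluator and stays in the unlabeled pool.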