Paper title
PublishInCovid19 at WNUT 2020 Shared Task-1: Entity Recognition in Wet Lab Protocols using Structured Learning Ensemble and Contextualised Embeddings
Paper authors
Paper abstract
In this paper, we describe our approach to Entity Recognition over Wet Lab Protocols, a shared task at the EMNLP WNUT-2020 Workshop. Our approach has two phases. In the first phase, we experiment with various contextualised word embeddings (such as Flair and BERT-based embeddings) combined with a BiLSTM-CRF model to arrive at the best-performing architecture. In the second phase, we create an ensemble of eleven BiLSTM-CRF models, each trained on a different random train-validation split of the complete dataset. Here, we also experiment with different output merging schemes, including Majority Voting and Structured Learning Ensembling (SLE). Our final submission achieved micro F1-scores of 0.8175 and 0.7757 for partial and exact matching of entity spans, respectively. We ranked first on partial match and second on exact match.
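The majority-voting merge mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the authors implemented it, so this is not their code and it is not SLE: it assumes each model emits one BIO tag per token, and the function name `majority_vote`, the tie-breaking rule, and the example tags are illustrative assumptions.

```python
from collections import Counter
from typing import List


def majority_vote(predictions: List[List[str]]) -> List[str]:
    """Merge per-token tag sequences from several models by majority vote.

    predictions[m][t] is the tag that model m assigns to token t.
    Hypothetical helper: ties go to the tag seen first, i.e. the
    earliest model in the list that proposed one of the tied tags.
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    length = len(predictions[0])
    if any(len(seq) != length for seq in predictions):
        raise ValueError("all tag sequences must have the same length")

    merged = []
    for token_tags in zip(*predictions):
        # most_common(1) returns the tag with the highest count for this token.
        merged.append(Counter(token_tags).most_common(1)[0][0])
    return merged


# Illustrative tags for a three-token sentence from three models
# (the ensemble in the abstract uses eleven).
example = [
    ["B-Action", "B-Reagent", "O"],
    ["B-Action", "O", "O"],
    ["O", "B-Reagent", "O"],
]
merged_tags = majority_vote(example)
```

Voting on each token independently can yield tag sequences that are not valid BIO spans (for example an `I-` tag with no preceding `B-`), which is one reason a structured merging scheme may be preferred over plain voting.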