In Chinese event trigger extraction, word-based models suffer from word-segmentation errors, while character-based models struggle to capture the structural and contextual semantic information of triggers. To address this problem, a span-based regression method for trigger extraction is proposed. Observing that a character subsequence (span) of a certain length in a sentence may constitute an event trigger, the method obtains sentence feature representations with the pre-trained language model BERT (Bidirectional Encoder Representations from Transformers) and generates candidate trigger spans from these representations. A classifier then filters out low-confidence candidate spans, and regression adjusts the boundaries of the remaining spans to locate triggers precisely. Finally, the adjusted candidate spans are classified to produce the extraction results. Experimental results on the ACE 2005 Chinese dataset show that the method achieves an F1 score of 73.20% on the trigger identification task and 71.60% on the trigger classification task, outperforming existing models. A further comparison with a span-based method without regression verifies that regression adjustment of span boundaries improves the accuracy of event trigger detection.
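The pipeline above (enumerate candidate spans, filter by classifier confidence, then shift boundaries by regressed offsets) can be sketched as follows. This is a minimal illustration of the span-generation and boundary-adjustment idea only, not the paper's implementation: the encoder, classifier, and regressor are abstracted away, and names such as `enumerate_spans`, `max_len`, and `filter_and_adjust` are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Span:
    start: int  # inclusive character index
    end: int    # inclusive character index

def enumerate_spans(n_chars, max_len=4):
    """Enumerate all candidate spans of up to max_len characters.

    Chinese event triggers are typically short (often 1-4 characters),
    so bounding the span length keeps the candidate set roughly linear
    in sentence length instead of quadratic.
    """
    return [Span(i, j)
            for i in range(n_chars)
            for j in range(i, min(i + max_len, n_chars))]

def filter_and_adjust(spans, confidences, offsets, n_chars, threshold=0.5):
    """Keep high-confidence spans and shift their boundaries.

    `confidences` would come from a classifier over span representations;
    `offsets` are per-span (d_start, d_end) values a regression head would
    predict. Adjusted boundaries are clipped to the sentence and kept
    consistent (end >= start).
    """
    kept = []
    for span, conf, (d_start, d_end) in zip(spans, confidences, offsets):
        if conf < threshold:
            continue  # discard low-confidence candidates
        start = min(max(span.start + d_start, 0), n_chars - 1)
        end = min(max(span.end + d_end, start), n_chars - 1)
        kept.append(Span(start, end))
    return kept
```

For a 5-character sentence with `max_len=4`, `enumerate_spans` yields 14 candidates; a candidate `Span(1, 2)` with high confidence and regressed offsets `(-1, 1)` would be adjusted to `Span(0, 3)`, while low-confidence candidates are dropped before adjustment.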