A Multi-label Semantic Calibration Method for Few-Shot Extractive Question Answering

doi:10.3969/j.issn.0255-8297.2024.01.013

Abstract

Few-shot extractive question answering (QA) aims to extract the true answer span from a given context passage. The baseline model learns only from spans and makes little use of global semantic information, which leads to comprehension biases, especially in instances involving multiple sets of distinct repeated spans. To mitigate these issues, this paper proposes a multi-label semantic calibration method for few-shot extractive QA that draws on semantics at different levels. Specifically, the method combines the head label, which contains global semantic information, with the special character used in the baseline model to form a multi-label for semantic fusion. A semantic fusion gate controls the introduction of the global information flow, integrating global semantic information into the semantic information of the special character. A semantic selection gate then retains or replaces the newly integrated global semantic information and the original semantic information of the special character, calibrating the biased label semantics. Across 56 experimental settings on 8 few-shot extractive QA datasets, the method clearly outperforms the baseline model in F1 score in every case, demonstrating the effectiveness and advancement of the proposed method.

Key words: few-shot extractive question answering, span-extraction question answering, multi-label semantic fusion, dual gating mechanism, machine reading comprehension
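The abstract names two gates but gives no equations, so the following is only an illustrative sketch of how such a dual gating step might look, loosely modelled on GRU-style gating. Everything here is an assumption made for illustration: the function names (`dual_gate_calibrate`, `init_params`), the parameter layout, the concatenation of the two vectors, and the `tanh` candidate are hypothetical and are not taken from the paper. `h_special` stands for the representation of the baseline model's special character, and `h_head` for the head label carrying global semantics.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(weights, bias, vec):
    """Return weights @ vec + bias for a list-of-lists weight matrix."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def init_params(dim, seed=0):
    """Hypothetical parameters: one (matrix, bias) pair per gate/candidate."""
    rng = random.Random(seed)

    def matrix():
        return [[rng.gauss(0.0, 0.1) for _ in range(2 * dim)]
                for _ in range(dim)]

    return {name: (matrix(), [0.0] * dim)
            for name in ("fusion", "candidate", "selection")}


def dual_gate_calibrate(h_special, h_head, params):
    """Assumed GRU-like dual gating; not the paper's exact formulation."""
    pair = h_special + h_head  # list concatenation of the two vectors

    # Fusion gate: decides how much global (head-label) information flows in.
    fusion_gate = [sigmoid(x) for x in affine(*params["fusion"], pair)]
    gated_head = [g * h for g, h in zip(fusion_gate, h_head)]

    # Candidate: special-character semantics fused with the gated global part.
    candidate = [math.tanh(x)
                 for x in affine(*params["candidate"], h_special + gated_head)]

    # Selection gate: per dimension, keep the original or take the fused value.
    selection_gate = [sigmoid(x) for x in affine(*params["selection"], pair)]
    return [(1.0 - s) * old + s * new
            for s, old, new in zip(selection_gate, h_special, candidate)]


if __name__ == "__main__":
    dim = 4
    h_special = [0.5, -1.0, 0.25, 2.0]
    h_head = [1.0, 0.0, -0.5, 0.3]
    print(dual_gate_calibrate(h_special, h_head, init_params(dim)))
```

In this sketch the selection gate forms a per-dimension convex combination, so driving it toward zero leaves the special character's original representation unchanged, which is one simple way to read "retain or replace".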

CLC number: P751.1

LIU Qing, CHEN Yanping, ZOU Anqi, QIN Yongbin, HUANG Ruizhang. A Multi-label Semantic Calibration Method for Few Shot Extractive Question Answering[J]. Journal of Applied Sciences, 2024, 42(1): 161-173.
