Journal of Applied Sciences ›› 2024, Vol. 42 ›› Issue (1): 161-173. doi: 10.3969/j.issn.0255-8297.2024.01.013
LIU Qing1,2, CHEN Yanping1,2, ZOU Anqi1,2, QIN Yongbin1,2, HUANG Ruizhang1,2
Received: 2023-06-29
Online: 2024-01-30
Published: 2024-02-02
Corresponding author: CHEN Yanping, professor; research interests: artificial intelligence and natural language processing. E-mail: ypench@gmail.com
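Abstract: Few-shot extractive question answering aims to extract the true answer span from the context passage supplied with a question. The baseline model learns only over spans and makes little use of global semantic information, so it tends to misinterpret instances that contain several groups of different repeated spans. To address these problems, this paper draws on semantics at different levels and proposes a multi-label semantic calibration method for few-shot extractive question answering. A head label carrying global semantic information is combined with the special token of the baseline model to form a multi-label for semantic fusion, and a semantic fusion gate controls how much of the global information flow is introduced, merging the global semantics into the semantic representation of the special token. A semantic filtering gate then decides what to retain and what to replace between the newly fused global semantics and the original semantics of the special token, thereby calibrating the biased label semantics. Results from 56 groups of experiments on 8 few-shot extractive question answering datasets show that the method clearly outperforms the baseline model on the F1 metric in every case, demonstrating the effectiveness and advanced nature of the proposed method.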
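The abstract names two gates but gives no equations, so the following is only an illustrative sketch of how a fusion gate followed by a filtering gate could be wired, in the spirit of GRU-style gating. The function names (`gate`, `calibrate`), the weight shapes, and the exact update formulas are all assumptions made for illustration and should not be read as the authors' formulation.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[Vector]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gate(weights: Matrix, bias: Vector, inputs: Vector) -> Vector:
    """Return sigmoid(W @ inputs + b): one value in (0, 1) per output dimension."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, bias)
    ]


def calibrate(
    special: Vector,   # representation of the special token (hypothetical input)
    head: Vector,      # representation of the head label carrying global semantics
    w_fuse: Matrix,
    b_fuse: Vector,
    w_filter: Matrix,
    b_filter: Vector,
) -> Vector:
    # Assumed fusion gate: decides how much global information flows in.
    g = gate(w_fuse, b_fuse, special + head)
    fused = [s + gi * h for s, gi, h in zip(special, g, head)]
    # Assumed filtering gate: per dimension, keep the original semantics
    # or replace them with the newly fused ones.
    z = gate(w_filter, b_filter, special + fused)
    return [zi * f + (1.0 - zi) * s for zi, f, s in zip(z, fused, special)]


# Usage: with all-zero weights and biases both gates equal 0.5 everywhere.
zeros_w = [[0.0] * 4 for _ in range(2)]
zeros_b = [0.0, 0.0]
example = calibrate([1.0, 0.0], [0.0, 2.0], zeros_w, zeros_b, zeros_w, zeros_b)
print(example)  # [1.0, 0.5]
```

In this sketch, driving both gates toward zero returns the special-token vector unchanged, while driving them toward one replaces it with the sum of the two vectors. Learned weights would sit between these extremes.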
LIU Qing, CHEN Yanping, ZOU Anqi, QIN Yongbin, HUANG Ruizhang. A Multi-label Semantic Calibration Method for Few Shot Extractive Question Answering[J]. Journal of Applied Sciences, 2024, 42(1): 161-173.