20221010 Paper Report - RocketQA: An Optimized Training Approach to Dense Passage Retrieval for Open-Domain Question Answering

Report title: RocketQA: An Optimized Training Approach to Dense Passage Retrieval for Open-Domain Question Answering

Venue: NAACL

Authors: Yingqi Qu, Yuchen Ding, Jing Liu, Kai Liu, Ruiyang Ren, Wayne Xin Zhao, Daxiang Dong, Hua Wu and Haifeng Wang

Affiliations: Baidu Inc.; Gaoling School of Artificial Intelligence, Renmin University of China

Presenter: 李佳丽 (Li Jiali)

Date: October 10, 2022

Location: Room 624, Boxue Building, North Campus, Guizhou University

Abstract: In open-domain question answering, dense passage retrieval has become a new paradigm for retrieving the relevant passages from which answers are found. Typically, a dual-encoder architecture is adopted to learn dense representations of questions and passages for semantic matching. However, it is difficult to train a dual-encoder effectively because of three challenges: the discrepancy between training and inference, the existence of unlabeled positives, and limited training data. To address these challenges, we propose an optimized training approach, called RocketQA, to improve dense passage retrieval. We make three major technical contributions in RocketQA, namely cross-batch negatives, denoised hard negatives and data augmentation. The experimental results show that RocketQA significantly outperforms previous state-of-the-art models on both MSMARCO and Natural Questions. We also conduct extensive experiments to examine the effectiveness of the three strategies in RocketQA. In addition, we demonstrate that the performance of end-to-end QA can be improved with our RocketQA retriever.
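To make the first contribution concrete, the following is a minimal sketch of how a dual-encoder scores question-passage pairs and how cross-batch negatives enlarge the negative pool compared with ordinary in-batch negatives. It is an illustration under stated assumptions, not the authors' implementation: the two "GPUs", the two-dimensional embeddings, and the helper names `dot` and `contrastive_loss` are all hypothetical, and the real system uses trained encoders rather than hand-written vectors.

```python
import math


def dot(u, v):
    """Inner-product similarity between two embedding vectors."""
    return sum(a * b for a, b in zip(u, v))


def contrastive_loss(q_vecs, p_vecs, positive_idx):
    """Mean negative log-likelihood of each question's positive passage.

    Every passage in p_vecs other than the positive acts as a negative.
    """
    total = 0.0
    for q, pos in zip(q_vecs, positive_idx):
        scores = [dot(q, p) for p in p_vecs]
        m = max(scores)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        total += log_z - scores[pos]
    return total / len(q_vecs)


# Toy embeddings standing in for encoder outputs (assumed values).
# Outer list: two hypothetical GPUs; each holds two questions and
# the two passages that are the positives for those questions.
gpu_q = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0], [-1.0, 0.0]]]
gpu_p = [[[2.0, 0.0], [0.0, 2.0]], [[1.5, 1.5], [-2.0, 0.0]]]

# In-batch negatives: each GPU only sees its own passages.
in_batch = [contrastive_loss(q, p, [0, 1]) for q, p in zip(gpu_q, gpu_p)]

# Cross-batch negatives: passages are gathered from all GPUs, so each
# question is contrasted against a larger candidate set.
all_p = [p for passages in gpu_p for p in passages]
cross_batch = [
    contrastive_loss(q, all_p, [gpu * 2 + j for j in range(len(q))])
    for gpu, q in enumerate(gpu_q)
]

print("in-batch loss per GPU:   ", in_batch)
print("cross-batch loss per GPU:", cross_batch)
```

Because the softmax denominator now covers more passages, the cross-batch loss for each GPU is larger than its in-batch counterpart on the same embeddings, which brings training closer to inference, where a question must be matched against a very large passage collection.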

