20240104 Paper Presentation: Qilin-Med: Multi-stage Knowledge Injection Advanced Medical Large Language Model

Title: Qilin-Med: Multi-stage Knowledge Injection Advanced Medical Large Language Model

Authors: Qichen Ye¹†, Junling Liu²†*, Dading Chong¹†, Peilin Zhou³†, Yining Hua⁴, Andrew Liu¹

Affiliations: ¹Peking University, ²Alibaba Group, ³Hong Kong University of Science and Technology (Guangzhou), ⁴Harvard T.H. Chan School of Public Health

Presenter: 张芊

Time: January 4, 2024

Location: Conference Room 621, Boxue Building

Abstract: Integrating large language models (LLMs) into healthcare holds great potential but faces challenges. Directly pre-training LLMs for domains like medicine is resource-intensive and sometimes infeasible. Relying solely on Supervised Fine-tuning (SFT) can result in overconfident predictions and may fail to capture domain-specific knowledge. To address these challenges, we present a multi-stage training method combining Domain-specific Continued Pre-training (DCPT), SFT, and Direct Preference Optimization (DPO). A notable contribution of our study is the introduction of a 3GB Chinese Medicine (ChiMed) dataset, encompassing medical question answering, plain text, knowledge graphs, and dialogues, segmented into the three training stages. The medical LLM trained with our pipeline, Qilin-Med, shows substantial performance gains. In the DCPT and SFT stages, it achieves 38.4% and 40.0% accuracy on CMExam, surpassing Baichuan-7B's 33.5%. In the DPO stage, on the Huatuo-26M test set, it scores 16.66 in BLEU-1 and 27.44 in ROUGE-1, outperforming the SFT model's 12.69 and 24.21. This highlights the strength of our training approach in refining LLMs for medical applications.
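For context, the DPO stage named in the abstract is, in its standard formulation (Rafailov et al., 2023), a preference-learning objective over pairs of preferred and dispreferred responses; whether Qilin-Med modifies this objective is not stated in the abstract. In LaTeX:

\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}}) =
  -\mathbb{E}_{(x,\, y_w,\, y_l) \sim \mathcal{D}}
  \left[ \log \sigma\!\left(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
    - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
  \right) \right]

Here \pi_\theta is the model being trained, \pi_{\mathrm{ref}} is the frozen reference model (typically the SFT checkpoint), (y_w, y_l) are the preferred and dispreferred responses to prompt x, and \beta controls how far the policy may drift from the reference.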
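The BLEU-1 and ROUGE-1 figures cited above are unigram-overlap scores, conventionally reported scaled by 100. Below is a minimal Python sketch of the two metrics; the whitespace tokenization, the recall variant of ROUGE-1, and the example sentences are assumptions for illustration, since the paper's exact evaluation toolkit is not specified in the abstract.

from collections import Counter
import math

def bleu1(candidate: list[str], reference: list[str]) -> float:
    """BLEU-1: clipped unigram precision with a brevity penalty."""
    if not candidate:
        return 0.0
    cand_counts = Counter(candidate)
    ref_counts = Counter(reference)
    # Each candidate token is credited at most as often as it occurs
    # in the reference (clipping).
    overlap = sum(min(c, ref_counts[tok]) for tok, c in cand_counts.items())
    precision = overlap / len(candidate)
    # Brevity penalty discourages trivially short candidates.
    if len(candidate) > len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * precision

def rouge1_recall(candidate: list[str], reference: list[str]) -> float:
    """ROUGE-1 (recall variant): fraction of reference unigrams recovered."""
    if not reference:
        return 0.0
    cand_counts = Counter(candidate)
    ref_counts = Counter(reference)
    overlap = sum(min(c, cand_counts[tok]) for tok, c in ref_counts.items())
    return overlap / len(reference)

# Hypothetical example; scores are scaled by 100 as in the abstract.
cand = "患者 应 多 饮水".split()
ref = "患者 应当 多 饮水 休息".split()
print(round(bleu1(cand, ref) * 100, 2))         # ~58.41
print(round(rouge1_recall(cand, ref) * 100, 2)) # 60.0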

