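Construction and evaluation of an evidence-based nursing knowledge question-answering agent driven by knowledge base and multi-model collaboration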

doi:10.3761/j.issn.1672-9234.2025.09.002

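Abstract

Abstract (Chinese version, translated):

Objective To construct an evidence-based nursing knowledge question-answering agent driven by a knowledge base and multi-model collaboration, and to systematically compare its performance with that of general-purpose large models. Methods The agent was built in the following main steps: an intelligent question-answering system was developed on the Coze low-code platform; a targeted knowledge base was built from the textbook 《循证护理学》 (Evidence-Based Nursing); retrieval-augmented generation was used for knowledge retrieval; and a collaborative workflow was designed with 2 base models, 1 expert model, and 1 analysis model, with the final output produced by the decision rules of “hard voting” and “expert model priority”. The standard in-class quiz of the Fudan University Center for Evidence-Based Nursing was used to compare the answering performance of four models [the evidence-based nursing knowledge question-answering agent, DeepSeek, Kimi, and ChatGPT-4o Mini] on all questions, on questions from different chapter categories, and on questions of different difficulty levels (levels 1 to 5). Results In overall answer accuracy, the agent outperformed the three mainstream models (P<0.05). Across difficulty levels, the agent scored higher than the three mainstream models on level-3 questions (P<0.05). Conclusion The constructed agent shows professionalism and reliability in the evidence-based nursing domain. Future work should expand the breadth and depth of the knowledge base, build a dynamically updated structured knowledge graph, and develop a multimodal data support system so that the agent continues to adapt to the development of evidence-based nursing practice.

Keywords: artificial intelligence, large language model, agent, evidence-based nursing, knowledge base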

Abstract:

Objective To develop an intelligent question-answering (Q&A) agent for evidence-based nursing, driven by a knowledge base and multi-model collaboration, and to systematically compare its performance with that of general large language models (LLMs). Methods An intelligent Q&A system was developed on the low-code Coze platform. A targeted knowledge base was constructed from the Evidence-Based Nursing textbook, and retrieval-augmented generation (RAG) was used for knowledge retrieval. A collaborative workflow was designed with two base models, one expert model, and one analytical model; final outputs were generated using the decision rules of “hard voting” and “expert model priority”. A standard in-class test from the Fudan University Evidence-Based Nursing Center was used to compare the performance of four models (the evidence-based nursing Q&A agent, DeepSeek, Kimi, and ChatGPT-4o Mini) on all questions, on questions from different chapter categories, and on questions of different difficulty levels (1-5). Results In overall answer accuracy, the evidence-based nursing Q&A agent outperformed the three mainstream models (P<0.05). Across difficulty levels, the agent scored higher than the three mainstream models on level-3 questions (P<0.05). Conclusion The constructed agent demonstrates professionalism and reliability in the evidence-based nursing domain. Future work should expand the breadth and depth of the knowledge base, build a dynamically updated structured knowledge graph, and develop a multimodal data support system so that the agent continues to adapt to the development of evidence-based nursing practice.

Key words: Artificial intelligence, Large language models, Agent, Evidence-based nursing, Knowledge base
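
The “hard voting” and “expert model priority” decision rules named in the Methods can be illustrated with a minimal Python sketch. The abstract does not say how the two rules interact, so the combination below is an assumption: each base model and the expert model casts one vote, an option wins only with a strict majority, and otherwise the expert model's answer is used. The function name `decide` and its parameters are hypothetical, and the analysis model's role is not modelled because the abstract does not describe it.

```python
from collections import Counter
from typing import Sequence


def decide(base_answers: Sequence[str], expert_answer: str) -> str:
    """Combine model answers by hard voting with an expert fallback.

    Assumed rule (not specified in the abstract): every model casts one
    vote; an option wins only with a strict majority of all votes, and
    otherwise the expert model's answer is returned.
    """
    # Normalise option labels so that "a " and "A" count as the same vote.
    expert = expert_answer.strip().upper()
    votes = [answer.strip().upper() for answer in base_answers]
    votes.append(expert)

    top_answer, top_count = Counter(votes).most_common(1)[0]
    if top_count > len(votes) / 2:
        return top_answer  # hard voting: strict majority of all votes
    return expert  # expert model priority when no majority exists


if __name__ == "__main__":
    print(decide(["A", "A"], "B"))  # both base models agree
    print(decide(["A", "B"], "C"))  # three-way split, expert decides
```

With two base models and one expert, this assumed rule means the expert's answer is overridden only when both base models agree on a different option. A workflow that always defers to the expert would be a different, equally plausible reading of “expert model priority”.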

SHI Manfei, QIAN Yuhang, YANG Shuqi, HUANG Zongan, WANG Jiaqing, XING Weijie, ZHOU Yingfeng, HU Yan, ZHU Zheng. Construction and evaluation of an evidence-based nursing knowledge question-answering agent driven by knowledge base and multi-model collaboration[J]. Chinese Journal of Nursing Education (中华护理教育), 2025, 22(9): 1036-1042.

Figures and tables (6): Figure 1, Figure 2, Figure 3, Figure 4, Table 1, Table 2
