Construction and evaluation of an evidence-based nursing knowledge question-answering agent driven by knowledge base and multi-model collaboration

SHI Manfei; QIAN Yuhang; YANG Shuqi; HUANG Zongan; WANG Jiaqing; XING Weijie; ZHOU Yingfeng; HU Yan; ZHU Zheng

doi:10.3761/j.issn.1672-9234.2025.09.002

2025, Vol. 22, Issue 9: 1036-1042

Digitization and Intelligence Development of Nursing Education

Received date: 2025-04-03

Online published: 2025-09-19

Abstract

Objective To develop an intelligent question-answering (Q&A) agent for evidence-based nursing and to systematically compare the performance of this agent, which is driven by a knowledge base and multi-model collaboration, with that of general-purpose large language models (LLMs). Methods The intelligent Q&A system was developed on the low-code Coze platform. A targeted knowledge base was constructed from the content of the Evidence-Based Nursing textbook, and Retrieval-Augmented Generation (RAG) was used for knowledge retrieval. A collaborative workflow was designed involving two base models, one expert model, and one analytical model. Final outputs were generated using decision rules based on “hard voting” and “expert model priority”. A standard in-class test from the Fudan University Evidence-Based Nursing Center was used to compare four models (the intelligent Q&A agent for evidence-based nursing, DeepSeek, Kimi, and ChatGPT-4o Mini) on all questions, on questions grouped by chapter category, and on questions grouped by difficulty level (1-5). Results In overall answer accuracy, the intelligent Q&A agent for evidence-based nursing outperformed the three mainstream models (P<0.05). Across difficulty levels, the agent outperformed the three mainstream models on level-3 questions (P<0.05). Conclusion The intelligent Q&A agent for evidence-based nursing demonstrates professionalism and reliability in this domain. Future work should focus on expanding the breadth and depth of the knowledge base, building a dynamically updated structured knowledge graph, and developing a multimodal data support system, so that the agent continues to adapt as evidence-based nursing practice evolves.
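The abstract names two decision rules, “hard voting” and “expert model priority”, without spelling out how they interact. The sketch below shows one plausible reading and is not the authors' implementation: the two base models and the expert model each cast one vote, a strict majority wins, and the expert model's answer is used when no majority exists. The function name `decide_answer`, the answer labels, and the tie-breaking behaviour are all assumptions made for illustration; the role of the analytical model is not described in the abstract and is left out.

```python
from collections import Counter
from typing import Sequence


def decide_answer(base_answers: Sequence[str], expert_answer: str) -> str:
    """Pick a final answer by hard voting with an expert-model fallback.

    Hypothetical reading of the abstract's decision rules:
    every model casts one vote; a strict majority wins; otherwise
    the expert model's answer takes priority.
    """
    ballots = [*base_answers, expert_answer]
    votes = Counter(ballots)
    top_answer, top_count = votes.most_common(1)[0]

    # Hard voting: accept the most common answer only if it holds
    # a strict majority of all ballots.
    if top_count * 2 > len(ballots):
        return top_answer

    # Expert model priority: no majority, so defer to the expert model.
    return expert_answer


if __name__ == "__main__":
    # Both base models agree and outvote the expert model.
    print(decide_answer(["A", "A"], "B"))
    # All three models disagree, so the expert model decides.
    print(decide_answer(["A", "C"], "B"))
```

Under this reading the expert model only overrides the base models when they disagree with each other; a design in which the expert model always wins would make the vote redundant, which is why the fallback interpretation is used here.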

Key words: artificial intelligence; large language models; agent; evidence-based nursing; knowledge base

Cite this article

SHI Manfei, QIAN Yuhang, YANG Shuqi, HUANG Zongan, WANG Jiaqing, XING Weijie, ZHOU Yingfeng, HU Yan, ZHU Zheng. Construction and evaluation of an evidence-based nursing knowledge question-answering agent driven by knowledge base and multi-model collaboration[J]. Chinese Journal of Nursing Education, 2025, 22(9): 1036-1042. DOI: 10.3761/j.issn.1672-9234.2025.09.002

