Chinese Journal of Tissue Engineering Research ›› 2026, Vol. 30 ›› Issue (36): 9604-9612. doi: 10.12307/2026.911

• Clinical practice in tissue construction •

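Construction and evaluation of a machine learning-based prediction model for functional language communication ability in patients with post-stroke aphasia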

黄韵诗,柴林松,倪静蕾,左  双,林冰冰,黄  佳   

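  1. College of Rehabilitation Medicine, Fujian University of Traditional Chinese Medicine, Fuzhou 350122, Fujian Province, China
  • Received: 2025-10-29 Publication date: 2026-12-28 Release date: 2026-05-26
  • Corresponding author: Huang Jia, Professor, Doctoral supervisor, College of Rehabilitation Medicine, Fujian University of Traditional Chinese Medicine, Fuzhou 350122, Fujian Province, China
  • About the author: Huang Yunshi, female, born in 2000 in Dongguan, Guangdong Province, Han ethnicity, master's candidate at Fujian University of Traditional Chinese Medicine, whose research focuses on the rehabilitation of nervous system diseases.
  • Funding:
    National Natural Science Foundation of China (No. 82074512), principal investigator: Huang Jia; Fujian Province Science and Technology Planning Project, Social Development Guidance (Key) Project (No. 2023Y0035), principal investigator: Huang Jia; Fujian Provincial Natural Science Foundation (Outstanding Youth Project) (No. 2024J010033), principal investigator: Huang Jia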

Development and evaluation of a prediction model for functional language communication outcomes in post-stroke aphasia patients

Huang Yunshi, Chai Linsong, Ni Jinglei, Zuo Shuang, Lin Bingbing, Huang Jia   

  1. College of Rehabilitation Medicine, Fujian University of Traditional Chinese Medicine, Fuzhou 350122, Fujian Province, China
  • Received: 2025-10-29 Online: 2026-12-28 Published: 2026-05-26
  • Contact: Huang Jia, Professor, Doctoral supervisor, College of Rehabilitation Medicine, Fujian University of Traditional Chinese Medicine, Fuzhou 350122, Fujian Province, China
  • About author: Huang Yunshi, MS candidate, College of Rehabilitation Medicine, Fujian University of Traditional Chinese Medicine, Fuzhou 350122, Fujian Province, China
  • Supported by:
    National Natural Science Foundation of China, No. 82074512 (to HJ); Fujian Province Science and Technology Planning Project - Social Development Guidance (Key) Project, No. 2023Y0035 (to HJ); Fujian Provincial Natural Science Foundation (Outstanding Youth Project), No. 2024J010033 (to HJ)

Abstract:



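Explanation of terms in the title:
Post-stroke aphasia: an acquired language disorder caused by damage to the language areas of the brain following a stroke (cerebrovascular accident). It presents as partial or complete loss of spoken expression, auditory comprehension, reading, or writing. The degree of language recovery varies markedly between individuals, so accurate prognostic prediction is essential for designing individualized rehabilitation plans.
Machine learning: a core branch of artificial intelligence in which computer systems use algorithms to learn underlying regularities and complex patterns automatically from historical data and build predictive models (such as support vector machines, random forests, and neural networks). In medicine, it is particularly suited to handling multidimensional clinical features to predict disease outcomes or risk.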

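Background: Most patients with post-stroke aphasia still have deficits in basic communication 1 year after onset, so accurate prognostic tools are urgently needed to guide clinical rehabilitation decisions.
Objective: To build a machine learning-based model that predicts language function prognosis at discharge in patients with post-stroke aphasia and to improve prediction accuracy.
Methods: Data came from 245 patients with post-stroke aphasia admitted to the Rehabilitation Hospital Affiliated to Fujian University of Traditional Chinese Medicine between July 1, 2022 and July 1, 2025. The outcome indicator was a change in aphasia quotient of ≥ 6 points. The cohort was randomly divided in a 7:3 ratio into a training set (n=171) and a test set (n=74). Predictors were selected by recursive feature elimination, and models were built with six machine learning algorithms (logistic regression, random forest, decision tree, support vector machine, Gaussian naïve Bayes, and extreme gradient boosting classifier). Internal validation used the bootstrap method, and model performance was evaluated with receiver operating characteristic curves, calibration curves, and Shapley additive explanations (SHAP) analysis.
Results and conclusion: The language function improvement rate among the 245 patients with post-stroke aphasia was 69.80%. Ten factors were selected as predictors, including age, female sex, education level, anomic aphasia, non-fluent aphasia, global aphasia, and the baseline total score of the Chinese Functional Communication Profile. The Gaussian naïve Bayes model performed best in the test set, with an area under the curve of 0.71 and an F1 score of 0.83. The calibration curve showed good agreement between predicted probabilities and observed outcome rates (Brier score=0.19). SHAP analysis identified anomic aphasia (0.24), advanced age (0.17), female sex (0.08), the baseline total score of the Chinese Functional Communication Profile (0.08), and non-fluent aphasia (0.07) as key risk factors. These results indicate that the machine learning-based Gaussian naïve Bayes model can effectively identify the risk of poor language function prognosis at discharge in patients with post-stroke aphasia and can support decisions on individualized rehabilitation interventions.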
https://orcid.org/0009-0008-9498-0834 (Huang Yunshi)
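The abstract names Gaussian naïve Bayes as the best-performing classifier but gives no implementation details. As a rough illustration of how such a classifier turns features into a predicted probability, here is a minimal from-scratch sketch using only the Python standard library. The two features, the toy values, the labels, and the function names `fit_gaussian_nb` and `predict_proba` are all assumptions invented for this example; they are not the study's data, its ten predictors, or its code.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y, var_floor=1e-9):
    """Estimate the class prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Small floor keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def predict_proba(model, row):
    """Posterior P(class | row), treating features as independent Gaussians."""
    log_post = {}
    for label, (prior, means, variances) in model.items():
        lp = math.log(prior)
        for v, m, s2 in zip(row, means, variances):
            lp += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        log_post[label] = lp
    # Normalise in log space for numerical stability.
    top = max(log_post.values())
    unnorm = {k: math.exp(v - top) for k, v in log_post.items()}
    total = sum(unnorm.values())
    return {k: v / total for k, v in unnorm.items()}


# Toy data invented for illustration: [age in years, hypothetical baseline score].
X = [[45, 120], [50, 110], [55, 130], [72, 60], [78, 50], [80, 70]]
y = [1, 1, 1, 0, 0, 0]  # 1 = improved, 0 = not improved (hypothetical labels)

model = fit_gaussian_nb(X, y)
p_young = predict_proba(model, [48, 125])
p_old = predict_proba(model, [79, 55])
```

In practice a maintained library implementation would normally be preferred; the sketch only makes the independence assumption and the prior-times-likelihood calculation explicit.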



Keywords: post-stroke aphasia, machine learning, prognostic prediction, Gaussian naïve Bayes, Chinese Functional Communication Profile

Abstract: BACKGROUND: Most patients with post-stroke aphasia still have basic communication deficits 1 year after onset, highlighting an urgent need for accurate prognostic prediction tools to guide clinical rehabilitation decisions.
OBJECTIVE: To construct a machine learning-based model for predicting language function prognosis at discharge in patients with post-stroke aphasia, aiming to improve prediction accuracy.
METHODS: Clinical data were collected from 245 patients with post-stroke aphasia admitted to the Rehabilitation Hospital Affiliated to Fujian University of Traditional Chinese Medicine from July 1, 2022 to July 1, 2025, with a change in aphasia quotient of ≥ 6 points as the outcome indicator. The study cohort was randomly divided into training (n=171) and test (n=74) sets in a 7:3 ratio. Predictive factors were screened using recursive feature elimination. Six machine learning algorithms (logistic regression, random forest, decision tree, support vector machine, Gaussian naïve Bayes, and extreme gradient boosting classifier) were used to construct models. Internal validation was performed using the bootstrap method. Model performance was evaluated using receiver operating characteristic curves, calibration curves, and Shapley additive explanations analysis.
RESULTS AND CONCLUSION: The language function improvement rate among the 245 patients with post-stroke aphasia was 69.80%. Ten factors, including age, female sex, education level, anomic aphasia, non-fluent aphasia, global aphasia, and baseline total score of the Chinese Functional Communication Profile, were selected as predictive factors. The Gaussian naïve Bayes model performed best in the test set, with an area under the curve of 0.71 and an F1 score of 0.83. The calibration curve showed good agreement between predicted probabilities and observed outcome rates (Brier score=0.19). Shapley additive explanations analysis identified anomic aphasia (0.24), advanced age (0.17), female sex (0.08), baseline total score of the Chinese Functional Communication Profile (0.08), and non-fluent aphasia (0.07) as key risk factors. These findings indicate that the machine learning-based Gaussian naïve Bayes prediction model can effectively identify the risk of poor language function prognosis at discharge in patients with post-stroke aphasia, providing decision support for individualized rehabilitation interventions.
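The abstract reports an area under the curve, an F1 score, and a Brier score, and mentions bootstrap internal validation, without stating how each was computed. The sketch below shows one common way to calculate these quantities with the Python standard library. The labels and predicted probabilities are invented toy values, the 0.5 classification threshold is an assumption, and the percentile interval in `bootstrap_auc` is only one of several bootstrap schemes; the study may have used a different one.

```python
import random


def auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1(y_true, y_pred):
    """F1 score for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return 2 * tp / (2 * tp + fp + fn)


def brier(y_true, probs):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, probs)) / len(y_true)


def bootstrap_auc(y_true, scores, n_boot=500, seed=0):
    """Percentile 95% interval for AUC, resampling cases with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # AUC needs both classes in the resample
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]


# Toy values invented for illustration (1 = improved, 0 = not improved).
y_true = [1, 1, 1, 1, 0, 1, 0, 0]
probs = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
y_pred = [1 if p >= 0.5 else 0 for p in probs]  # assumed 0.5 threshold

auc_val = auc(y_true, probs)
f1_val = f1(y_true, y_pred)
brier_val = brier(y_true, probs)
ci_low, ci_high = bootstrap_auc(y_true, probs)
```

A Brier score of 0 means perfect probabilistic predictions, and a model that always predicts 0.5 scores 0.25, which gives some context for the reported value of 0.19.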


Key words: post-stroke aphasia, machine learning, prognostic prediction, Gaussian naïve Bayes, Chinese Functional Communication Profile

CLC number: