Chinese Journal of Tissue Engineering Research, 2024, Vol. 28, Issue (16): 2550-2554. doi: 10.12307/2024.294

• Data analysis of tissue construction

Establishment and analysis of an artificial neural network-based diagnostic model for osteoarthritis

Fan Yidong1, Qin Gang2, Su Guowei1, Xiao Shifu1, Liu Junliang1, Li Weicai1, Wu Guangtao1

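  1. 1Guangxi University of Chinese Medicine, Nanning 530000, Guangxi Zhuang Autonomous Region, China; 2Department of Bone Disease and Traumatic Orthopedics, First Affiliated Hospital of Guangxi University of Chinese Medicine, Nanning 530022, Guangxi Zhuang Autonomous Region, China
  • Received: 2023-02-28 Accepted: 2023-05-08 Publication date: 2024-06-08 Release date: 2023-07-31
  • Corresponding author: Qin Gang, MD, Chief Physician, Department of Bone Disease and Traumatic Orthopedics, First Affiliated Hospital of Guangxi University of Chinese Medicine, Nanning 530022, Guangxi Zhuang Autonomous Region, China
  • About the author: Fan Yidong, male, born in 1993 in Huixian, Henan Province, Han ethnicity, master's candidate at Guangxi University of Chinese Medicine, mainly engaged in research on the prevention and treatment of bone and joint degeneration and ischemic diseases.
  • Supported by:
    the Natural Science Foundation of Guangxi, No. 2020JJA140375 (principal investigator: Qin Gang); University-level Scientific Research Innovation Project of Guangxi University of Chinese Medicine, No. YCSY2022028 (principal investigator: Fan Yidong)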

Establishment and analysis of osteoarthritis diagnosis model based on artificial neural networks

Fan Yidong1, Qin Gang2, Su Guowei1, Xiao Shifu1, Liu Junliang1, Li Weicai1, Wu Guangtao1   

  1. 1Guangxi University of Chinese Medicine, Nanning 530000, Guangxi Zhuang Autonomous Region, China; 2Osteoarthropathy, Traumatic Orthopedics, and Femoral Head Necrosis Specialty, First Affiliated Hospital of Guangxi University of Chinese Medicine, Nanning 530022, Guangxi Zhuang Autonomous Region, China
  • Received:2023-02-28 Accepted:2023-05-08 Online:2024-06-08 Published:2023-07-31
  • Contact: Qin Gang, MD, Chief physician, Osteoarthropathy, Traumatic Orthopedics, and Femoral Head Necrosis Specialty, First Affiliated Hospital of Guangxi University of Chinese Medicine, Nanning 530022, Guangxi Zhuang Autonomous Region, China
  • About author:Fan Yidong, Master candidate, Guangxi University of Chinese Medicine, Nanning 530000, Guangxi Zhuang Autonomous Region, China
  • Supported by:
    the Natural Science Foundation of Guangxi Zhuang Autonomous Region, No. 2020JJA140375 (to QG); Scientific Research Innovation Project of Guangxi University of Chinese Medicine, No. YCSY2022028 (to FYD)

Abstract:


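Explanation of title terms:

Artificial neural network: a mathematical model that, building on the basic principles of biological neural networks and grounded in network topology, simulates how the human nervous system processes complex information.
Machine learning: a scientific discipline focused on how computers learn from data. It sits at the intersection of statistics and computer science and can be applied to clinical datasets to develop robust risk models and redefine patient categories.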


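BACKGROUND: Rapid progress in bioinformatics has provided new approaches to the diagnosis of osteoarthritis. Artificial neural networks have strong data computing and classification capabilities, and studies have shown that they perform well in diagnosing disease.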

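OBJECTIVE: To build a new diagnostic prediction model for osteoarthritis using an artificial neural network and to validate its diagnostic value in osteoarthritis with external datasets.
METHODS: Osteoarthritis-related datasets that met the requirements were retrieved and downloaded from the GEO database and divided into a Train group and a Test group. Differential analysis was performed on the gene expression matrix of the Train group to screen for differentially expressed genes, which then underwent GO and KEGG enrichment analyses. Key osteoarthritis genes were further identified from the differentially expressed genes using a Lasso regression model, a support vector machine model, and a random forest model. An artificial neural network-based diagnostic model for osteoarthritis was then built with the R package “neuralnet”, and its performance was assessed by 5-fold cross-validation; two independent datasets in the Test group were used to validate its diagnostic performance.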

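RESULTS AND CONCLUSION: Differential analysis yielded 90 differentially expressed genes related to osteoarthritis, of which 33 were down-regulated and 57 were up-regulated. GO enrichment analysis showed that, in terms of biological process, these genes were mainly involved in leukocyte-mediated immunity, myeloid leukocyte migration, and chemokine production. KEGG enrichment analysis showed that the differentially expressed genes were mainly enriched in rheumatoid arthritis, the interleukin-17 signaling pathway, and osteoclast differentiation. Screening with the three machine learning methods and taking the intersection of their results gave five key genes for diagnosing osteoarthritis: HMGB2, GADD45A, SLC19A2, TPPP3, and FOLR2. In the Train group, the artificial neural network model built on these five key genes had an accuracy of 96.36% and an area under the curve (AUC) of 0.997. Five-fold cross-validation of the neural network model gave a mean AUC > 0.9, indicating a degree of robustness. On the two independent datasets in the Test group, the AUC was 0.814 and 0.788, respectively. These results suggest that the established artificial neural network diagnostic model for osteoarthritis has some diagnostic value.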

https://orcid.org/0009-0000-3044-7311 (Fan Yidong)

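Publication focus of the Chinese Journal of Tissue Engineering Research: tissue construction; bone cells; chondrocytes; cell culture; fibroblasts; vascular endothelial cells; osteoporosis; tissue engineering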

Keywords: osteoarthritis, diagnostic model, artificial neural network, machine learning

Abstract: BACKGROUND: Rapid developments in the field of bioinformatics have provided new methods for the diagnosis of osteoarthritis. Artificial neural networks have powerful data computing and classification capabilities, and studies have shown that they perform well in disease diagnosis.
OBJECTIVE: To establish a new diagnostic prediction model for osteoarthritis based on an artificial neural network and to verify the diagnostic value of the model in osteoarthritis using external datasets.
METHODS: Eligible osteoarthritis-related datasets were retrieved and downloaded from the GEO database and divided into a Train group and a Test group. Differential analysis was performed on the gene expression matrix of the Train group to screen for differentially expressed genes. GO and KEGG enrichment analyses were performed on the differentially expressed genes. Using a Lasso regression model, a support vector machine model, and a random forest model, the key genes of osteoarthritis were further identified from the differentially expressed genes. The R package “neuralnet” was then used to construct the osteoarthritis diagnosis model based on an artificial neural network, and model performance was evaluated by five-fold cross-validation. Two independent datasets in the Test group were used to verify its diagnostic performance.
RESULTS AND CONCLUSION: A total of 90 differentially expressed genes related to osteoarthritis were obtained by differential analysis, of which 33 were down-regulated and 57 were up-regulated. GO enrichment analysis showed that the differentially expressed genes were mainly involved in biological processes including leukocyte-mediated immunity, myeloid leukocyte migration, and chemokine production. KEGG enrichment analysis showed that these genes were mainly enriched in rheumatoid arthritis, the interleukin-17 signaling pathway, and the osteoclast differentiation pathway. Five key genes for the diagnosis of osteoarthritis, HMGB2, GADD45A, SLC19A2, TPPP3, and FOLR2, were identified by taking the intersection of the genes selected by the three machine learning methods. The artificial neural network model built on the five key genes in the Train group had an accuracy of 96.36% and an area under the curve of 0.997. Five-fold cross-validation of the neural network model gave an average area under the curve greater than 0.9, indicating a degree of robustness. On the two independent datasets in the Test group, the area under the curve was 0.814 and 0.788, respectively. These findings suggest that the established artificial neural network model has some diagnostic value for osteoarthritis.
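
Three steps in the abstract can be illustrated with a short, self-contained sketch: intersecting the genes picked by several selectors, computing the area under the curve (AUC), and splitting samples into five folds. It is written in Python rather than R and is not the authors' code. The function names, the placeholder genes `GENE_X`, `GENE_Y`, and `GENE_Z`, and all labels and scores are hypothetical; no model is trained, and the scores merely stand in for held-out predictions.

```python
from itertools import product


def intersect_selected(*gene_sets):
    """Return the genes chosen by every selector, sorted for stable output."""
    common = set(gene_sets[0])
    for genes in gene_sets[1:]:
        common &= set(genes)
    return sorted(common)


def auc(labels, scores):
    """AUC as the probability that a random case outscores a random control.

    Ties between a case and a control count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous, near-equal held-out folds."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


# Hypothetical selector outputs: the five shared genes are those named in the
# abstract; GENE_X, GENE_Y, and GENE_Z are made-up extras for illustration.
key = {"HMGB2", "GADD45A", "SLC19A2", "TPPP3", "FOLR2"}
lasso_genes = key | {"GENE_X"}
svm_genes = key | {"GENE_Y"}
rf_genes = key | {"GENE_X", "GENE_Z"}
print(intersect_selected(lasso_genes, svm_genes, rf_genes))

# Toy labels (1 = osteoarthritis, 0 = control) and toy held-out scores.
labels = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.8, 0.4, 0.3, 0.6, 0.7, 0.1, 0.95, 0.5]
fold_aucs = [
    auc([labels[i] for i in fold], [scores[i] for i in fold])
    for fold in k_fold_indices(len(labels), k=5)
]
print(sum(fold_aucs) / len(fold_aucs))  # mean AUC over the five folds
```

In practice the folds would normally be shuffled and stratified so that each contains both cases and controls; the contiguous split here only works because the toy labels alternate.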

Key words: osteoarthritis, diagnostic model, artificial neural network, machine learning
