Chinese Journal of Tissue Engineering Research ›› 2026, Vol. 30 ›› Issue (5): 1096-1105. doi: 10.12307/2026.021

• Bone tissue construction

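Screening and validation of osteoarthritis biomarkers by weighted gene co-expression network analysis combined with machine learning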

Zhang Qian, Huang Dongfeng

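  1. Department of Rehabilitation Medicine, the Seventh Affiliated Hospital, Sun Yat-sen University, Shenzhen 518107, Guangdong Province, China
  • Received: 2024-11-28 Accepted: 2025-01-09 Online: 2026-02-18 Published: 2025-06-20
  • Corresponding author: Huang Dongfeng, Master, Chief physician, Professor, Doctoral supervisor, Department of Rehabilitation Medicine, the Seventh Affiliated Hospital, Sun Yat-sen University, Shenzhen 518107, Guangdong Province, China
  • About the author: Zhang Qian, female, born in 1989 in Baotou, Inner Mongolia Autonomous Region, Han ethnicity, doctoral candidate at the Seventh Affiliated Hospital, Sun Yat-sen University. Her work focuses on clinical and basic research into rehabilitation for bone and joint diseases.
  • Funding:
    Guangdong Medical Research Fund, No. A2024023 (principal investigator: Zhang Qian)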

Weighted gene co-expression network analysis combined with machine learning to screen and validate biomarkers for osteoarthritis

Zhang Qian, Huang Dongfeng   

  1. Department of Rehabilitation Medicine, the Seventh Affiliated Hospital, Sun Yat-sen University, Shenzhen 518107, Guangdong Province, China 
  • Received: 2024-11-28 Accepted: 2025-01-09 Online: 2026-02-18 Published: 2025-06-20
  • Contact: Huang Dongfeng, Master, Chief physician, Professor, Doctoral supervisor, Department of Rehabilitation Medicine, the Seventh Affiliated Hospital, Sun Yat-sen University, Shenzhen 518107, Guangdong Province, China
  • About author: Zhang Qian, MD candidate, Department of Rehabilitation Medicine, the Seventh Affiliated Hospital, Sun Yat-sen University, Shenzhen 518107, Guangdong Province, China
  • Supported by:
    Guangdong Medical Research Fund, No. A2024023 (to ZQ) 

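Abstract:

Definitions of terms in the title:
Osteoarthritis: a common degenerative disease of bones and joints whose main pathological features are progressive degeneration and destruction of articular cartilage accompanied by synovial inflammation. There is currently no curative treatment.
Lipid metabolism: an important and complex set of biochemical reactions in the body, covering the digestion, absorption, synthesis, and breakdown of fats. Through enzyme catalysis, fats are processed into the substances the body needs to maintain normal physiological function, so lipid metabolism is closely tied to the onset and progression of disease.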

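BACKGROUND: Abnormal lipid metabolism affects chondrocyte metabolism and plays an important role in the onset and progression of osteoarthritis, but the mechanism remains unclear.
OBJECTIVE: To identify lipid metabolism characteristic genes in osteoarthritis chondrocytes using weighted gene co-expression network analysis combined with machine learning algorithms, and to validate them preliminarily.
METHODS: Differentially co-expressed genes were obtained using weighted gene co-expression network analysis and linear models for microarray data, and machine learning methods were then applied to screen the final lipid metabolism-related characteristic genes. Protein-protein interaction network analysis was used to examine the interaction relationships among the differentially co-expressed genes. Gene Ontology and Kyoto Encyclopedia of Genes and Genomes enrichment analyses were used to explore the signaling pathways in which these genes participate. Immune correlation analysis was used to characterize the relationship between the characteristic genes and immune cell infiltration patterns. In vitro molecular experiments were performed to verify the mRNA and protein expression levels of the characteristic genes.
RESULTS AND CONCLUSION: (1) After data normalization, principal component analysis, weighted gene co-expression network analysis, and linear models for microarray data, 123 high-expression and 110 low-expression differentially co-expressed genes were obtained. (2) Three machine learning algorithms (logistic regression, random forest, and support vector machine) yielded 37 characteristic genes, from which two lipid metabolism-related characteristic genes, SMPD3 and CYP4F3, were finally identified. (3) Protein-protein interaction network analysis showed that both SMPD3 and CYP4F3 had relatively few protein interactions. (4) Gene Ontology results showed that the differentially co-expressed genes were mainly enriched in neutrophil degranulation, neutrophil immune reaction and response, neutrophil activation, and leukocyte degranulation. Kyoto Encyclopedia of Genes and Genomes enrichment analysis indicated that the differentially co-expressed genes were mainly involved in key pathways such as extracellular matrix-receptor interaction and adhesion. (5) Cell-type identification by estimating relative subsets of RNA transcripts (CIBERSORT) analysis showed that eight immune cell types differed significantly in osteoarthritis. Correlation analysis showed that SMPD3 was significantly positively correlated with resting dendritic cells (r=0.44, P=3.6×10⁻³) and significantly negatively correlated with neutrophils (r=-0.48, P=1.7×10⁻³), whereas CYP4F3 was significantly positively correlated with monocytes and neutrophils (r=0.76, P=7.6×10⁻⁹; r=0.73, P=6.0×10⁻⁸) and significantly negatively correlated with T follicular helper cells and resting dendritic cells (r=-0.38, P=0.01; r=-0.38, P=0.01). (6) In vitro molecular experiments showed that SMPD3 mRNA and protein levels were significantly increased in the osteoarthritis group, while those of CYP4F3 were decreased. (7) The results indicate that SMPD3 and CYP4F3, lipid metabolism characteristic genes of osteoarthritis chondrocytes, may serve as potential biomarkers for targeted therapy of osteoarthritis and for cartilage repair or degeneration, offering a new strategy for further investigating the relationship between abnormal lipid metabolism and osteoarthritis in the Chinese population and for clinical targeted therapy.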
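The r values reported for SMPD3 and CYP4F3 relate each gene's expression to an estimated immune cell fraction across samples. The abstract does not say which coefficient was used, so the following is only a minimal sketch that assumes a Spearman rank correlation. The function names and the sample values are hypothetical and are not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Made-up illustration values: one gene's expression and one immune
# cell fraction measured in the same five samples.
expression = [5.1, 6.3, 4.8, 7.9, 6.0]
cell_fraction = [0.10, 0.14, 0.08, 0.21, 0.15]
r = spearman_r(expression, cell_fraction)
```

A rank-based coefficient is chosen here only because estimated cell fractions are often skewed; if the study used Pearson correlation, `pearson_r` can be called directly on the raw values.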

https://orcid.org/0000-0002-7958-4082 (Huang Dongfeng)



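Keywords:

osteoarthritis, lipid metabolism, weighted gene co-expression network analysis, machine learning, targeted therapy, cartilage repair or degeneration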

Abstract: BACKGROUND: Dyslipidemia affects chondrocyte metabolism and plays an important role in the occurrence and progression of osteoarthritis, yet its underlying mechanisms remain unclear.
OBJECTIVE: To identify characteristic genes related to lipid metabolism in osteoarthritis chondrocytes using weighted gene co-expression network analysis combined with machine learning algorithms, and to perform preliminary experimental validation.
METHODS: Based on the weighted gene co-expression network analysis and linear models for microarray data, differentially co-expressed genes were identified. Machine learning methods were used for screening characteristic genes related to lipid metabolism. Protein-protein interaction analysis was conducted to explore the interaction network and Gene Ontology and Kyoto Encyclopedia of Genes and Genomes enrichment analyses were employed to investigate the signaling pathways involved in the differentially co-expressed genes. Subsequently, immune correlation analysis was utilized to identify the associations between the characteristic genes and immune infiltration patterns. Furthermore, in vitro molecular experiments were performed to validate the mRNA and protein levels of the characteristic genes. 
RESULTS AND CONCLUSION: (1) A total of 123 high-expression and 110 low-expression differentially co-expressed genes were identified after data normalization and application of principal component analysis, weighted gene co-expression network analysis, and linear models for microarray data. (2) Thirty-seven characteristic genes were screened using logistic regression, random forest, and support vector machine, among which SMPD3 and CYP4F3 were identified as two characteristic genes related to lipid metabolism. (3) Protein-protein interaction analysis revealed that SMPD3 and CYP4F3 proteins showed relatively low levels of interaction. (4) Gene Ontology analysis indicated that the differentially co-expressed genes were primarily enriched in biological processes including neutrophil degranulation, neutrophil-mediated immune responses, neutrophil activation, and leukocyte degranulation. Kyoto Encyclopedia of Genes and Genomes enrichment analysis suggested that they were mainly involved in key pathways such as extracellular matrix-receptor interaction and cell adhesion. (5) Immune infiltration analysis with CIBERSORT (Cell-type Identification By Estimating Relative Subsets Of RNA Transcripts) showed significant differences in eight types of immune cells in osteoarthritis. Further correlation analysis revealed that SMPD3 was significantly positively correlated with resting dendritic cells (r=0.44, P=3.6×10⁻³) and significantly negatively correlated with neutrophils (r=-0.48, P=1.7×10⁻³), whereas CYP4F3 was significantly positively correlated with monocytes and neutrophils (r=0.76, P=7.6×10⁻⁹; r=0.73, P=6.0×10⁻⁸), and significantly negatively correlated with T follicular helper cells and resting dendritic cells (r=-0.38, P=0.01; r=-0.38, P=0.01). (6) In vitro molecular experiments demonstrated that SMPD3 mRNA and protein levels were significantly increased in the osteoarthritis group, while CYP4F3 mRNA and protein levels were decreased.
To conclude, the lipid metabolism-related characteristic genes SMPD3 and CYP4F3 in osteoarthritis chondrocytes may serve as potential biomarkers for targeted therapy of osteoarthritis and for assessing cartilage repair or degeneration. This provides a new strategy for exploring the relationship between dyslipidemia and osteoarthritis in the Chinese population and for clinical targeted treatment.

Key words: osteoarthritis, lipid metabolism, weighted gene co-expression network analysis, machine learning, targeted therapy, cartilage repair or degeneration
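The screening described in the abstract narrows the gene list in stages: network module genes and differentially expressed genes are combined into differentially co-expressed genes, machine learning reduces these to characteristic genes, and a lipid metabolism filter gives the final candidates. The sketch below illustrates that narrowing with plain set operations. It assumes that each stage is combined by intersection, which the abstract does not state, and the function name, argument names, and placeholder genes (`GENE_A` and so on) are hypothetical.

```python
def screen_candidates(module_genes, differential_genes, ml_selections, lipid_genes):
    """Narrow a gene list in stages using set intersections.

    module_genes: genes from trait-associated co-expression modules
    differential_genes: genes called differentially expressed
    ml_selections: one iterable of selected genes per learning algorithm
    lipid_genes: a lipid metabolism reference gene set
    """
    # Stage 1: differentially co-expressed genes (assumed intersection).
    co_expressed = set(module_genes) & set(differential_genes)

    # Stage 2: keep genes chosen by every algorithm (assumed rule).
    characteristic = set(co_expressed)
    for selected in ml_selections:
        characteristic &= set(selected)

    # Stage 3: restrict to lipid metabolism-related genes.
    return sorted(characteristic & set(lipid_genes))


# Toy inputs for illustration only; GENE_* names are placeholders.
modules = {"SMPD3", "CYP4F3", "GENE_A", "GENE_B", "GENE_C"}
differential = {"SMPD3", "CYP4F3", "GENE_A", "GENE_B", "GENE_D"}
selections = [
    {"SMPD3", "CYP4F3", "GENE_A"},          # e.g. logistic regression
    {"SMPD3", "CYP4F3", "GENE_A", "GENE_B"},  # e.g. random forest
    {"SMPD3", "CYP4F3", "GENE_B"},          # e.g. support vector machine
]
lipid_reference = {"SMPD3", "CYP4F3", "GENE_E"}

candidates = screen_candidates(modules, differential, selections, lipid_reference)
```

Requiring agreement across all algorithms is a conservative choice; a union or a majority vote would keep more genes and is equally compatible with what the abstract reports.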

CLC number: