Chinese Journal of Tissue Engineering Research ›› 2026, Vol. 30 ›› Issue (5): 1096-1105. doi: 10.12307/2026.021


Weighted gene co-expression network analysis combined with machine learning to screen and validate biomarkers for osteoarthritis

Zhang Qian, Huang Dongfeng   

  1. Department of Rehabilitation Medicine, the Seventh Affiliated Hospital, Sun Yat-sen University, Shenzhen 518107, Guangdong Province, China 
  • Received: 2024-11-28 Accepted: 2025-01-09 Online: 2026-02-18 Published: 2025-06-20
  • Contact: Huang Dongfeng, Master, Chief physician, Professor, Doctoral supervisor, Department of Rehabilitation Medicine, the Seventh Affiliated Hospital, Sun Yat-sen University, Shenzhen 518107, Guangdong Province, China
  • About author: Zhang Qian, MD candidate, Department of Rehabilitation Medicine, the Seventh Affiliated Hospital, Sun Yat-sen University, Shenzhen 518107, Guangdong Province, China
  • Supported by:
    Guangdong Medical Research Fund, No. A2024023 (to ZQ) 

Abstract: BACKGROUND: Dyslipidemia affects chondrocyte metabolism and plays an important role in the occurrence and progression of osteoarthritis, yet its underlying mechanisms remain unclear.
OBJECTIVE: To identify characteristic lipid metabolism-related genes in osteoarthritic chondrocytes using weighted gene co-expression network analysis combined with machine learning algorithms, and to perform preliminary experimental validation.
METHODS: Differentially co-expressed genes were identified using weighted gene co-expression network analysis and linear models for microarray data (limma). Machine learning methods were used to screen characteristic genes related to lipid metabolism. Protein-protein interaction analysis was conducted to explore the interaction network, and Gene Ontology and Kyoto Encyclopedia of Genes and Genomes enrichment analyses were used to investigate the signaling pathways involving the differentially co-expressed genes. Immune correlation analysis was then used to identify associations between the characteristic genes and immune infiltration patterns. Finally, in vitro molecular experiments were performed to validate the mRNA and protein levels of the characteristic genes.
RESULTS AND CONCLUSION: (1) A total of 123 upregulated and 110 downregulated differentially co-expressed genes were identified after data normalization, principal component analysis, weighted gene co-expression network analysis, and linear models for microarray data. (2) Thirty-seven characteristic genes were screened using logistic regression, random forest, and support vector machine, among which SMPD3 and CYP4F3 were identified as characteristic genes related to lipid metabolism. (3) Protein-protein interaction analysis revealed relatively low levels of interaction for the SMPD3 and CYP4F3 proteins. (4) Gene Ontology analysis indicated that these genes were primarily enriched in biological processes including neutrophils and their immune responses, immune system processes, and neutrophil activation. Kyoto Encyclopedia of Genes and Genomes enrichment analysis suggested that they were mainly involved in key pathways such as extracellular matrix-receptor interaction and cell adhesion. (5) Immune infiltration analysis using Cell-type Identification By Estimating Relative Subsets Of RNA Transcripts (CIBERSORT) showed significant differences in eight types of immune cells in osteoarthritis. Further correlation analysis revealed that SMPD3 was significantly positively correlated with resting dendritic cells (r=0.44, P=3.6×10⁻³) and significantly negatively correlated with neutrophils (r=-0.48, P=1.7×10⁻³), whereas CYP4F3 was significantly positively correlated with monocytes and neutrophils (r=0.76, P=7.6×10⁻⁹; r=0.73, P=6.0×10⁻⁸) and significantly negatively correlated with T follicular helper cells and resting dendritic cells (r=-0.38, P=0.01; r=-0.38, P=0.01). (6) In vitro molecular experiments demonstrated that SMPD3 mRNA and protein levels were significantly increased in the osteoarthritis group, whereas CYP4F3 mRNA and protein levels were decreased.
In conclusion, the lipid metabolism-related genes SMPD3 and CYP4F3 may serve as potential biomarkers for targeted therapy of osteoarthritis and for assessing cartilage repair or degeneration, providing a new strategy for exploring the relationship between dyslipidemia and osteoarthritis and for clinical targeted treatment in the Chinese population.
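
As an illustration of the gene-immune cell correlation step reported in result (5), the following is a minimal sketch of how a correlation coefficient r between a gene's expression and an immune cell fraction could be computed. It assumes the Pearson coefficient; the abstract does not state which coefficient the authors used. The function name `pearson_r` and the toy values are hypothetical and are not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Deviations of each sample from its mean
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy values for illustration only (NOT data from the study):
# per-sample expression of one gene and the estimated fraction of one
# immune cell type in the same samples.
gene_expression = [2.1, 3.4, 1.8, 4.0, 2.9, 3.7]
neutrophil_fraction = [0.10, 0.18, 0.08, 0.22, 0.15, 0.19]

r = pearson_r(gene_expression, neutrophil_fraction)
print(f"r = {r:.2f}")
```

A positive r means the gene's expression rises with the cell fraction across samples, and a negative r means it falls. A real analysis would also report a P value for each coefficient, as the abstract does.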

Key words: osteoarthritis, lipid metabolism, weighted gene co-expression network analysis, machine learning, targeted therapy, cartilage repair or degeneration
