Chinese Journal of Tissue Engineering Research ›› 2024, Vol. 28 ›› Issue (35): 5591-5597. doi: 10.12307/2024.828

• Big data analysis in tissue engineering •

Key biomarkers for the diagnosis of intervertebral disc degeneration associated with oxidative stress: identification based on bioinformatics and machine learning

Lan Yao, Chen Liuyang, Song Wenhui

  1. Department of Orthopaedic Spinal Surgery, The Second Hospital of Shanxi Medical University, Taiyuan 030001, Shanxi Province, China
  • Received: 2023-12-08 Accepted: 2024-01-16 Online: 2024-12-18 Published: 2024-03-15
  • Contact: Song Wenhui, MD, Chief physician, Department of Orthopaedic Spinal Surgery, The Second Hospital of Shanxi Medical University, Taiyuan 030001, Shanxi Province, China
  • About author: Lan Yao, male, born in 1996 in Lüliang, Shanxi Province, Han nationality; Master candidate at Shanxi Medical University, mainly engaged in research on spinal surgery; Department of Orthopaedic Spinal Surgery, The Second Hospital of Shanxi Medical University, Taiyuan 030001, Shanxi Province, China

Term definitions:

Intervertebral disc degeneration: the gradual deterioration of intervertebral disc tissue with aging and other factors, which seriously impairs patients' quality of life. Its prevalence is rising as the population ages, and it has become a serious public health problem. No curative treatment exists, so new approaches to elucidate its pathogenesis and enable early diagnosis are urgently needed.
Oxidative stress: a state caused by an imbalance between oxidants and antioxidants in cells and tissues, which leads to the accumulation of reactive oxygen species and a loss of homeostasis, damages key biomolecules and cells, and plays an important role in pathological processes.

https://orcid.org/0009-0003-5292-2074 (Lan Yao)

Key topics published in the Chinese Journal of Tissue Engineering Research: tissue construction; bone cells; chondrocytes; cell culture; fibroblasts; vascular endothelial cells; osteoporosis; tissue engineering

Abstract: BACKGROUND: Oxidative stress is closely associated with the occurrence and progression of intervertebral disc degeneration, but the underlying mechanisms and effective treatments remain unclear.
OBJECTIVE: To identify key genes associated with intervertebral disc degeneration accompanied by oxidative stress using bioinformatics and three machine learning algorithms, to analyze immune infiltration, and to validate the findings experimentally.
METHODS: Gene expression profiles of intervertebral disc degeneration were obtained from the GEO database, and oxidative stress-related genes were obtained from the GeneCards database. Differential expression analysis and weighted gene co-expression network analysis (WGCNA) were performed on the intervertebral disc degeneration dataset. The genes shared by these two analyses were then intersected with the oxidative stress-related genes to obtain candidate hub genes, which were subjected to Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses. Three machine learning algorithms (LASSO regression, SVM-RFE, and random forest) were used to select the optimal feature genes, which were validated with receiver operating characteristic (ROC) curves. Immune infiltration analysis was also conducted. Nucleus pulposus samples from patients with cervical spondylosis treated at the Second Hospital of Shanxi Medical University from July to November 2023 formed the intervertebral disc degeneration group, and nucleus pulposus samples from patients with cervical spinal cord injury formed the control group. The relative expression of the feature genes in degenerated intervertebral disc tissue was validated by qPCR.
RESULTS AND CONCLUSION: (1) Differential expression analysis yielded 424 differentially expressed genes, WGCNA yielded 5 087 genes, and 1 399 oxidative stress-related genes were retrieved, giving 23 candidate hub genes. (2) GO analysis showed that the candidate hub genes are mainly involved in biological processes such as defense response to bacteria and response to molecules of bacterial origin; in cellular components such as the secretory granule lumen and cytoplasmic vesicle lumen; and in molecular functions such as endopeptidase activity and sulfur compound binding. (3) KEGG analysis showed that the candidate hub genes are associated with signaling pathways including neutrophil extracellular trap formation and the renin-angiotensin system. (4) The three machine learning algorithms followed by ROC validation identified two key genes, HSPA6 and PKD1. (5) Immune infiltration analysis showed that HSPA6 was closely correlated with activated dendritic cells (r=0.88, P < 0.001) and activated CD4+ T cells (r=-0.72, P < 0.01), among others, and that PKD1 was closely correlated with effector memory CD8+ T cells (r=0.55, P < 0.05) and activated dendritic cells (r=-0.56, P < 0.05), among others. (6) qPCR showed that HSPA6 expression was lower in the intervertebral disc degeneration group than in the control group (P < 0.000 1), whereas PKD1 expression was higher (P < 0.000 1). (7) These findings suggest that HSPA6 and PKD1 can serve as biomarkers for intervertebral disc degeneration accompanied by oxidative stress, and that interventions targeting HSPA6 and PKD1 may help improve intervertebral disc degeneration.
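The gene-screening logic described in the methods can be illustrated with plain set operations. The sketch below is not the authors' pipeline: the function names, the `min_votes` parameter, and the placeholder gene labels (`G1` to `G6`) are all hypothetical, and it assumes that the final feature genes are those selected by every one of the three algorithms, which the abstract does not state explicitly.

```python
from collections import Counter


def candidate_hub_genes(degs, wgcna_genes, stress_genes):
    """Genes present in all three lists: DEGs, WGCNA module genes,
    and oxidative stress-related genes."""
    return set(degs) & set(wgcna_genes) & set(stress_genes)


def consensus_features(selections, min_votes=None):
    """Genes chosen by at least `min_votes` of the feature selectors.

    By default a gene must be chosen by every selector (an assumption
    made for this sketch, not a detail taken from the abstract).
    """
    if min_votes is None:
        min_votes = len(selections)
    votes = Counter(gene for sel in selections for gene in set(sel))
    return {gene for gene, count in votes.items() if count >= min_votes}


# Toy inputs with placeholder gene labels (not real results).
degs = {"G1", "G2", "G3", "G4"}
wgcna = {"G2", "G3", "G4", "G5"}
stress = {"G3", "G4", "G6"}
candidates = candidate_hub_genes(degs, wgcna, stress)

# Hypothetical outputs of LASSO, SVM-RFE and random forest on the candidates.
lasso = {"G3", "G4"}
svm_rfe = {"G3"}
random_forest = {"G3", "G4"}
consensus = consensus_features([lasso, svm_rfe, random_forest])

print(sorted(candidates), sorted(consensus))
```

Passing `min_votes=2` would relax the rule to a majority vote, which is one alternative design when the strict intersection is empty.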

Key words: intervertebral disc degeneration, oxidative stress, differential analysis, WGCNA analysis, LASSO regression, SVM-RFE analysis, random forest, feature gene, bioinformatics, immune cell infiltration analysis
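ROC validation of a single feature gene is usually summarized by the area under the curve (AUC). As a minimal sketch, assuming the expression value of one gene is used directly as the classification score, the AUC can be computed as the fraction of (degenerated, control) sample pairs in which the degenerated sample scores higher. The function name and the toy values are hypothetical and are not data from the study.

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample scores
    higher than a randomly chosen negative sample; ties count as 0.5.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("both groups need at least one sample")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Toy expression values (not real measurements).
degenerated = [1.0, 3.0]
control = [2.0, 4.0]
auc = roc_auc(degenerated, control)
print(auc)
```

For a gene that is lower in the disease group, this score falls below 0.5; the discriminative power is then read as `1 - auc`, which corresponds to swapping the roles of the two groups.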

CLC number: