Chinese Journal of Tissue Engineering Research ›› 2025, Vol. 29 ›› Issue (36): 7909-7920. doi: 10.12307/2025.523

• Stem cell-related big data analysis •

Machine learning-based analysis of neutrophil-associated potential biomarkers for acute myocardial infarction

Yang Dingyan¹, Yu Zhenqiu¹, Yang Zhongyu²

  ¹Department of Hypertension, Affiliated Hospital of Guizhou Medical University, Guiyang 550004, Guizhou Province, China; ²North Sichuan Medical College, Nanchong 637100, Sichuan Province, China
  • Received: 2024-04-17 Accepted: 2024-06-01 Online: 2025-12-28 Published: 2025-03-25
  • Contact: Yu Zhenqiu, Chief physician, Department of Hypertension, Affiliated Hospital of Guizhou Medical University, Guiyang 550004, Guizhou Province, China
  • About author: Yang Dingyan, female, born in 1997, native of Guizhou Province, Miao ethnicity, Master candidate at Guizhou Medical University, Department of Hypertension, Affiliated Hospital of Guizhou Medical University, Guiyang 550004, Guizhou Province, China; mainly engaged in research on myocardial infarction
  • Supported by:
    Science and Technology Fund Project of Guizhou Provincial Health Commission, No. gzwkjz2023-100 (to YZQ)

Explanation of terms:

Acute myocardial infarction: myocardial necrosis caused by acute, persistent ischemia and hypoxia of cardiomyocytes following complete occlusion of a coronary artery (the coronary arteries supply blood to the heart itself, are divided into left and right branches, and run along the surface of the heart).
Serum biomarker: an indicator that can be objectively measured and evaluated, and that reflects the physiological or pathological processes of a disease as well as the biological effects of an exposure or a therapeutic intervention.

Abstract: BACKGROUND: Accurate early diagnosis and timely reperfusion therapy are important prerequisites for saving the lives and improving the prognosis of patients with acute myocardial infarction. Therefore, it is important to find ideal biomarkers for early diagnosis of acute myocardial infarction.
OBJECTIVE: To identify key genes associated with both acute myocardial infarction and neutrophils through bioinformatics and machine learning, in order to explore new biomarkers.
METHODS: Differentially expressed genes in acute myocardial infarction were identified from Gene Expression Omnibus (GEO) datasets using the limma R package. A deconvolution algorithm was used to assess immune cell infiltration. Characteristic genes related to both acute myocardial infarction and neutrophils were then screened by combining weighted gene co-expression network analysis (WGCNA), protein-protein interaction (PPI) networks, and machine learning, and functional enrichment analysis was performed. Receiver operating characteristic (ROC) curve analysis was used to assess the diagnostic value of the characteristic genes for acute myocardial infarction. Drugs targeting the biomarkers were screened in the STITCH and Herb databases. Finally, inpatients first diagnosed with acute myocardial infarction in the Department of Cardiology of the Affiliated Hospital of Guizhou Medical University from March to June 2023 served as the experimental group, and inpatients from the same period with no ischemic changes on electrocardiography and no stenosis on coronary angiography served as the control group. Peripheral blood was collected from both groups, and the relative expression of the genes in these samples was verified by RT-qPCR.
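As an illustration of the ROC step described in METHODS, the sketch below computes the area under the ROC curve for a single gene from its expression values. It uses the rank-based (Mann-Whitney) formulation and is not the authors' code; the function name `roc_auc` and the expression values are hypothetical and chosen only for demonstration.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via the Mann-Whitney formulation.

    The AUC equals the probability that a randomly chosen case (label 1)
    has a higher score than a randomly chosen control (label 0);
    ties contribute 0.5.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical expression values for one gene (NOT data from the study).
expression = [8.1, 7.9, 9.2, 6.5, 6.9, 8.0, 8.4, 6.2]
is_ami = [1, 1, 1, 0, 0, 0, 1, 0]  # 1 = acute myocardial infarction, 0 = control

auc = roc_auc(expression, is_ami)
print(f"AUC = {auc:.4f}")
```

The pairwise loop is quadratic in sample size, which is acceptable for cohorts of this scale; for larger data a rank-based or library implementation would be the more usual choice.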
RESULTS AND CONCLUSION: (1) A total of 2 349 differentially expressed genes were obtained, and immune infiltration analysis revealed that the scores of immune cells such as memory B cells, resting NK cells, and neutrophils differed between the disease and normal groups. (2) WGCNA showed that two gene modules, ME green and ME turquoise, had the highest correlation with neutrophils and with acute myocardial infarction. (3) Intersecting the module genes with the differentially expressed genes yielded 24 differential module genes. Functional enrichment analysis revealed that they were associated with a variety of processes, such as the innate immune response and the defense response to bacteria. KEGG analysis showed that they were mainly associated with the tumor necrosis factor signaling pathway. (4) Intersecting the genes identified by the machine learning algorithms yielded three characteristic genes, S100A12, PTCH1, and LOC400499, whose areas under the ROC curve were all greater than 0.7 in both the GSE48060 and GSE66360 datasets; they were therefore considered potential biomarkers. (5) Based on the STITCH and Herb databases, 11 targeted drugs were found for S100A12 and 6 for PTCH1. (6) RT-qPCR showed that the expression of S100A12, PTCH1, and LOC400499 differed significantly between patients with acute myocardial infarction and controls (P < 0.05). (7) S100A12, PTCH1, and LOC400499 may be potential diagnostic biomarkers for acute myocardial infarction, but their specificity for acute myocardial infarction needs further investigation. Among them, S100A12 may be a potential target for regulating acute myocardial infarction.
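The selection logic in result (4), intersecting the gene sets returned by several machine learning algorithms and keeping only genes whose AUC exceeds 0.7 in both datasets, can be sketched as follows. This is a minimal sketch under stated assumptions: the abstract does not name the algorithms, so `algorithm_a` to `algorithm_c` are hypothetical labels, the gene names are placeholders, and the AUC values are invented for illustration rather than taken from the study.

```python
# Hypothetical gene sets selected by three unnamed algorithms (placeholders).
selected_by_algorithm = {
    "algorithm_a": {"GENE_A", "GENE_B", "GENE_C", "GENE_D"},
    "algorithm_b": {"GENE_A", "GENE_B", "GENE_C", "GENE_E"},
    "algorithm_c": {"GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"},
}

# Genes chosen by every algorithm.
feature_genes = set.intersection(*selected_by_algorithm.values())

# Invented per-dataset AUC values (NOT results from the paper).
auc_by_dataset = {
    "GSE48060": {"GENE_A": 0.81, "GENE_B": 0.74, "GENE_C": 0.78},
    "GSE66360": {"GENE_A": 0.76, "GENE_B": 0.72, "GENE_C": 0.65},
}

AUC_THRESHOLD = 0.7  # threshold stated in the abstract

# Keep a gene only if its AUC exceeds the threshold in every dataset.
candidate_biomarkers = sorted(
    gene
    for gene in feature_genes
    if all(aucs[gene] > AUC_THRESHOLD for aucs in auc_by_dataset.values())
)

print("Feature genes:", sorted(feature_genes))
print("Candidate biomarkers:", candidate_biomarkers)
```

Requiring the threshold in every dataset, rather than on average, mirrors the abstract's wording that the AUC was greater than 0.7 in both GSE48060 and GSE66360.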

Key words: acute myocardial infarction, neutrophil, biomarker, differential analysis, WGCNA, machine learning algorithms, immune infiltration analysis, S100A12, PTCH1, LOC400499
