Identification of diagnostic and prognostic phospholipid biomarkers in idiopathic pulmonary fibrosis via machine learning and in vivo validation

Posted on 13 December 2025
by Liqing Yang

Hum Genomics. 2025 Dec 12. doi: 10.1186/s40246-025-00845-3. Online ahead of print.

ABSTRACT

BACKGROUND: Idiopathic pulmonary fibrosis (IPF) is a type of progressive interstitial lung disease with an unclear cause and generally poor prognosis. Phospholipids have been implicated in IPF, motivating this investigation into phospholipid-associated biomarkers for diagnostic and prognostic use.

MATERIALS AND METHODS: Microarray data from multiple datasets were integrated to identify differentially expressed genes. Weighted gene co-expression network analysis (WGCNA) was performed to identify key modules and genes related to phospholipids in IPF. Phospholipid-related genes were retrieved from the GeneCards database, and hub genes were defined as the intersection of the differentially expressed genes, the WGCNA module genes, and the phospholipid-related genes. Machine learning techniques were then used to construct diagnostic and prognostic models, which were tested across multiple datasets. Key genes were further validated by single-cell sequencing and in an animal model.
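
The hub-gene selection step described above amounts to a three-way set intersection. The following is a minimal Python sketch of that idea only, not the authors' pipeline: the function name `find_hub_genes` and the placeholder gene symbols (`GENE_A` to `GENE_F`) are hypothetical and stand in for the real differentially expressed, WGCNA module, and GeneCards gene lists.

```python
def find_hub_genes(deg_genes, module_genes, phospholipid_genes):
    """Return, in sorted order, the genes present in all three collections."""
    return sorted(set(deg_genes) & set(module_genes) & set(phospholipid_genes))


# Placeholder inputs (hypothetical symbols, not data from the study).
deg_genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]     # differentially expressed genes
module_genes = ["GENE_B", "GENE_C", "GENE_D", "GENE_E"]  # genes in the selected WGCNA module
phospholipid_genes = ["GENE_C", "GENE_D", "GENE_F"]      # phospholipid-related genes

hub_genes = find_hub_genes(deg_genes, module_genes, phospholipid_genes)
print(hub_genes)  # ['GENE_C', 'GENE_D']
```

Only genes supported by all three lines of evidence survive the intersection, which is why the hub set is much smaller than any single input list.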

RESULTS: Analysis of eight datasets identified 920 differentially expressed genes. WGCNA identified the turquoise module (1,884 genes) as the key module. Intersecting the differentially expressed genes, the turquoise module, and 1,031 phospholipid-related genes yielded 50 hub genes. An 8-gene diagnostic model and an 11-gene prognostic model were developed, both of which showed higher predictive accuracy than previously established models. Two key genes, GABARAPL1 and UNC13B, were validated by single-cell sequencing, which showed a lower proportion of fibroblasts expressing these genes in IPF lungs than in controls. Both genes were also downregulated in IPF mouse models.

CONCLUSION: This research successfully identified potential biomarkers for IPF and developed a prognostic model based on liquid biopsy and a diagnostic model based on lung tissue. Further validation is required to assess their clinical applicability.

PMID:41388329 | DOI:10.1186/s40246-025-00845-3