NPJ Digit Med. 2026 Feb 17. doi: 10.1038/s41746-026-02438-3. Online ahead of print.
ABSTRACT
Cardiovascular diseases (CVDs) are the leading cause of death worldwide. To elucidate disease mechanisms and warn of CVDs early in life, biobanks have emerged to collect genotype and electrocardiogram (ECG) data. However, only 10% of samples in the UK Biobank (UKB) contain both genotype and ECG data, limiting the utility of the biobank. Here, we developed an attention-based Capsule Network, CapECG, to predict ECG traits from genotype. CapECG maps high-dimensional genotype data to low-dimensional ECG traits and improves the prediction of CVDs from genotype. CapECG achieved an average Pearson correlation coefficient (PCC) of 0.62 for 7422 individuals in the internal test set from UKB. The model was then used to predict 169 ECG traits for 388,284 individuals in UKB who had only genotype data. The 169 predicted ECG traits were used to assess the risks of six types of CVDs, achieving an average area under the curve (AUC) of 0.80, higher than the 0.71 obtained with the polygenic risk score-based method. A genome-wide association study (GWAS) on the predicted spatial QRS-T angle (spQRSTa) identified 133 significant single nucleotide polymorphisms (SNPs), including 33 overlapping with a published GWAS of 118,780 individuals, surpassing the 13 overlaps obtained from the observed spQRSTa of 29,692 individuals. Thus, this study proposes a new way to predict ECG traits from genotype and provides a bridge to the early prediction of disease.
PMID:41703193 | DOI:10.1038/s41746-026-02438-3

