Am J Hum Genet. 2026 Jun 3:S0002-9297(26)00193-X. doi: 10.1016/j.ajhg.2026.05.006. Online ahead of print.
ABSTRACT
Genome-wide association study (GWAS) summary statistics for training and individual-level cohorts for fine-tuning are essential for constructing predictive polygenic risk score (PRS) models. However, the relatively low representation of admixed populations in both GWAS summary statistics and individual-level datasets hinders the development of PRSs and equitable clinical translation for admixed populations. Prior work indicates that the most informative PRS model for a genetically homogeneous sample varies linearly in an ancestry continuum space. Guided by these observations, we introduce a genetic-distance-assisted PRS combination pipeline for diverse genetic ancestries (DiscoDivas) to interpolate a harmonized PRS for diverse, especially admixed, genetic ancestries. DiscoDivas leverages genetic distance together with multiple PRS models fine-tuned within existing samples, which are mostly of single ancestry. It provides a new approach to generate genetic-ancestry-specific PRSs when a suitably matched individual-level fine-tuning cohort is unavailable or underpowered. DiscoDivas treats genetic ancestry as a continuous variable and does not require switching between different models when calculating PRSs for different ancestries. We generated PRSs with DiscoDivas and with the current conventional method, i.e., fine-tuning multiple GWAS PRSs using samples of matched or similar genetic ancestry. DiscoDivas generated a harmonized PRS that performed comparably to or better than the conventional approach, with the greatest advantage exhibited in admixed individuals.
PMID:42235505 | DOI:10.1016/j.ajhg.2026.05.006
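
The abstract describes combining several ancestry-specific PRS models according to genetic distance but does not give the weighting formula. The following is a minimal illustrative sketch of the general idea only, not the DiscoDivas implementation. It assumes that genetic distance is Euclidean distance in principal-component (PC) space between each target individual and the centroid of each fine-tuning cohort, and that weights are normalized inverse distances. The function name `combine_prs`, its parameters, and the inverse-distance weighting are all hypothetical choices made for illustration.

```python
import numpy as np


def combine_prs(target_pcs, cohort_centroids, cohort_prs, eps=1e-8):
    """Interpolate one PRS per individual from several cohort-tuned PRSs.

    Hypothetical helper; the weighting scheme is an assumption,
    not the method published in the paper.

    target_pcs:       (n, k) PC coordinates of the n target individuals
    cohort_centroids: (m, k) PC centroid of each of the m fine-tuning cohorts
    cohort_prs:       (n, m) PRS of each individual under each cohort-tuned model
    """
    target_pcs = np.asarray(target_pcs, dtype=float)
    cohort_centroids = np.asarray(cohort_centroids, dtype=float)
    cohort_prs = np.asarray(cohort_prs, dtype=float)

    # Euclidean distance from every individual to every cohort centroid: (n, m)
    diff = target_pcs[:, None, :] - cohort_centroids[None, :, :]
    dist = np.linalg.norm(diff, axis=2)

    # Assumed weighting: closer cohorts contribute more (inverse distance).
    # eps avoids division by zero when an individual sits on a centroid.
    weights = 1.0 / (dist + eps)
    weights /= weights.sum(axis=1, keepdims=True)

    # Weighted sum of the cohort-specific scores for each individual
    combined = (weights * cohort_prs).sum(axis=1)
    return combined, weights
```

Under these assumptions, an individual lying midway between two cohort centroids receives the average of the two cohort-tuned scores, while an individual located at a centroid receives essentially that cohort's score. Because ancestry enters only through the continuous weights, no switch between discrete models is needed.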

