An interpretable combinatorial data-mining framework for predicting new-onset hypertension in the general population

Posted on 22/06/2026
by Yohei Miyashita

Hypertens Res. 2026 Jun 22. doi: 10.1038/s41440-026-02715-4. Online ahead of print.

ABSTRACT

We previously established an interpretable combinatorial data-mining framework to identify combinations of clinical factors predictive of heart failure. Because hypertension (HT) is a major contributor to heart failure, accurate prediction of new-onset HT is critically important for prevention. In this study, we aimed to identify combinations of clinical factors predictive of HT onset using a novel limitless-arity multiple-testing procedure (LAMP) and to estimate the probability of developing HT. We analyzed 2,610,286 individuals without HT who underwent annual health check-ups starting between 2005 and 2015 and were followed for 5 consecutive years without missing data. Using the LAMP method, we systematically identified statistically significant combinations of fewer than four clinical factors associated with HT onset. Among 28,618 subjects used for rule discovery, 4,802 combinations predictive of HT onset were identified. The remaining 2,581,668 individuals were classified into one group with no predictive combinations (G0) and 20 groups (G1-G20) according to increasing numbers of predictive combinations. The incidence of HT increased stepwise with the number of predictive combinations, as confirmed by Kaplan-Meier analyses (p < 0.001). Receiver-operating characteristic analysis demonstrated moderate discriminative performance (area under the curve = 0.69). In conclusion, we identified combinations of routine clinical parameters that predict new-onset HT in the general population. A greater number of matching predictive combinations was associated with a proportionally higher probability of developing HT. This interpretable combinatorial data-mining framework may enable risk stratification for HT and support early preventive strategies.
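The scoring step described in the abstract (counting how many predictive combinations an individual matches, then assessing discrimination with the area under the curve) can be illustrated with a minimal Python sketch. It does not implement LAMP itself, which is the rule-discovery step. The rule set, the flag names (`high_bmi`, `elevated_sbp`, and so on), the helpers `count_matches` and `auc`, and the four toy individuals are all hypothetical and are not taken from the study.

```python
from itertools import product

# Hypothetical predictive combinations (NOT the study's rules): each rule is a
# set of at most three binary clinical flags that must all be present.
RULES = [
    frozenset({"high_bmi"}),
    frozenset({"high_bmi", "elevated_sbp"}),
    frozenset({"elevated_sbp", "high_glucose", "age_over_50"}),
]


def count_matches(flags, rules=RULES):
    """Return how many rules are fully contained in an individual's flags."""
    return sum(1 for rule in rules if rule <= flags)


def auc(scores, labels):
    """Rank-based AUC: chance that a random case outscores a random non-case.

    Ties between a case and a non-case count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Made-up individuals: (flags present at baseline, developed HT within follow-up)
people = [
    ({"high_bmi", "elevated_sbp"}, 1),
    ({"high_bmi"}, 0),
    (set(), 0),
    ({"elevated_sbp", "high_glucose", "age_over_50"}, 1),
]

scores = [count_matches(flags) for flags, _ in people]
labels = [outcome for _, outcome in people]

print("matches per individual:", scores)
print("AUC:", auc(scores, labels))
```

In this sketch an individual with a score of 0 would correspond to G0. How positive match counts map onto G1-G20 is not specified in the abstract, so no binning is attempted here.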

PMID:42332083 | DOI:10.1038/s41440-026-02715-4