Gigascience. 2026 May 22:giag060. doi: 10.1093/gigascience/giag060. Online ahead of print.
ABSTRACT
Many research domains are producing, at growing rates, large multi-scale, multi-modal datasets with mixed variable types (continuous, discrete, censored). Identifying possible cause-effect associations in such datasets is essential for predicting outcomes and proposing interventions. Probabilistic graphical models (PGMs) have emerged as a robust, interpretable way to analyze such datasets, but current graph learning algorithms cannot incorporate time-to-event (censored) variables, which are important in many systems (e.g., patient survival). Instead, regression models are typically used for survival analysis of single censored variables, but these cannot assess cause-effect interactions. Here, we present a new mathematical framework to incorporate multiple censored variables into mixed graphical models. We implement a novel, efficient algorithm, CausalCoxMGM, and evaluate it extensively on synthetic and real-world high-dimensional biomedical datasets (cardiovascular disease, breast cancer). CausalCoxMGM recovered effectors of censored variables that are supported by the literature and provided new mechanistic insights into the differences between ER+ and ER- breast cancers.
PMID:42172040 | DOI:10.1093/gigascience/giag060

