Stanford Unveils AI System Predicting 100+ Disease Risks from One Night's Sleep Data
January 6, 2026
A foundation-model approach treats sleep recordings as a language, segmenting nights into five-second intervals to learn cross-signal patterns that forecast disease.
Stanford Medicine researchers have developed SleepFM, an artificial intelligence system that can predict risk for more than 100 diseases from a single night of sleep data collected via polysomnography.
SleepFM first matches or exceeds current state-of-the-art models on standard sleep analysis tasks, and is then fine-tuned to forecast long-term disease onset by linking sleep data with up to 25 years of follow-up health records.
Unlike existing sleep analysis, SleepFM assesses long-term health risk by linking sleep data to decades of electronic health records spanning more than 1,000 disease categories; it found meaningful predictive patterns in about 130 of them.
A novel leave-one-out contrastive learning pretraining objective aligns multimodal representations while tolerating missing or heterogeneous channels during inference.
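The summary does not spell out the objective, so the following is only a minimal sketch of one plausible reading of "leave-one-out" contrastive learning: each modality's embedding is pulled toward the average embedding of the remaining modalities from the same recording and pushed away from those of other recordings in the batch. The function names, the temperature value, and the dictionary layout are all illustrative assumptions, not SleepFM's actual code.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def mean_vectors(vectors):
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def leave_one_out_loss(embeddings, temperature=0.1):
    """Hypothetical leave-one-out contrastive loss.

    embeddings: dict mapping modality name -> list of per-recording vectors,
    all in the same batch order. A missing modality is simply absent from
    the dict, so the loss is computed over whatever channels are available.
    """
    modalities = list(embeddings)
    if len(modalities) < 2:
        raise ValueError("need at least two modalities to contrast")
    batch = len(embeddings[modalities[0]])
    total = 0.0
    for held_out in modalities:
        others = [m for m in modalities if m != held_out]
        anchors = [l2_normalize(v) for v in embeddings[held_out]]
        # Target for recording i: average of all *other* modalities' embeddings.
        targets = [
            l2_normalize(mean_vectors([embeddings[m][i] for m in others]))
            for i in range(batch)
        ]
        for i in range(batch):
            logits = [dot(anchors[i], targets[j]) / temperature for j in range(batch)]
            peak = max(logits)  # subtract the max for a stable log-sum-exp
            log_denominator = peak + math.log(sum(math.exp(z - peak) for z in logits))
            total += log_denominator - logits[i]  # InfoNCE-style cross-entropy term
    return total / (len(modalities) * batch)
```

Because the loop runs over whichever modalities are present, dropping a channel changes only the set being averaged, which is one simple way such an objective could tolerate missing channels.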
The researchers are pursuing interpretability, aiming to explain the model's predictions, and may integrate wearable data; predictive signal is strongest when multiple modalities are combined rather than drawn from any single channel.
The model shows robust generalization over time and across external cohorts, with strong performance on cardiovascular outcomes and mortality.
Data come from four primary cohorts, plus the Sleep Heart Health Study (SHHS) for transfer learning. Preprocessing resamples signals to 128 Hz and segments them into 5-second tokens, which feed a 1D convolutional encoder, channel-agnostic attention, and a transformer; a 2-layer LSTM head handles downstream prediction.
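To make the preprocessing step concrete, here is a minimal sketch of resampling a single channel to 128 Hz and cutting it into 5-second tokens of 640 samples each. It uses plain linear interpolation as a stand-in; a real pipeline would more likely use an anti-aliased resampler, and the function names here are hypothetical.

```python
TARGET_HZ = 128
TOKEN_SECONDS = 5
SAMPLES_PER_TOKEN = TARGET_HZ * TOKEN_SECONDS  # 640 samples per 5-second token


def resample_linear(signal, source_hz, target_hz=TARGET_HZ):
    """Resample a 1D signal by linear interpolation (illustrative only)."""
    if len(signal) < 2:
        return list(signal)
    duration_s = len(signal) / source_hz
    n_out = int(round(duration_s * target_hz))
    last = len(signal) - 1
    out = []
    for k in range(n_out):
        pos = k * source_hz / target_hz  # fractional index into the source
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(signal[lo] * (1 - frac) + signal[hi] * frac)
    return out


def segment_tokens(signal, samples_per_token=SAMPLES_PER_TOKEN):
    """Split a signal into non-overlapping tokens, dropping any trailing remainder."""
    n_tokens = len(signal) // samples_per_token
    return [
        signal[i * samples_per_token:(i + 1) * samples_per_token]
        for i in range(n_tokens)
    ]


# Example: 12 seconds of a synthetic channel recorded at 256 Hz.
raw = [float(i % 7) for i in range(12 * 256)]
resampled = resample_linear(raw, source_hz=256)
tokens = segment_tokens(resampled)
print(len(resampled), len(tokens), len(tokens[0]))
```

In this example the 12-second recording yields two full tokens, and the final 2 seconds are discarded because they do not fill a complete 5-second window.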
Modality analyses show that combining all data streams yields the best predictive power, though certain stages and modalities offer condition-specific advantages (e.g., REM for certain conditions, ECG for circulatory signals, respiratory signals for metabolic risks).
Limitations include selection bias (participants referred for sleep disorders), potential temporal degradation in performance, and challenges in interpreting which sleep features drive predictions; translation to clinical practice requires caution.
Comparisons among SleepFM variants indicate that the pretrained embeddings are the main source of predictive power, with SleepFM-LSTM often delivering the strongest performance across conditions.
SleepFM is trained on roughly 585,000 hours of PSG data from about 65,000 participants, learning the language of sleep from five-second segments across EEG, ECG, EMG, and respiratory signals.
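The scale of these figures is easier to grasp with a quick back-of-the-envelope calculation. The sketch below uses only the rounded numbers quoted above, so the results are approximate averages rather than reported statistics.

```python
TOTAL_HOURS = 585_000    # approximate total PSG hours, as quoted above
PARTICIPANTS = 65_000    # approximate number of participants
SEGMENT_SECONDS = 5      # length of one segment

hours_per_participant = TOTAL_HOURS / PARTICIPANTS
segments_per_hour = 3600 // SEGMENT_SECONDS
segments_per_participant = hours_per_participant * segments_per_hour
total_segments = TOTAL_HOURS * segments_per_hour

print(f"Average recording length: {hours_per_participant:.1f} hours")
print(f"Segments per hour: {segments_per_hour}")
print(f"Segments per average recording: {segments_per_participant:.0f}")
print(f"Total 5-second segments: {total_segments:,}")
```

Under these rounded inputs, the average recording is about 9 hours long and contributes roughly 6,480 segments, for a total of about 421 million five-second windows.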
Summary based on 6 sources
Sources

Nature • Jan 6, 2026
A multimodal sleep foundation model for disease prediction
Medical Xpress • Jan 6, 2026
New AI model predicts disease risk while you sleep
Study Finds • Jan 6, 2026
Scientists Decode Sleep Patterns to Forecast Your Future Health Risks
News-Medical • Jan 6, 2026
AI model can predict a person's disease risk using sleep data