Stanford Unveils AI System Predicting 100+ Disease Risks from One Night's Sleep Data

January 6, 2026
  • A foundation-model approach treats sleep recordings as a language, segmenting nights into five-second intervals to learn cross-signal patterns that forecast disease.

  • Stanford Medicine researchers have developed SleepFM, an artificial intelligence system that can predict risk for more than 100 diseases from a single night of sleep data collected via polysomnography.

  • SleepFM first matches or exceeds current state-of-the-art models on standard sleep analysis tasks, and is then fine-tuned to forecast long-term disease onset by linking sleep data with up to 25 years of follow-up health records.

  • Unlike existing sleep analysis, SleepFM targets long-term health risk: it ties sleep data to decades of electronic health records spanning more than 1,000 disease categories and finds meaningful patterns in about 130 of them.

  • A novel leave-one-out contrastive learning pretraining objective aligns multimodal representations while tolerating missing or heterogeneous channels during inference.

  • Researchers are pursuing interpretability, aiming to explain the model's predictions, and may integrate wearable data; predictive signals are strongest when multiple modalities are combined rather than drawn from any single channel.

  • The model shows robust generalization over time and across external cohorts, with strong performance on cardiovascular outcomes and mortality.

  • Data come from four primary cohorts, plus SHHS, which is used for transfer learning. Preprocessing resamples signals to 128 Hz and splits them into 5-second tokens, which feed a 1D convolutional encoder, channel-agnostic attention, and a transformer; a 2-layer LSTM head handles downstream prediction.

  • Modality analyses show that combining all data streams yields the best predictive power, though certain sleep stages and modalities offer condition-specific advantages (e.g., REM for certain conditions, ECG for circulatory conditions, respiratory signals for metabolic risks).

  • Limitations include selection bias (participants referred for sleep disorders), potential temporal degradation in performance, and challenges in interpreting which sleep features drive predictions; translation to clinical practice requires caution.

  • Comparisons among SleepFM variants indicate that the pretrained embeddings supply most of the predictive power, with SleepFM-LSTM often delivering the strongest performance across conditions.

  • SleepFM is trained on roughly 585,000 hours of PSG data from about 65,000 participants, learning the language of sleep from five-second segments across EEG, ECG, EMG, and respiratory signals.
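
The preprocessing described in the points above (resampling to 128 Hz, then cutting each night into 5-second tokens) can be sketched as follows. This is a minimal illustration, not the researchers' pipeline: the function names are hypothetical, and linear interpolation is an assumed resampling method, since the summary does not say which one is used.

```python
import numpy as np

TARGET_HZ = 128        # sampling rate named in the summary
SEGMENT_SECONDS = 5    # token length named in the summary
SAMPLES_PER_TOKEN = TARGET_HZ * SEGMENT_SECONDS  # 640 samples per token


def resample_linear(signal, source_hz, target_hz=TARGET_HZ):
    """Resample a 1-D signal by linear interpolation.

    Assumption: the summary does not state the resampling method;
    linear interpolation is used here only to keep the sketch simple.
    """
    signal = np.asarray(signal, dtype=float)
    duration = len(signal) / source_hz
    n_out = int(round(duration * target_hz))
    t_in = np.arange(len(signal)) / source_hz
    t_out = np.arange(n_out) / target_hz
    return np.interp(t_out, t_in, signal)


def segment_tokens(signal):
    """Split a 128 Hz signal into non-overlapping 5-second tokens.

    Any trailing partial segment is dropped (an assumption).
    """
    n_tokens = len(signal) // SAMPLES_PER_TOKEN
    trimmed = signal[: n_tokens * SAMPLES_PER_TOKEN]
    return trimmed.reshape(n_tokens, SAMPLES_PER_TOKEN)


# Synthetic 12-second channel recorded at 256 Hz (made-up data).
raw = np.sin(2 * np.pi * 1.0 * np.arange(12 * 256) / 256)
tokens = segment_tokens(resample_linear(raw, source_hz=256))
print(tokens.shape)  # (2, 640): two full 5-second tokens, 2 s discarded
```

At these settings one hour of recording yields 720 tokens per channel, which is the unit the encoder would consume.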

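The leave-one-out contrastive objective mentioned above can be read as follows: each modality's embedding of a segment is pulled toward the average embedding of the remaining modalities for that same segment and pushed away from other segments in the batch. The sketch below is one InfoNCE-style interpretation of that idea, not the published formulation; the function name, the temperature value, and the modality keys are all assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row to unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def leave_one_out_contrastive_loss(embeddings, temperature=0.1):
    """Hypothetical leave-one-out contrastive loss.

    embeddings: dict mapping modality name -> array of shape (batch, dim),
    with row i of every array describing the same 5-second segment.
    Only the modalities present in the dict are used, so a recording
    with a missing channel can simply omit that key.
    """
    names = list(embeddings)
    if len(names) < 2:
        raise ValueError("need at least two modalities")
    z = {m: l2_normalize(np.asarray(embeddings[m], dtype=float)) for m in names}

    losses = []
    for held_out in names:
        # Positive target: mean embedding of all *other* modalities.
        others = [z[m] for m in names if m != held_out]
        target = l2_normalize(np.mean(others, axis=0))
        # Similarity of every held-out row to every target row.
        logits = z[held_out] @ target.T / temperature
        logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
        log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
        # Matching segments sit on the diagonal.
        losses.append(-np.mean(np.diag(log_probs)))
    return float(np.mean(losses))


# Synthetic demo: three modalities sharing one latent signal (made-up data).
rng = np.random.default_rng(0)
shared = rng.normal(size=(8, 16))
aligned = {m: shared + 0.05 * rng.normal(size=shared.shape)
           for m in ("eeg", "ecg", "resp")}
# Break the row alignment of one modality to simulate mismatched segments.
misaligned = dict(aligned, ecg=aligned["ecg"][::-1])

aligned_loss = leave_one_out_contrastive_loss(aligned)
misaligned_loss = leave_one_out_contrastive_loss(misaligned)
print(aligned_loss < misaligned_loss)
```

Because the target is an average over whichever modalities are present, the same function runs unchanged when one channel is absent, which is one plausible way the tolerance for missing channels could arise.
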
Summary based on 6 sources

