GEMEO world-model: initial release (module + NeuralSurv ckpt + RareBench v49 + KG embeddings)
089d665 verified
"""GEMEO-CWM: Causal World Model via Block Diffusion + Classifier-Free Guidance.

Target SOTA (May 2026):
  - First Block Diffusion + CFG on clinical EHR trajectories
  - Rare-disease + PT-BR (no incumbent competing in this niche)
  - TTE-validated against >=5 Brazilian PCDT natural experiments
  - Full Robins-Hernan sensitivity suite (E-values, negative controls, tipping-point)
  - On-device (Apple Silicon): LGPD-compliant, no cloud inference

Beats:
  - EHRWorld (arXiv 2602.03569): no rare disease, no real counterfactual
  - medDreamer (arXiv 2505.19785): ICU only, no CFG
  - TA-G-Transformer (Helsinki): no diffusion, no PT-BR rare cohort
  - ICOM (TechRxiv 2601): no released code
  - PROCOVA (Unlearn.ai): only AD/ALS/IBD covered

Module layout:
  block_diffusion.py  model architecture (absorbing-state + block-causal)
  train_cwm.py        training loop with conditional dropout for CFG
  cfg_sample.py       classifier-free guided sampling + counterfactual rollouts
  tte_validate.py     target-trial emulation against PCDT natural experiments
  sensitivity.py      E-values, negative controls, tipping-point analysis
  data.py             event-stream loader from DATASUS SIH/APAC/SIM JSONs
"""
from .block_diffusion import BlockDiffusionTransformer, CWMConfig
from .train_cwm import train_cwm
from .cfg_sample import cfg_sample, counterfactual_pair
from .tte_validate import emulate_trial, ate_with_ci
__all__ = [
    "BlockDiffusionTransformer", "CWMConfig",
    "train_cwm",
    "cfg_sample", "counterfactual_pair",
    "emulate_trial", "ate_with_ci",
]
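
The docstring mentions conditional dropout during training and classifier-free guided sampling, but this file does not show how either works. The sketch below illustrates the general technique on plain Python lists. It is not the code in `train_cwm.py` or `cfg_sample.py`: the helper names `drop_condition` and `cfg_logits`, the null token, and the guidance convention (scale 0 gives unconditional logits, scale 1 gives conditional logits) are all assumptions made for illustration.

```python
import math
import random


def drop_condition(cond_tokens, null_token, p_drop, rng):
    """Conditional dropout (hypothetical helper).

    With probability p_drop, replace the whole conditioning sequence with a
    null token so the same network also learns the unconditional distribution.
    """
    if rng.random() < p_drop:
        return [null_token] * len(cond_tokens)
    return list(cond_tokens)


def cfg_logits(cond_logits, uncond_logits, guidance_scale):
    """Classifier-free guidance (hypothetical helper).

    Move from the unconditional logits toward the conditional ones:
    scale 0 -> unconditional, scale 1 -> conditional, scale > 1 -> extrapolate.
    """
    return [
        u + guidance_scale * (c - u)
        for c, u in zip(cond_logits, uncond_logits)
    ]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy logits over a 3-token vocabulary (made-up numbers, not model output).
cond = [2.0, 0.5, -1.0]
uncond = [1.0, 1.0, 0.0]

guided = cfg_logits(cond, uncond, guidance_scale=2.0)
probs = softmax(guided)
```

With a guidance scale above 1, tokens the condition favors gain probability relative to the purely conditional prediction. A counterfactual pair could then be produced by running the same sampler under two different conditions, though how `counterfactual_pair` actually does this is not shown in this file.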