---
license: cc0-1.0
---
# Jasmine Diffusion Checkpoint
This checkpoint is a pretrained diffusion-based world model from the Jasmine codebase.
It was trained on the CoinRun dataset for action-conditioned video generation, using the diffusion-forcing objective (Chen et al., 2024).
## Model Details
- Architecture: ST-DiT (spatio-temporal diffusion transformer)
- Input: sequences of 16 frames (64×64 each), plus latent actions
- Training Environment: CoinRun (Cobbe et al., 2020)
- Objective: Diffusion forcing with x-prediction (the network predicts the clean frames rather than the added noise)
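
To make the objective concrete, here is a minimal NumPy sketch of one diffusion-forcing training loss with x-prediction. It is an illustration, not the Jasmine implementation: the `denoiser` callable and its signature, the linear interpolation between clean frames and noise, the uniform sampling of noise levels, the unweighted mean-squared error, the 3 color channels, and the action dimension of 8 are all assumptions made for this example. The part that matches the description above is that every frame in the 16-frame sequence draws its own noise level, and the target is the clean frame.

```python
import numpy as np


def diffusion_forcing_loss(denoiser, frames, actions, rng):
    """Sketch of a diffusion-forcing loss with x-prediction.

    frames:   (batch, time, height, width, channels) clean video
    actions:  (batch, time, action_dim) latent actions (layout is an assumption)
    denoiser: hypothetical callable (noisy_frames, noise_levels, actions) -> frames
    """
    batch, num_frames = frames.shape[:2]

    # Diffusion forcing: sample an independent noise level per frame,
    # rather than one shared level for the whole sequence.
    noise_levels = rng.uniform(size=(batch, num_frames))
    levels = noise_levels[:, :, None, None, None]  # broadcast over H, W, C

    # Assumed corruption process: linear interpolation toward Gaussian noise.
    noise = rng.standard_normal(frames.shape)
    noisy_frames = (1.0 - levels) * frames + levels * noise

    # x-prediction: the network outputs an estimate of the clean frames,
    # so the regression target is `frames`, not `noise`.
    predicted = denoiser(noisy_frames, noise_levels, actions)
    return float(np.mean((predicted - frames) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = rng.uniform(size=(2, 16, 64, 64, 3))   # 16 frames of 64x64
    actions = rng.standard_normal((2, 16, 8))       # hypothetical action_dim = 8

    # Stand-in "model" that returns its noisy input unchanged.
    identity = lambda noisy, levels, acts: noisy
    print(diffusion_forcing_loss(identity, frames, actions, rng))
```

In a real training loop the denoiser would be the ST-DiT, and the loss would be differentiated with respect to its parameters; the stand-in here only shows the shapes and the target.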