license: cc0-1.0
Jafar Checkpoint
Pretrained MaskGIT-based world model from the Jafar codebase.
Trained on the CoinRun dataset for action-conditioned video generation, using the Jafar configuration described in the paper.
Model Details
- Architecture: ST-Transformer (spatio-temporal transformer)
- Input: 16-frame sequences (64×64) + latent actions
- Training Environment: CoinRun (Cobbe et al., 2020)
- Objective: MaskGIT (Chang et al., 2022)
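
To illustrate the MaskGIT objective listed above, here is a minimal sketch of the masking step for a single frame: a mask ratio is drawn from the cosine schedule of Chang et al. (2022), that fraction of tokens is replaced by a mask token, and the model would then be trained to predict the original tokens at the masked positions. It is not taken from the Jafar codebase. `PATCH_SIZE`, `VOCAB_SIZE`, `MASK_ID`, and both function names are hypothetical choices made for illustration; only the frame count and resolution come from this card.

```python
import math
import random

NUM_FRAMES = 16     # from this model card
FRAME_SIZE = 64     # from this model card (64x64 pixels)
PATCH_SIZE = 4      # assumption: hypothetical tokenizer patch size
VOCAB_SIZE = 1024   # assumption: hypothetical codebook size
MASK_ID = -1        # assumption: hypothetical id standing in for the mask token


def cosine_mask_ratio(progress):
    """Cosine schedule: fraction of tokens masked for progress in [0, 1]."""
    return math.cos(0.5 * math.pi * progress)


def mask_tokens(tokens, ratio, rng):
    """Replace about ratio * len(tokens) random tokens (at least one) with MASK_ID."""
    n = len(tokens)
    num_masked = max(1, min(n, round(ratio * n)))
    masked_positions = set(rng.sample(range(n), num_masked))
    masked = [MASK_ID if i in masked_positions else tok
              for i, tok in enumerate(tokens)]
    return masked, masked_positions


# One frame becomes a flat grid of discrete tokens (random stand-ins here).
tokens_per_frame = (FRAME_SIZE // PATCH_SIZE) ** 2
rng = random.Random(0)
frame_tokens = [rng.randrange(VOCAB_SIZE) for _ in range(tokens_per_frame)]

# Training-style masking: draw a uniform progress value, map it to a ratio.
ratio = cosine_mask_ratio(rng.random())
masked_tokens, masked_positions = mask_tokens(frame_tokens, ratio, rng)
# A model would be trained with cross-entropy on masked_positions only.
```

Under these assumptions each 64×64 frame yields 256 tokens, so a 16-frame sequence would contain 4096 tokens. The actual tokenizer and patch size used by the checkpoint may differ.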