---
license: mit
tags:
- world-models
- reinforcement-learning
- robotics
- video-prediction
- dreamer
---

<div align="center">

# MMBench2 World Model Checkpoints

<h3 align="center"><a href="https://www.nicklashansen.com/mmbench2">Hallucination in World Models is Predictable and Preventable</a></h3>

[Nicklas Hansen](https://www.nicklashansen.com) · [Xiaolong Wang](https://xiaolonw.github.io) · UC San Diego

[Project page](https://www.nicklashansen.com/mmbench2)
[Live demo](https://www.nicklashansen.com/mmbench2/#live-demo)
[Dataset](https://huggingface.co/datasets/nicklashansen/mmbench2)
[License: MIT](https://opensource.org/licenses/MIT)

</div>

---

The world model follows the architecture and two-stage training recipe of [Dreamer 4](https://arxiv.org/abs/2509.24527), adapted for large-scale multi-task continuous control, and is trained on **MMBench2**, a 427-hour, 210-task dataset for visual world modeling (see the [dataset repository](https://huggingface.co/datasets/nicklashansen/mmbench2)). Each variant is a `(tokenizer.pt, dynamics.pt)` pair at 224×224 resolution:

- **tokenizer**: a causal video tokenizer (50M-parameter encoder + 50M-parameter decoder, projecting to a 64-dim continuous latent).
- **dynamics**: a 250M-parameter block-causal Transformer trained on top of the frozen tokenizer with a shortcut flow-matching objective.

## Variants

| Variant | Description |
|---------|-------------|
| `base` | Pretrained world model (200 tasks) |
| `coverage_aware` | Coverage-aware finetuned world model (200 tasks) |
| `combined` | `coverage_aware` finetuned with all targeted data collection sources (210 tasks) |

## Repository layout

```
base/            tokenizer.pt  dynamics.pt
coverage_aware/  tokenizer.pt  dynamics.pt
combined/        tokenizer.pt  dynamics.pt
```
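
After downloading, it can be useful to confirm that a local copy matches this layout before launching anything. The following is a minimal sketch using only the Python standard library. The helper name `missing_files` is hypothetical and not part of the code release, and the sketch only checks that the files exist, not that their contents are valid.

```python
from pathlib import Path

# Variant and file names as listed in the repository layout above.
VARIANTS = ("base", "coverage_aware", "combined")
FILES = ("tokenizer.pt", "dynamics.pt")


def missing_files(root, variant):
    """Return the checkpoint files that are absent under <root>/<variant>/.

    Hypothetical helper for illustration; it does not open or validate
    the checkpoints themselves.
    """
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant {variant!r}; expected one of {VARIANTS}")
    return [name for name in FILES if not (Path(root) / variant / name).is_file()]
```

For example, `missing_files("./checkpoints", "combined")` returns an empty list once both files of the `combined` variant are in place.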

## Usage

Using the accompanying code release:

```
cd dreamer4
python download_checkpoints.py --variant combined  # or: base | coverage_aware | all
./run_interactive.sh combined                      # launch the interactive interface
```

`download_checkpoints.py` fetches the `(tokenizer.pt, dynamics.pt)` pair into `./checkpoints/<variant>/`. Alternatively, download directly with the Hugging Face CLI:

```
hf download nicklashansen/mmbench2-models --include "combined/*" --local-dir ./checkpoints
```
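
The same download can likely be scripted from Python with `snapshot_download` from the `huggingface_hub` package. The sketch below assumes the repository id from the CLI command above. The helper names `include_patterns` and `download_variant` are hypothetical, and treating `"all"` as "every variant" mirrors the `--variant all` option rather than reproducing how `download_checkpoints.py` is actually implemented.

```python
# Variant names from the table above; repo id taken from the CLI command above.
VARIANTS = ("base", "coverage_aware", "combined")
REPO_ID = "nicklashansen/mmbench2-models"


def include_patterns(variant):
    """Glob patterns selecting one variant's files, or every variant for "all"."""
    if variant == "all":
        return [f"{v}/*" for v in VARIANTS]
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant {variant!r}; expected one of {VARIANTS} or 'all'")
    return [f"{variant}/*"]


def download_variant(variant, local_dir="./checkpoints"):
    """Download the selected files and return the local directory path."""
    # Imported lazily so include_patterns works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=include_patterns(variant),
        local_dir=local_dir,
    )
```

Calling `download_variant("combined")` should place the files under `./checkpoints/combined/`, matching the CLI command's `--include` and `--local-dir` arguments.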

See the [paper](https://www.nicklashansen.com/mmbench2) and the code release for architecture details, training recipes, and the hallucination detection and mitigation methods.

## License

Released under the MIT License.

## Citation

```bibtex
@article{Hansen2026Hallucination,
  title={Hallucination in World Models is Predictable and Preventable},
  author={Nicklas Hansen and Xiaolong Wang},
  year={2026},
}
```