| --- |
| license: apache-2.0 |
| library_name: world_engine |
| tags: |
| - world-model |
| - interactive-video |
| - generative-worlds |
| - real-time |
| - consumer-gpu |
| - diffusion |
| - transformer |
| --- |
| |
| <video src="https://huggingface.co/Overworld/Waypoint-1.5-1B/resolve/main/assets/wp_1.5.mp4" controls autoplay loop muted playsinline width="100%"></video> |
|
|
| # Waypoint-1.5-1B |
|
|
| Waypoint-1.5-1B is the smallest dense model in Overworld’s Waypoint-1.5 family of real-time interactive video world models. The family is designed for local, real-time generation on consumer hardware, from the latest RTX 50 series cards to older RTX 30 series cards. |
|
|
| ## Model Details |
|
|
| - **Developed by:** Overworld |
| - **Model type:** Real-time interactive video world model |
| - **Model family:** [Waypoint-1.5](https://huggingface.co/collections/Overworld/waypoint-15) |
| - **Parameter count:** 1.2B |
| - **Context length / frame context:** 512 frames |
| - **Input modalities:** Starting image or video conditioning, keyboard / mouse inputs |
| - **Output:** Interactive generated video frames / world rollout |
| - **License:** Apache 2.0 |
| - **Paper:** Coming soon |
| - **Streaming Demo:** [Overworld Stream](https://www.overworld.stream/) |
| - **Desktop Client:** [Biome](https://over.world/install) |
| - **Core Inference Library:** [Overworldai/world_engine](https://github.com/Wayfarer-Labs/world_engine) |
|
|
| ## Model Summary |
|
|
| Waypoint-1.5 is Overworld’s next-generation real-time video world model release. It builds on the original Waypoint-1 release by improving visual fidelity, expanding the range of consumer hardware that can run the model, and pushing further toward responsive, interactive world simulation without datacenter-scale compute. |
|
|
| At the family level, Waypoint-1.5 targets real-time generation at up to **720p and 60 FPS**, and introduces **two model tiers**: a **720p** model for higher-performance systems and a [**360p** model](https://huggingface.co/Overworld/Waypoint-1.5-1B-360P) intended to run smoothly across a broader range of gaming PCs and Apple Silicon Macs. The release was also trained on **substantially more data than Waypoint-1**, improving coherence and motion consistency over longer interactions. |
|
|
| ## What makes Waypoint-1.5 different |
|
|
| Waypoint-1.5 is built around a simple product constraint: generative worlds should be usable as **interactive systems**, not just watched as offline demos. |
|
|
| Compared with a conventional video generation workflow, the Waypoint family is designed for: |
|
|
| - **Real-time interaction** rather than offline batch generation |
| - **Low-latency responsiveness** to user inputs |
| - **Local execution** on consumer hardware |
| - **Persistent world rollouts** where coherence across time matters as much as single-frame fidelity |
|
|
| In practice, this means the model is intended to be used inside an interactive runtime that conditions generation on previous frames and live control inputs. |
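The conditioning pattern described above can be sketched as a rolling-context loop. This is a minimal illustration, not the world_engine API: `Controls`, `rollout`, and `stub_step` are hypothetical names, frames are plain strings rather than image tensors, and the stub stands in for a real model call.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Iterable, List, Sequence, Tuple


@dataclass(frozen=True)
class Controls:
    """One tick of user input (hypothetical layout, not the world_engine format)."""
    keys: Tuple[str, ...] = ()
    mouse_dx: float = 0.0
    mouse_dy: float = 0.0


Frame = str  # placeholder; a real runtime would hold image tensors
StepFn = Callable[[Sequence[Frame], Controls], Frame]


def rollout(seed: Frame, inputs: Iterable[Controls], step: StepFn,
            context_len: int) -> Tuple[List[Frame], List[Frame]]:
    """Generate one frame per control input, keeping a bounded frame history."""
    context: Deque[Frame] = deque([seed], maxlen=context_len)
    generated: List[Frame] = []
    for controls in inputs:
        frame = step(tuple(context), controls)  # condition on history + live input
        context.append(frame)                   # oldest frame drops out when full
        generated.append(frame)
    return generated, list(context)


def stub_step(context: Sequence[Frame], controls: Controls) -> Frame:
    """Stand-in for a model call: records what it was conditioned on."""
    return f"t{len(context)}:{'+'.join(controls.keys) or 'idle'}"


inputs = [Controls(keys=("w",)), Controls(), Controls(keys=("w", "a"), mouse_dx=3.0)]
frames, history = rollout("seed", inputs, stub_step, context_len=2)
print(frames)   # ['t1:w', 't2:idle', 't2:w+a']
print(history)  # ['t2:idle', 't2:w+a']
```

The bounded `deque` is the design point: each new frame sees only the most recent `context_len` frames, so memory and per-step cost stay constant however long the session runs.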
|
|
| ## Intended Use |
|
|
| This model is intended for: |
|
|
| - Research on real-time world models and interactive video generation |
| - Prototyping AI-native game and simulation experiences |
| - Creative tools for interactive environments, world exploration, and live generative scenes |
| - Experimentation with low-latency generative systems on local hardware |
| - Education and research into control-conditioned video generation |
|
|
| ## Out-of-Scope Use |
|
|
| This model is **not** intended for: |
|
|
| - Generating illegal content or content that exploits, sexualizes, or endangers minors |
| - Generating non-consensual sexual content or explicit sexual content where prohibited |
| - Impersonation, harassment, or deceptive identity-based content |
| - Generating copyrighted characters, branded IP, or celebrity likenesses in ways that infringe rights or violate platform rules |
| - Safety-critical decision-making, surveillance, or high-stakes automated systems |
| - Any deployment that removes reasonable safeguards while serving end users at scale |
|
|
| ## Usage |
|
|
| This checkpoint is intended to be used with Overworld’s interactive runtime stack. |
|
|
| - Play on our official desktop client, [Biome](https://over.world/install) |
| - Use our [world_engine](https://github.com/Wayfarer-Labs/world_engine) inference library to build your own applications |
|
|
|
|
| ### Recommended setup |
|
|
| - **Recommended GPU / device:** RTX 5090 |
| - **Expected FPS on reference hardware:** 56 FPS (4-step, unquantized, on an RTX 5090) |
| - **Supported GPUs:** Desktop RTX 30 Series and later. On weaker hardware, use the 360p checkpoint, [Overworld/Waypoint-1.5-1B-360P](https://huggingface.co/Overworld/Waypoint-1.5-1B-360P), instead. |
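One way to apply the guidance above when fetching weights is sketched below. The `pick_checkpoint` heuristic (parsing an RTX series number from a device name and treating laptop parts as weaker hardware) is an assumption made for illustration, not an official compatibility rule. `download_checkpoint` uses `snapshot_download` from the `huggingface_hub` package, which needs to be installed and requires network access.

```python
import re

REPO_720P = "Overworld/Waypoint-1.5-1B"
REPO_360P = "Overworld/Waypoint-1.5-1B-360P"


def pick_checkpoint(gpu_name: str) -> str:
    """Return a repo id for a GPU name string (heuristic, not an official rule)."""
    name = gpu_name.lower()
    series = re.search(r"rtx\s*(\d{2})\d{2}", name)  # e.g. "rtx 5090" -> "50"
    is_desktop = "laptop" not in name and "mobile" not in name
    if series and int(series.group(1)) >= 30 and is_desktop:
        return REPO_720P
    return REPO_360P  # fallback for anything else, including Apple Silicon


def download_checkpoint(repo_id: str) -> str:
    """Download every file in the repo and return the local cache directory."""
    from huggingface_hub import snapshot_download  # lazy import; needs network
    return snapshot_download(repo_id=repo_id)


for gpu in ("NVIDIA GeForce RTX 5090", "NVIDIA GeForce RTX 2080 Ti", "Apple M2 Max"):
    print(gpu, "->", pick_checkpoint(gpu))
```

The downloaded files are meant to be loaded through Biome or world_engine; this sketch only covers selecting and fetching the checkpoint.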
|
|
|
|
| ### Architecture |
|
|
| - **Backbone:** Autoregressive Diffusion Transformer |
| - **Autoencoder:** [Tiny Hunyuan Autoencoder (taehv1_5)](https://github.com/madebyollin/taehv) — 4x temporal compression, 8x spatial compression, 32 latent channels |
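To make the autoencoder's compression factors concrete, the sketch below computes the latent tensor size for a clip. It assumes 720p means 1280×720 RGB frames and that each dimension is simply divided by its factor and rounded up; the real autoencoder may treat the first frame or padding differently, so the result is an estimate.

```python
import math
from typing import Tuple

TEMPORAL_FACTOR = 4   # 4x temporal compression (from the model card)
SPATIAL_FACTOR = 8    # 8x spatial compression per axis
LATENT_CHANNELS = 32  # latent channels


def latent_shape(frames: int, height: int, width: int) -> Tuple[int, int, int, int]:
    """(latent_frames, channels, latent_height, latent_width), rounding up."""
    return (
        math.ceil(frames / TEMPORAL_FACTOR),
        LATENT_CHANNELS,
        math.ceil(height / SPATIAL_FACTOR),
        math.ceil(width / SPATIAL_FACTOR),
    )


def compression_ratio(frames: int, height: int, width: int, rgb_channels: int = 3) -> float:
    """Raw RGB values divided by latent values for the same clip."""
    raw = frames * rgb_channels * height * width
    return raw / math.prod(latent_shape(frames, height, width))


# One second of 720p video at 60 FPS (1280x720 is an assumed frame size).
print(latent_shape(60, 720, 1280))       # (15, 32, 90, 160)
print(compression_ratio(60, 720, 1280))  # 24.0
```

When the dimensions divide evenly, the ratio is (4 × 8 × 8 × 3) / 32 = 24, so the transformer works on roughly 24 times fewer values than the raw pixels.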
|
|
| ### Training Data |
|
|
| Waypoint-1.5 was trained on **nearly 100× more data than Waypoint-1**, with the release emphasizing better coherence, motion consistency, and broader hardware accessibility. |
|
|
| ## Performance |
|
|
| ### Waypoint 1 vs Waypoint 1.5 |
|
|
| | | Waypoint 1 | Waypoint 1.5 | |
| |---|---|---| |
| | Resolution | 360P | 720P | |
| | Context window | 2 seconds | 10 seconds | |
| | 4-step unquantized (5090) | 20 FPS | 56 FPS | |
| | 4-step w8a8 quantized (5090) | N/A | 72 FPS | |
| | 4-step w8a8 (3090) | N/A | 30 FPS | |
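The table's frame rates can be turned into per-frame time budgets and relative throughput with a few lines of arithmetic. The pixel dimensions here (640×360 for 360P, 1280×720 for 720P) are assumed 16:9 sizes that the table does not state; the FPS values are taken from the table.

```python
from typing import Dict, Tuple

# (width, height, fps); widths and heights are assumed 16:9 sizes.
ROWS: Dict[str, Tuple[int, int, int]] = {
    "Waypoint 1, unquantized, 5090": (640, 360, 20),
    "Waypoint 1.5, unquantized, 5090": (1280, 720, 56),
    "Waypoint 1.5, w8a8, 5090": (1280, 720, 72),
    "Waypoint 1.5, w8a8, 3090": (1280, 720, 30),
}


def frame_budget_ms(fps: float) -> float:
    """Wall-clock time available per frame at a given frame rate."""
    return 1000.0 / fps


def pixels_per_second(width: int, height: int, fps: float) -> float:
    """Generated pixels per second, a rough throughput measure."""
    return width * height * fps


for label, (w, h, fps) in ROWS.items():
    print(f"{label}: {frame_budget_ms(fps):.1f} ms/frame, "
          f"{pixels_per_second(w, h, fps) / 1e6:.1f} Mpx/s")

old = ROWS["Waypoint 1, unquantized, 5090"]
new = ROWS["Waypoint 1.5, unquantized, 5090"]
fps_gain = new[2] / old[2]
pixel_gain = pixels_per_second(*new) / pixels_per_second(*old)
print(f"FPS gain: {fps_gain:.1f}x, pixel throughput gain: {pixel_gain:.1f}x")
```

Under these assumptions, the unquantized 5090 row is a 2.8× gain in frame rate but about 11.2× in pixels per second, because each 720P frame has four times the pixels of a 360P frame.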
|
|
|  |
|
|
| ## Limitations |
|
|
| This model has important limitations. |
|
|
| - It is a generative world model, not a simulator with guaranteed physical accuracy. |
| - Long interactive rollouts may drift, collapse, or become inconsistent. |
| - The model may produce unstable geometry, object persistence failures, or implausible motion. |
| - Performance is hardware-dependent and may vary significantly by runtime stack and settings. |
| - Safety mitigations available in hosted deployments may not transfer fully to raw checkpoint use. |
| - Outputs may reflect biases, omissions, or unsafe patterns present in training data or learned world priors. |
|
|
| ## Safety |
|
|
| For details, see our blog post, ["Engineering Safety for Interactive World Models"](https://over.world/blog/engineering-safety-for-interactive-world-models). |
|
|
| ## Contact |
|
|
| - [Website](http://over.world/) |
| - [Discord](https://discord.gg/MEmQa7Wux4) |
| - [X/Twitter](https://x.com/overworld_ai) |