---
license: apache-2.0
library_name: world_engine
tags:
- world-model
- interactive-video
- generative-worlds
- real-time
- consumer-gpu
- diffusion
- transformer
---
<video src="https://huggingface.co/Overworld/Waypoint-1.5-1B/resolve/main/assets/wp_1.5.mp4" controls autoplay loop muted playsinline width="100%"></video>
# Waypoint-1.5-1B
Waypoint-1.5-1B is the smallest dense model in Overworld’s Waypoint-1.5 family of real-time interactive video world models. Waypoint-1.5 is designed for local, real-time generation on consumer hardware, from the latest RTX 50 series cards to older RTX 30 series cards.
## Model Details
- **Developed by:** Overworld
- **Model type:** Real-time interactive video world model
- **Model family:** [Waypoint-1.5](https://huggingface.co/collections/Overworld/waypoint-15)
- **Parameter count:** 1.2B
- **Context length / frame context:** 512 frames
- **Input modalities:** Starting image or video conditioning, keyboard / mouse inputs
- **Output:** Interactive generated video frames / world rollout
- **License:** Apache 2.0
- **Paper:** Coming soon
- **Streaming Demo:** [Overworld Stream](https://www.overworld.stream/)
- **Desktop Client:** [Biome](https://over.world/install)
- **Core Inference Library:** [Overworldai/world_engine](https://github.com/Wayfarer-Labs/world_engine)
## Model Summary
Waypoint-1.5 is Overworld’s next-generation real-time video world model release. It builds on the original Waypoint-1 release by improving visual fidelity, expanding the range of consumer hardware that can run the model, and pushing further toward responsive, interactive world simulation without datacenter-scale compute.
At the family level, Waypoint-1.5 targets real-time generation at up to **720p and 60 FPS**, and introduces **two model tiers**: a **720p** model for higher-performance systems and a [**360p** model](https://huggingface.co/Overworld/Waypoint-1.5-1B-360P) intended to run smoothly across a broader range of gaming PCs and Apple Silicon Macs. The release was also trained on **substantially more data than Waypoint-1**, improving coherence and motion consistency over longer interactions.
## What makes Waypoint-1.5 different
Waypoint-1.5 is built around a simple product constraint: generative worlds should be usable as **interactive systems**, not just watched as offline demos.
Compared with a conventional video generation workflow, the Waypoint family is designed for:
- **Real-time interaction** rather than offline batch generation
- **Low-latency responsiveness** to user inputs
- **Local execution** on consumer hardware
- **Persistent world rollouts** where coherence across time matters as much as single-frame fidelity
In practice, this means the model is intended to run inside an interactive runtime that conditions generation on previous frames and live control inputs.
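
The conditioning pattern described above can be sketched as a rollout loop with a bounded frame history. This is a minimal illustration only, not the `world_engine` API: `ControlInput`, `stub_next_frame`, and `rollout` are hypothetical names, the "model" is a placeholder that returns a string label instead of pixels, and the default `max_context=512` simply mirrors the frame context listed under Model Details.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlInput:
    """Hypothetical per-frame control state (not a world_engine type)."""
    keys: frozenset = frozenset()
    mouse_dx: float = 0.0
    mouse_dy: float = 0.0


def stub_next_frame(context, control):
    """Placeholder for the model call: returns a label instead of pixels."""
    return f"frame_{len(context)}_keys={sorted(control.keys)}"


def rollout(seed_frame, controls, max_context=512, step_fn=stub_next_frame):
    """Generate one frame per control input, keeping a bounded frame history."""
    # A deque with maxlen drops the oldest frame once the context is full.
    context = deque([seed_frame], maxlen=max_context)
    outputs = []
    for control in controls:
        # Each new frame is conditioned on the retained frames and the live input.
        frame = step_fn(list(context), control)
        context.append(frame)
        outputs.append(frame)
    return outputs, context
```

A real runtime would replace `stub_next_frame` with the actual model call and read `controls` from the keyboard and mouse as frames are produced, rather than from a precomputed list.
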
## Intended Use
This model is intended for:
- Research on real-time world models and interactive video generation
- Prototyping AI-native game and simulation experiences
- Creative tools for interactive environments, world exploration, and live generative scenes
- Experimentation with low-latency generative systems on local hardware
- Education and research into control-conditioned video generation
## Out-of-Scope Use
This model is **not** intended for:
- Generating illegal content or content that exploits, sexualizes, or endangers minors
- Generating non-consensual sexual content or explicit sexual content where prohibited
- Impersonation, harassment, or deceptive identity-based content
- Generating copyrighted characters, branded IP, or celebrity likenesses in ways that infringe rights or violate platform rules
- Safety-critical decision-making, surveillance, or high-stakes automated systems
- Any deployment that removes reasonable safeguards while serving end users at scale
## Usage
This checkpoint is intended to be used with Overworld’s interactive runtime stack.
- Play on our official desktop client, [Biome](https://over.world/install)
- Use our [world_engine](https://github.com/Wayfarer-Labs/world_engine) inference library to build your own applications
### Recommended setup
- **Recommended GPU / device:** RTX 5090
- **Expected FPS on reference hardware:** 56 FPS
- **Supported GPUs:** Desktop RTX 30 Series and later. On weaker hardware, use [Overworld/Waypoint-1.5-1B-360P](https://huggingface.co/Overworld/Waypoint-1.5-1B-360P) instead.
### Architecture
- **Backbone:** Autoregressive Diffusion Transformer
- **Autoencoder:** [Tiny Hunyuan Autoencoder (taehv1_5)](https://github.com/madebyollin/taehv) — 4x temporal compression, 8x spatial compression, 32 latent channels
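
To make the compression factors concrete, the sketch below computes the latent tensor shape implied by 4x temporal compression, 8x spatial compression, and 32 latent channels. It assumes that 720p means 1280×720 RGB frames and that sizes are rounded up with ceiling division; the actual autoencoder may pad or handle the first frame differently, and `latent_shape` is a hypothetical helper, not part of taehv or `world_engine`.

```python
def latent_shape(num_frames, height, width,
                 temporal_factor=4, spatial_factor=8, latent_channels=32):
    """Return (latent_frames, channels, latent_height, latent_width)."""
    def ceil_div(a, b):
        return -(-a // b)

    return (
        ceil_div(num_frames, temporal_factor),
        latent_channels,
        ceil_div(height, spatial_factor),
        ceil_div(width, spatial_factor),
    )


# Four 1280x720 frames map to a single latent frame (assumed resolution).
shape = latent_shape(num_frames=4, height=720, width=1280)
print(shape)  # (1, 32, 90, 160)

# Ratio of raw RGB values to latent values for that group of frames.
raw_values = 4 * 3 * 720 * 1280
latent_values = shape[0] * shape[1] * shape[2] * shape[3]
print(raw_values / latent_values)  # 24.0
```

Under these assumptions, the diffusion backbone works on about 24 times fewer values than the raw pixels, which is part of what makes real-time generation feasible on consumer GPUs.
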
### Training Data
Waypoint-1.5 was trained on **nearly 100× more data than Waypoint-1**, with the release emphasizing better coherence, motion consistency, and broader hardware accessibility.
## Performance
### Waypoint-1 vs Waypoint-1.5
| Metric | Waypoint-1 | Waypoint-1.5 |
|---|---|---|
| Resolution | 360p | 720p |
| Context window | 2 seconds | 10 seconds |
| 4-step, unquantized (RTX 5090) | 20 FPS | 56 FPS |
| 4-step, w8a8 quantized (RTX 5090) | N/A | 72 FPS |
| 4-step, w8a8 quantized (RTX 3090) | N/A | 30 FPS |
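
For an interactive system, frames per second is easier to reason about as a per-frame time budget. The short sketch below converts the figures from the table into milliseconds per frame. It is plain arithmetic on the reported numbers, not a benchmark, and `frame_budget_ms` is a hypothetical helper.

```python
# FPS figures copied from the comparison table above.
fps = {
    "Waypoint-1, unquantized, RTX 5090": 20,
    "Waypoint-1.5, unquantized, RTX 5090": 56,
    "Waypoint-1.5, w8a8, RTX 5090": 72,
    "Waypoint-1.5, w8a8, RTX 3090": 30,
}


def frame_budget_ms(frames_per_second):
    """Average time available per generated frame, in milliseconds."""
    return 1000.0 / frames_per_second


for name, value in fps.items():
    print(f"{name}: {frame_budget_ms(value):.1f} ms per frame")

# Relative throughput of Waypoint-1.5 over Waypoint-1 (unquantized, RTX 5090).
print(fps["Waypoint-1.5, unquantized, RTX 5090"] / fps["Waypoint-1, unquantized, RTX 5090"])  # 2.8
```

At 56 FPS the average budget is roughly 17.9 ms per frame, compared with 50 ms at 20 FPS. The comparison is not like for like, since Waypoint-1.5 renders at 720p and Waypoint-1 at 360p.
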

## Limitations
This model has important limitations.
- It is a generative world model, not a simulator with guaranteed physical accuracy.
- Long interactive rollouts may drift, collapse, or become inconsistent.
- The model may produce unstable geometry, object persistence failures, or implausible motion.
- Performance is hardware-dependent and may vary significantly by runtime stack and settings.
- Safety mitigations available in hosted deployments may not transfer fully to raw checkpoint use.
- Outputs may reflect biases, omissions, or unsafe patterns present in training data or learned world priors.
## Safety
Please see our blog post, ["Engineering Safety for Interactive World Models"](https://over.world/blog/engineering-safety-for-interactive-world-models), for details.
## Contact
- [Website](http://over.world/)
- [Discord](https://discord.gg/MEmQa7Wux4)
- [X/Twitter](https://x.com/overworld_ai)