---
license: apache-2.0
library_name: world_engine
tags:
- world-model
- interactive-video
- generative-worlds
- real-time
- consumer-gpu
- diffusion
- transformer
---

# Waypoint-1.5-1B

Waypoint-1.5-1B is the smallest dense model in Overworld’s Waypoint-1.5 family of real-time interactive video world models. Waypoint-1.5 is designed around local, real-time generation on consumer hardware, ranging from the most advanced RTX 50 series cards to older RTX 30 series cards.

## Model Details

- **Developed by:** Overworld
- **Model type:** Real-time interactive video world model
- **Model family:** [Waypoint-1.5](https://huggingface.co/collections/Overworld/waypoint-15)
- **Parameter count:** 1.2B
- **Context length / frame context:** 512 frames
- **Input modalities:** Starting image or video conditioning, keyboard / mouse inputs
- **Output:** Interactive generated video frames / world rollout
- **License:** Apache 2.0
- **Paper:** Coming soon
- **Streaming Demo:** [Overworld Stream](https://www.overworld.stream/)
- **Desktop Client:** [Biome](https://over.world/install)
- **Core Inference Library:** [Overworldai/world_engine](https://github.com/Wayfarer-Labs/world_engine)

## Model Summary

Waypoint-1.5 is Overworld’s next-generation real-time video world model release. It builds on the original Waypoint-1 release by improving visual fidelity, expanding the range of consumer hardware that can run the model, and pushing further toward responsive, interactive world simulation without datacenter-scale compute.

At the family level, Waypoint-1.5 targets real-time generation at up to **720p and 60 FPS**, and introduces **two model tiers**: a **720p** model for higher-performance systems and a [**360p** model](https://huggingface.co/Overworld/Waypoint-1.5-1B-360P) intended to run smoothly across a broader range of gaming PCs and Apple Silicon Macs.
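To make the 60 FPS target concrete, the sketch below converts frame rates into a per-frame time budget and checks the throughput figures reported in this card's Performance table against that target. The helper names (`frame_budget_ms`, `meets_target`) are hypothetical and not part of `world_engine`, and the result is an average time per frame derived from throughput, not a measurement of input-to-display latency.

```python
def frame_budget_ms(fps: float) -> float:
    """Return the average time available to produce one frame, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def meets_target(measured_fps: float, target_fps: float = 60.0) -> bool:
    """Return True when the measured throughput reaches the target frame rate."""
    return measured_fps >= target_fps


# Family-level target stated in this card.
TARGET_FPS = 60.0

# Throughput figures copied from the Performance table in this card.
# Note that the Waypoint 1 row was measured at 360P, the others at 720P.
REPORTED_FPS = {
    "Waypoint 1, 4-step unquantized (5090)": 20.0,
    "Waypoint 1.5, 4-step unquantized (5090)": 56.0,
    "Waypoint 1.5, 4-step w8a8 (5090)": 72.0,
    "Waypoint 1.5, 4-step w8a8 (3090)": 30.0,
}

for label, fps in REPORTED_FPS.items():
    status = "meets" if meets_target(fps, TARGET_FPS) else "below"
    print(f"{label}: {frame_budget_ms(fps):.1f} ms/frame ({status} {TARGET_FPS:.0f} FPS target)")
```

A 60 FPS target leaves roughly 16.7 ms per frame. Of the reported configurations, only the w8a8 quantized run on the RTX 5090 clears that budget.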
The release was also trained on **substantially more data than Waypoint-1**, improving coherence and motion consistency over longer interactions.

## What makes Waypoint-1.5 different

Waypoint-1.5 is built around a simple product constraint: generative worlds should be usable as **interactive systems**, not just watched as offline demos.

Compared with a conventional video generation workflow, the Waypoint family is designed for:

- **Real-time interaction** rather than offline batch generation
- **Low-latency responsiveness** to user inputs
- **Local execution** on consumer hardware
- **Persistent world rollouts** where coherence across time matters as much as single-frame fidelity

In practice, this means the model is intended to be used inside an interactive runtime that can condition generation on previous frames and live control inputs.

## Intended Use

This model is intended for:

- Research on real-time world models and interactive video generation
- Prototyping AI-native game and simulation experiences
- Creative tools for interactive environments, world exploration, and live generative scenes
- Experimentation with low-latency generative systems on local hardware
- Education and research into control-conditioned video generation

## Out-of-Scope Use

This model is **not** intended for:

- Generating illegal content or content that exploits, sexualizes, or endangers minors
- Generating non-consensual sexual content or explicit sexual content where prohibited
- Impersonation, harassment, or deceptive identity-based content
- Generating copyrighted characters, branded IP, or celebrity likenesses in ways that infringe rights or violate platform rules
- Safety-critical decision-making, surveillance, or high-stakes automated systems
- Any deployment that removes reasonable safeguards while serving end users at scale

## Usage

This checkpoint is intended to be used with Overworld’s interactive runtime stack.

- Play on our official desktop client, [Biome](https://over.world/install)
- Use our [world_engine](https://github.com/Wayfarer-Labs/world_engine) inference library to build your own applications

### Recommended setup

- **Recommended GPU / device:** RTX 5090
- **Expected FPS on reference hardware:** 56 FPS
- **Supported GPUs:** Desktop RTX 30 Series and later. For weaker hardware, you may run [Overworld/Waypoint-1.5-1B-360P](https://huggingface.co/Overworld/Waypoint-1.5-1B-360P)

### Architecture

- **Backbone:** Autoregressive Diffusion Transformer
- **Autoencoder:** [Tiny Hunyuan Autoencoder (taehv1_5)](https://github.com/madebyollin/taehv): 4x temporal compression, 8x spatial compression, 32 latent channels

### Training Data

Waypoint-1.5 was trained on **nearly 100× more data than Waypoint-1**, with the release emphasizing better coherence, motion consistency, and broader hardware accessibility.

## Performance

### Waypoint 1 vs Waypoint 1.5

| | Waypoint 1 | Waypoint 1.5 |
|---|---|---|
| Resolution | 360P | 720P |
| Context window | 2 seconds | 10 seconds |
| 4-step unquantized (5090) | 20 FPS | 56 FPS |
| 4-step w8a8 quantized (5090) | N/A | 72 FPS |
| 4-step w8a8 (3090) | N/A | 30 FPS |

![Generation Throughput: Waypoint 1 vs Waypoint 1.5](assets/perf_chart.png)

## Limitations

This model has important limitations.

- It is a generative world model, not a simulator with guaranteed physical accuracy.
- Long interactive rollouts may drift, collapse, or become inconsistent.
- The model may produce unstable geometry, object persistence failures, or implausible motion.
- Performance is hardware-dependent and may vary significantly by runtime stack and settings.
- Safety mitigations available in hosted deployments may not transfer fully to raw checkpoint use.
- Outputs may reflect biases, omissions, or unsafe patterns present in training data or learned world priors.
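As a rough illustration of the autoencoder figures listed under Architecture (4x temporal compression, 8x spatial compression, 32 latent channels), the sketch below computes the latent grid implied by those factors. Several things here are assumptions rather than facts from this card: that 720p means 1280×720 RGB frames, that input dimensions must divide evenly by the compression factors, and the helper names themselves, which are hypothetical and not part of `world_engine` or `taehv`. How the actual autoencoder handles padding or boundary frames is not covered.

```python
from typing import NamedTuple

# Compression factors and channel count as listed in this card.
TEMPORAL_FACTOR = 4
SPATIAL_FACTOR = 8
LATENT_CHANNELS = 32

# Assumption: input frames are 3-channel RGB.
RGB_CHANNELS = 3


class LatentShape(NamedTuple):
    frames: int
    channels: int
    height: int
    width: int


def latent_shape(num_frames: int, height: int, width: int) -> LatentShape:
    """Return the latent grid implied by the listed compression factors.

    Assumption: inputs divide evenly; real padding behavior is not modeled.
    """
    if num_frames % TEMPORAL_FACTOR:
        raise ValueError("num_frames must be a multiple of the temporal factor")
    if height % SPATIAL_FACTOR or width % SPATIAL_FACTOR:
        raise ValueError("height and width must be multiples of the spatial factor")
    return LatentShape(
        frames=num_frames // TEMPORAL_FACTOR,
        channels=LATENT_CHANNELS,
        height=height // SPATIAL_FACTOR,
        width=width // SPATIAL_FACTOR,
    )


def compression_ratio(num_frames: int, height: int, width: int) -> float:
    """Ratio of input pixel values to latent values for one clip."""
    shape = latent_shape(num_frames, height, width)
    input_values = num_frames * RGB_CHANNELS * height * width
    latent_values = shape.frames * shape.channels * shape.height * shape.width
    return input_values / latent_values


# Assumption: 720p is 1280x720. Four frames map to one latent frame.
print(latent_shape(4, 720, 1280))
print(compression_ratio(4, 720, 1280))  # 24.0
```

Under these assumptions, every four 1280×720 frames become a single 32×90×160 latent, which is 24 times fewer values than the raw RGB input. The diffusion transformer operates on that smaller grid, which is part of what makes real-time generation on consumer GPUs feasible.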

## Safety

Please see our blog post, ["Engineering Safety for Interactive World Models"](https://over.world/blog/engineering-safety-for-interactive-world-models) for details.

## Contact

- [Website](http://over.world/)
- [Discord](https://discord.gg/MEmQa7Wux4)
- [X/Twitter](https://x.com/overworld_ai)