Waypoint-1.5-1B-360P
Waypoint-1.5-1B-360P is the smallest dense model in Overworld’s Waypoint-1.5 family of real-time interactive video world models, finetuned for 360P generation. It is designed for local, real-time generation on NVIDIA laptop GPUs and (soon) Apple Silicon.
Model Details
- Developed by: Overworld
- Model type: Real-time interactive video world model
- Model family: Waypoint-1.5
- Parameter count: 1.2B
- Context length / frame context: 512 frames
- Input modalities: Starting image or video conditioning, keyboard / mouse inputs
- Output: Interactive generated video frames / world rollout
- License: Apache 2.0
- Paper: Coming soon
- Streaming Demo: Overworld Stream
- Desktop Client: Biome
- Core Inference Library: Overworldai/world_engine
Model Summary
Waypoint-1.5 is Overworld’s next-generation real-time video world model release. It builds on the original Waypoint-1 release by improving visual fidelity, expanding the range of consumer hardware that can run the model, and pushing further toward responsive, interactive world simulation without datacenter-scale compute.
At the family level, Waypoint-1.5 targets real-time generation at up to 720p and 60 FPS, and introduces two model tiers: a 720p model for desktop RTX 30 series through RTX 50 series cards, and this 360P model for laptop GPUs and (soon) Apple Silicon. The release was also trained on substantially more data than Waypoint-1, improving coherence and motion consistency over longer interactions.
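To make the real-time target concrete, the short sketch below converts a frame rate into a per-frame time budget. The helper name `frame_budget_ms` is illustrative and not part of any Overworld library, and the 60 FPS figure is the family-level upper target quoted above, not a measured rate for this checkpoint.

```python
def frame_budget_ms(fps: float) -> float:
    """Return the wall-clock time available per displayed frame, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


if __name__ == "__main__":
    # Compare a few frame rates; 60 is the family-level target quoted above.
    for fps in (30, 60):
        print(f"{fps} FPS leaves {frame_budget_ms(fps):.2f} ms per frame")
```

All work needed to show a frame (reading inputs, model inference, decoding, display) has to fit inside this budget on average for generation to keep pace with playback.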
What makes Waypoint-1.5 different
Waypoint-1.5 is built around a simple product constraint: generative worlds should be usable as interactive systems, not just watched as offline demos.
Compared with a conventional video generation workflow, the Waypoint family is designed for:
- Real-time interaction rather than offline batch generation
- Low-latency responsiveness to user inputs
- Local execution on consumer hardware
- Persistent world rollouts where coherence across time matters as much as single-frame fidelity
In practice, this means the model is intended to be used inside an interactive runtime that can condition generation on previous frames and live control inputs.
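A minimal sketch of what such a runtime loop might look like is shown below. It is not the world_engine API: `predict_next_frame`, `Controls`, and `toy_model` are hypothetical stand-ins, frames are plain lists of floats rather than image tensors, and the 512-frame window is taken from the frame context listed under Model Details.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Iterable, List, Sequence

CONTEXT_FRAMES = 512  # frame context listed under Model Details

Frame = List[float]  # stand-in for an image; a real frame would be a tensor


@dataclass(frozen=True)
class Controls:
    """Hypothetical per-step input: pressed keys and mouse movement."""
    keys: frozenset = frozenset()
    mouse_dx: float = 0.0
    mouse_dy: float = 0.0


def rollout(
    predict_next_frame: Callable[[Sequence[Frame], Controls], Frame],
    seed_frames: Iterable[Frame],
    controls: Iterable[Controls],
    context_frames: int = CONTEXT_FRAMES,
) -> List[Frame]:
    """Generate one frame per control input, conditioned on a rolling window."""
    context: Deque[Frame] = deque(seed_frames, maxlen=context_frames)
    generated: List[Frame] = []
    for step_controls in controls:
        frame = predict_next_frame(list(context), step_controls)
        context.append(frame)  # the oldest frame drops out once the window is full
        generated.append(frame)
    return generated


def toy_model(context: Sequence[Frame], step_controls: Controls) -> Frame:
    """Placeholder model: shifts the most recent frame by the mouse movement."""
    return [value + step_controls.mouse_dx for value in context[-1]]


if __name__ == "__main__":
    frames = rollout(toy_model, [[0.0, 0.0]], [Controls(mouse_dx=1.0)] * 3)
    print(frames)
```

The bounded `deque` keeps memory constant over long sessions: each generated frame is fed back as context for the next step, and anything older than the window no longer influences generation.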
Intended Use
This model is intended for:
- Research on real-time world models and interactive video generation
- Prototyping AI-native game and simulation experiences
- Creative tools for interactive environments, world exploration, and live generative scenes
- Experimentation with low-latency generative systems on local hardware
- Education and research into control-conditioned video generation
Out-of-Scope Use
This model is not intended for:
- Generating illegal content or content that exploits, sexualizes, or endangers minors
- Generating non-consensual sexual content or explicit sexual content where prohibited
- Impersonation, harassment, or deceptive identity-based content
- Generating copyrighted characters, branded IP, or celebrity likenesses in ways that infringe rights or violate platform rules
- Safety-critical decision-making, surveillance, or high-stakes automated systems
- Any deployment that removes reasonable safeguards while serving end users at scale
Usage
This checkpoint is intended to be used with Overworld’s interactive runtime stack.
- Play on our official desktop client, Biome
- Use our world_engine inference library to build your own applications
Architecture
- Backbone: Autoregressive Diffusion Transformer
- Autoencoder: Tiny Hunyuan Autoencoder (taehv1_5) — 4x temporal compression, 8x spatial compression, 32 latent channels
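As a worked example of the compression figures above, the sketch below computes the latent shape for a short clip. It rests on two assumptions that are not stated in this card: that 360P means 640×360 frames, and that the autoencoder maps exactly 4 input frames to 1 latent frame with no extra leading frame. The function and type names are illustrative only.

```python
from typing import NamedTuple

TEMPORAL_COMPRESSION = 4  # from the autoencoder description above
SPATIAL_COMPRESSION = 8
LATENT_CHANNELS = 32


class LatentShape(NamedTuple):
    frames: int
    channels: int
    height: int
    width: int


def latent_shape(num_frames: int, height: int, width: int) -> LatentShape:
    """Latent shape for a clip, assuming exact divisibility on every axis."""
    if (
        num_frames % TEMPORAL_COMPRESSION
        or height % SPATIAL_COMPRESSION
        or width % SPATIAL_COMPRESSION
    ):
        raise ValueError("dimensions must be multiples of the compression factors")
    return LatentShape(
        frames=num_frames // TEMPORAL_COMPRESSION,
        channels=LATENT_CHANNELS,
        height=height // SPATIAL_COMPRESSION,
        width=width // SPATIAL_COMPRESSION,
    )


def compression_ratio(rgb_channels: int = 3) -> float:
    """Input pixel values represented by each latent value."""
    values_in = TEMPORAL_COMPRESSION * SPATIAL_COMPRESSION**2 * rgb_channels
    return values_in / LATENT_CHANNELS


if __name__ == "__main__":
    # Assumed 360P resolution: 640 wide by 360 tall.
    print(latent_shape(num_frames=4, height=360, width=640))
    print(compression_ratio())
```

Under these assumptions, four 640×360 RGB frames map to a single 32×45×80 latent, so the transformer operates on roughly 24 times fewer values than the raw pixels.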
Training Data
Waypoint-1.5 was trained on nearly 100× more data than Waypoint-1. The release emphasizes better coherence, motion consistency, and broader hardware accessibility.
Limitations
This model has important limitations.
- It is a generative world model, not a simulator with guaranteed physical accuracy.
- Long interactive rollouts may drift, collapse, or become inconsistent.
- The model may produce unstable geometry, object persistence failures, or implausible motion.
- Performance is hardware-dependent and may vary significantly by runtime stack and settings.
- Safety mitigations available in hosted deployments may not transfer fully to raw checkpoint use.
- Outputs may reflect biases, omissions, or unsafe patterns present in training data or learned world priors.
Safety
Please see our blog post, "Engineering Safety for Interactive World Models," for details.
Contact