---
license: apache-2.0
library_name: world_engine
tags:
  - world-model
  - interactive-video
  - generative-worlds
  - real-time
  - consumer-gpu
  - diffusion
  - transformer
---

<video src="https://huggingface.co/Overworld/Waypoint-1.5-1B/resolve/main/assets/wp_1.5.mp4" controls autoplay loop muted playsinline width="100%"></video>

# Waypoint-1.5-1B

Waypoint-1.5-1B is the smallest dense model in Overworld’s Waypoint-1.5 family of real-time interactive video world models. Waypoint-1.5 is designed for local, real-time generation on consumer hardware, from the latest RTX 50 series cards to older RTX 30 series cards.

## Model Details

- **Developed by:** Overworld
- **Model type:** Real-time interactive video world model
- **Model family:** [Waypoint-1.5](https://huggingface.co/collections/Overworld/waypoint-15)
- **Parameter count:** 1.2B
- **Context length / frame context:** 512 frames
- **Input modalities:** Starting image or video conditioning, keyboard / mouse inputs
- **Output:** Interactive generated video frames / world rollout
- **License:** Apache 2.0
- **Paper:** Coming soon
- **Streaming Demo:** [Overworld Stream](https://www.overworld.stream/)
- **Desktop Client:** [Biome](https://over.world/install)
- **Core Inference Library:** [Overworldai/world_engine](https://github.com/Wayfarer-Labs/world_engine)

## Model Summary

Waypoint-1.5 is Overworld’s next-generation real-time video world model release. It builds on the original Waypoint-1 release by improving visual fidelity, expanding the range of consumer hardware that can run the model, and pushing further toward responsive, interactive world simulation without datacenter-scale compute.

At the family level, Waypoint-1.5 targets real-time generation at up to **720p and 60 FPS**, and introduces **two model tiers**: a **720p** model for higher-performance systems and a [**360p** model](https://huggingface.co/Overworld/Waypoint-1.5-1B-360P) intended to run smoothly across a broader range of gaming PCs and Apple Silicon Macs. The release was also trained on **substantially more data than Waypoint-1**, improving coherence and motion consistency over longer interactions.

## What makes Waypoint-1.5 different

Waypoint-1.5 is built around a simple product constraint: generative worlds should be usable as **interactive systems**, not just watched as offline demos.

Compared with a conventional video generation workflow, the Waypoint family is designed for:

- **Real-time interaction** rather than offline batch generation
- **Low-latency responsiveness** to user inputs
- **Local execution** on consumer hardware
- **Persistent world rollouts** where coherence across time matters as much as single-frame fidelity

In practice, this means the model is intended to be used inside an interactive runtime that conditions generation on previous frames and live control inputs.
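The conditioning loop described above can be illustrated with a minimal sketch. Everything here is hypothetical: `Controls`, `rollout`, and `dummy_step` are made-up names, the stand-in `step` function is not the model, and the real runtime may generate frames in chunks rather than one at a time. It only shows the shape of the idea: each new frame depends on a bounded history plus the latest input.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List, Tuple

# Hypothetical placeholder for an image tensor; the real frame format is not specified here.
Frame = Tuple[int, ...]


@dataclass(frozen=True)
class Controls:
    """Hypothetical per-frame input: pressed keys and mouse movement."""
    keys: frozenset
    mouse_dx: float = 0.0
    mouse_dy: float = 0.0


def rollout(
    step: Callable[[List[Frame], Controls], Frame],
    first_frame: Frame,
    inputs: List[Controls],
    context_frames: int = 512,
) -> List[Frame]:
    """Generate one frame per control input, conditioning on a rolling context."""
    context: Deque[Frame] = deque([first_frame], maxlen=context_frames)
    generated: List[Frame] = []
    for controls in inputs:
        frame = step(list(context), controls)  # condition on history + live input
        context.append(frame)                  # oldest frame is dropped once full
        generated.append(frame)
    return generated


def dummy_step(history: List[Frame], controls: Controls) -> Frame:
    """Stand-in for the model: records history length and number of pressed keys."""
    return (len(history), len(controls.keys))


frames = rollout(dummy_step, (0, 0), [Controls(frozenset({"w"}))] * 3, context_frames=2)
print(frames)
```

The `deque` with `maxlen` is one simple way to keep the context bounded; the default of 512 mirrors the frame context listed under Model Details.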

## Intended Use

This model is intended for:

- Research on real-time world models and interactive video generation
- Prototyping AI-native game and simulation experiences
- Creative tools for interactive environments, world exploration, and live generative scenes
- Experimentation with low-latency generative systems on local hardware
- Education and research into control-conditioned video generation

## Out-of-Scope Use

This model is **not** intended for:

- Generating illegal content or content that exploits, sexualizes, or endangers minors
- Generating non-consensual sexual content or explicit sexual content where prohibited
- Impersonation, harassment, or deceptive identity-based content
- Generating copyrighted characters, branded IP, or celebrity likenesses in ways that infringe rights or violate platform rules
- Safety-critical decision-making, surveillance, or high-stakes automated systems
- Any deployment that removes reasonable safeguards while serving end users at scale

## Usage

This checkpoint is intended to be used with Overworld’s interactive runtime stack.

- Play on our official desktop client, [Biome](https://over.world/install)
- Use our [world_engine](https://github.com/Wayfarer-Labs/world_engine) inference library to build your own applications


### Recommended setup

- **Recommended GPU / device:** RTX 5090
- **Expected FPS on reference hardware:** 56 FPS
- **Supported GPUs:** Desktop RTX 30 Series and later. On weaker hardware, use [Overworld/Waypoint-1.5-1B-360P](https://huggingface.co/Overworld/Waypoint-1.5-1B-360P) instead.
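If you want to fetch the weights yourself rather than go through Biome, one possible approach is the `huggingface_hub` package's `snapshot_download`. The sketch below assumes the two repo ids that appear in the links on this card; the helper names `pick_checkpoint` and `download_checkpoint` are made up for illustration, and loading the files afterwards is left to `world_engine`, whose API is not shown here.

```python
def pick_checkpoint(high_end_gpu: bool) -> str:
    """Return the Hub repo id for the 720p model or the 360P fallback."""
    if high_end_gpu:
        return "Overworld/Waypoint-1.5-1B"
    return "Overworld/Waypoint-1.5-1B-360P"


def download_checkpoint(repo_id: str) -> str:
    """Download every file in a Hub repo and return the local directory path."""
    # Imported lazily so pick_checkpoint works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id)
```

Calling `download_checkpoint(pick_checkpoint(high_end_gpu=True))` would download the 720p checkpoint into the local Hugging Face cache. How you decide whether a GPU counts as "high end" is up to you; the card only names the RTX 5090 as the reference device.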


### Architecture

- **Backbone:** Autoregressive Diffusion Transformer
- **Autoencoder:** [Tiny Hunyuan Autoencoder (taehv1_5)](https://github.com/madebyollin/taehv) — 4x temporal compression, 8x spatial compression, 32 latent channels
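To get a feel for what the autoencoder's compression factors imply, the sketch below computes a latent shape from a pixel clip. It assumes 720p means 1280×720 and that the factors divide the input evenly; the actual autoencoder may treat the first frame or padding differently, and `latent_shape` is a hypothetical helper, not part of any library.

```python
def latent_shape(frames: int, height: int, width: int,
                 t_down: int = 4, s_down: int = 8, channels: int = 32):
    """Latent shape (T, C, H, W) for a pixel clip, assuming exact divisibility."""
    if frames % t_down or height % s_down or width % s_down:
        raise ValueError("dimensions must be divisible by the compression factors")
    return (frames // t_down, channels, height // s_down, width // s_down)


# Four 1280x720 frames collapse into one latent frame.
print(latent_shape(4, 720, 1280))    # (1, 32, 90, 160)

# Under the same assumption, a 512-frame context is 128 latent frames.
print(latent_shape(512, 720, 1280))  # (128, 32, 90, 160)
```

Under these assumptions, four RGB frames (4 × 3 × 720 × 1280 values) map to 32 × 90 × 160 latent values, a 24× reduction in element count.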

### Training Data

Waypoint-1.5 was trained on **nearly 100× more data than Waypoint-1**, with the release emphasizing better coherence, motion consistency, and broader hardware accessibility.

## Performance

### Waypoint 1 vs Waypoint 1.5

| | Waypoint 1 | Waypoint 1.5 |
|---|---|---|
| Resolution | 360P | 720P |
| Context window | 2 seconds | 10 seconds |
| 4-step unquantized (5090) | 20 FPS | 56 FPS |
| 4-step w8a8 quantized (5090) | N/A | 72 FPS |
| 4-step w8a8 (3090) | N/A | 30 FPS |

![Generation Throughput — Waypoint 1 vs Waypoint 1.5](assets/perf_chart.png)
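The throughput figures in the table can be turned into per-frame time budgets, which is often the more useful number when building an interactive loop. This is plain arithmetic on the values above; the dictionary labels and `frame_time_ms` are illustrative names only.

```python
# FPS figures copied from the comparison table.
fps = {
    "Waypoint 1, unquantized (5090)": 20,
    "Waypoint 1.5, unquantized (5090)": 56,
    "Waypoint 1.5, w8a8 (5090)": 72,
    "Waypoint 1.5, w8a8 (3090)": 30,
}


def frame_time_ms(frames_per_second: float) -> float:
    """Average time available per generated frame, in milliseconds."""
    return 1000.0 / frames_per_second


for name, value in fps.items():
    print(f"{name}: {frame_time_ms(value):.1f} ms per frame")

# Relative throughput on the RTX 5090.
speedup_over_wp1 = fps["Waypoint 1.5, unquantized (5090)"] / fps["Waypoint 1, unquantized (5090)"]
quantization_gain = fps["Waypoint 1.5, w8a8 (5090)"] / fps["Waypoint 1.5, unquantized (5090)"]
print(f"Unquantized speedup over Waypoint 1: {speedup_over_wp1:.1f}x")
print(f"Gain from w8a8 quantization: {quantization_gain:.2f}x")
```

Note that the Waypoint 1 figure is at 360P and the Waypoint 1.5 figures are at 720P, so the frame-rate ratio alone does not account for the larger frames being generated.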

## Limitations

This model has important limitations.

- It is a generative world model, not a simulator with guaranteed physical accuracy.
- Long interactive rollouts may drift, collapse, or become inconsistent.
- The model may produce unstable geometry, object persistence failures, or implausible motion.
- Performance is hardware-dependent and may vary significantly by runtime stack and settings.
- Safety mitigations available in hosted deployments may not transfer fully to raw checkpoint use.
- Outputs may reflect biases, omissions, or unsafe patterns present in training data or learned world priors.

## Safety

For details, see our blog post, ["Engineering Safety for Interactive World Models"](https://over.world/blog/engineering-safety-for-interactive-world-models).

## Contact

- [Website](http://over.world/)
- [Discord](https://discord.gg/MEmQa7Wux4)
- [X/Twitter](https://x.com/overworld_ai)