anm-ol committed · Commit c3fa4e8 · verified · 1 Parent(s): e32c736

Update README.md

Files changed (1)
  1. README.md +128 -1
README.md CHANGED
@@ -1,3 +1,130 @@
  ---
  license: apache-2.0
- ---
+ library_name: world_engine
+ tags:
+ - world-model
+ - interactive-video
+ - generative-worlds
+ - real-time
+ - consumer-gpu
+ - diffusion
+ - transformer
+ ---
+
+ ![Overworld](assets/Overworld_Loop.webm)
+
+ # Waypoint-1.5-1B
+
+ Waypoint-1.5-1B is the smallest dense model in Overworld’s Waypoint-1.5 family of real-time interactive video world models. Waypoint-1.5 is designed around local, real-time generation on consumer hardware, ranging from older RTX 30XX GPUs to the RTX 50XX series.
+
+ ## Model Details
+
+ - **Developed by:** Overworld
+ - **Model type:** Real-time interactive video world model
+ - **Model family:** Waypoint-1.5
+ - **Parameter count:** 1.2B
+ - **Context length / frame context:** 512 frames
+ - **Input modalities:** Starting image or video conditioning, control inputs
+ - **Output:** Interactive generated video frames / world rollout
+ - **License:** Apache 2.0
+ - **Paper:** Coming soon
+ - **Demo:** [overworld.stream](https://www.overworld.stream/)
+ - **Runtime / inference library:** [Overworldai/Biome](https://github.com/Overworldai/Biome)
+
+ ## Model Summary
+
+ Waypoint-1.5 is Overworld’s next-generation real-time video world model release. It builds on the original Waypoint-1 release by improving visual fidelity, expanding the range of consumer hardware that can run the model, and pushing further toward responsive, interactive world simulation without datacenter-scale compute.
+
+ At the family level, Waypoint-1.5 targets real-time generation at up to **720p and 60 FPS**, and introduces **two model tiers**: a **720p** model for higher-performance systems and a [**360p** model](https://huggingface.co/Overworld/Waypoint-1.5-1B-360P) intended to run smoothly across a broader range of gaming PCs and Apple Silicon Macs. The release was also trained on **substantially more data than Waypoint-1**, improving coherence and motion consistency over longer interactions.
+
+ ## What makes Waypoint-1.5 different
+
+ Waypoint-1.5 is built around a simple product constraint: generative worlds should be usable as **interactive systems**, not just watched as offline demos.
+
+ Compared with a conventional video generation workflow, the Waypoint family is designed for:
+
+ - **Real-time interaction** rather than offline batch generation
+ - **Low-latency responsiveness** to user inputs
+ - **Local execution** on consumer hardware
+ - **Persistent world rollouts** where coherence across time matters as much as single-frame fidelity
+
+ In practice, this means the model is intended to be used inside an interactive runtime that can condition generation on prompt context, previous frames, and live control inputs.
+
+ ## Intended Use
+
+ This model is intended for:
+
+ - Research on real-time world models and interactive video generation
+ - Prototyping AI-native game and simulation experiences
+ - Creative tools for interactive environments, world exploration, and live generative scenes
+ - Experimentation with low-latency generative systems on local hardware
+ - Education and research into control-conditioned video generation
+
+ ## Out-of-Scope Use
+
+ This model is **not** intended for:
+
+ - Generating illegal content or content that exploits, sexualizes, or endangers minors
+ - Generating non-consensual sexual content or explicit sexual content where prohibited
+ - Impersonation, harassment, or deceptive identity-based content
+ - Generating copyrighted characters, branded IP, or celebrity likenesses in ways that infringe rights or violate platform rules
+ - Safety-critical decision-making, surveillance, or high-stakes automated systems
+ - Any deployment that removes reasonable safeguards while serving end users at scale
+
+ ## Usage
+
+ This checkpoint is intended to be used with Overworld’s interactive runtime stack.
+
+ - Use our [world_engine](https://github.com/Wayfarer-Labs/world_engine) inference library directly
+ - Play on our official desktop client, [Biome](https://github.com/Overworldai/Biome/)
+
+
+ ### Recommended setup
+
+ - **Recommended GPU / device:** RTX 5090
+ - **Expected FPS on reference hardware:** 56 FPS
+
+
+ ### Architecture
+
+ - **Backbone:** Autoregressive Diffusion Transformer
+ - **Autoencoder:** [Tiny Hunyuan Autoencoder (taehv1_5)](https://github.com/madebyollin/taehv), with 4x temporal compression, 8x spatial compression, and 32 latent channels
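
To make the autoencoder's compression figures concrete, here is a rough sketch of the latent tensor shape they imply. It assumes that 720p means 1280×720 RGB frames and that every dimension divides evenly by its compression factor; the real autoencoder may pad inputs or treat the first frame differently. The names `Compression`, `latent_shape`, and `compression_ratio` are hypothetical and are not part of world_engine or taehv.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compression:
    temporal: int = 4         # 4x temporal compression (from the model card)
    spatial: int = 8          # 8x spatial compression (from the model card)
    latent_channels: int = 32  # 32 latent channels (from the model card)


def latent_shape(frames, height, width, c=Compression()):
    """Return (latent_frames, channels, latent_height, latent_width).

    Assumption: each dimension is an exact multiple of its compression factor.
    """
    if frames % c.temporal or height % c.spatial or width % c.spatial:
        raise ValueError("dimensions must be multiples of the compression factors")
    return (
        frames // c.temporal,
        c.latent_channels,
        height // c.spatial,
        width // c.spatial,
    )


def compression_ratio(frames, height, width, c=Compression(), pixel_channels=3):
    """Ratio of pixel-space values to latent-space values for one clip."""
    pixel_values = frames * pixel_channels * height * width
    t, ch, h, w = latent_shape(frames, height, width, c)
    return pixel_values / (t * ch * h * w)


if __name__ == "__main__":
    # Four 1280x720 frames (assumed 720p layout) map to a single latent frame.
    print(latent_shape(4, 720, 1280))
    print(compression_ratio(4, 720, 1280))
```

Under these assumptions, four pixel frames collapse into one latent frame, so the transformer works on far fewer values than the raw video contains.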
+
+ ### Training Data
+
+ Waypoint-1.5 was trained on **nearly 100× more data than Waypoint-1**, with the release emphasizing better coherence, motion consistency, and broader hardware accessibility.
+
+ ## Performance
+
+ ### Waypoint-1 vs Waypoint-1.5
+
+ | | Waypoint-1 | Waypoint-1.5 |
+ |---|---|---|
+ | Resolution | 360p | 720p |
+ | Context window | 2 seconds | 10 seconds |
+ | 4-step unquantized (RTX 5090) | 20 FPS | 56 FPS |
+ | 4-step w8a8 quantized (RTX 5090) | N/A | 72 FPS |
+ | 4-step w8a8 quantized (RTX 3090) | N/A | 30 FPS |
+
+ ![Generation Throughput: Waypoint-1 vs Waypoint-1.5](assets/perf_chart.png)
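
As a quick way to read the throughput figures in the table above, the sketch below converts each reported FPS value into a per-frame time budget and computes the relative speedup between two configurations. The FPS numbers are copied from the table; the helper names `frame_budget_ms` and `speedup` are hypothetical, and the calculation says nothing about how the time is split between the transformer and the autoencoder.

```python
def frame_budget_ms(fps: float) -> float:
    """Average wall-clock time available per generated frame, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def speedup(new_fps: float, old_fps: float) -> float:
    """Throughput ratio between two configurations."""
    return new_fps / old_fps


# FPS values as reported in the performance table.
REPORTED_FPS = {
    "Waypoint-1, 4-step unquantized (5090)": 20,
    "Waypoint-1.5, 4-step unquantized (5090)": 56,
    "Waypoint-1.5, 4-step w8a8 quantized (5090)": 72,
    "Waypoint-1.5, 4-step w8a8 quantized (3090)": 30,
}

if __name__ == "__main__":
    for name, fps in REPORTED_FPS.items():
        print(f"{name}: {frame_budget_ms(fps):.1f} ms per frame")
    print(f"Unquantized speedup on the 5090: {speedup(56, 20):.1f}x")
```

Note that the two unquantized rows compare different output resolutions, so the throughput ratio understates the difference in pixels generated per second.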
+
+ ## Limitations
+
+ This model has important limitations.
+
+ - It is a generative world model, not a simulator with guaranteed physical accuracy.
+ - Long interactive rollouts may drift, collapse, or become inconsistent.
+ - The model may produce unstable geometry, object persistence failures, or implausible motion.
+ - Performance is hardware-dependent and may vary significantly by runtime stack and settings.
+ - Safety mitigations available in hosted deployments may not transfer fully to raw checkpoint use.
+ - Outputs may reflect biases, omissions, or unsafe patterns present in training data or learned world priors.
+
+ ## Safety
+
+ Please see our blog post, ["Engineering Safety for Interactive World Models"](https://over.world/blog/engineering-safety-for-interactive-world-models), for details.
+
+ ## Contact
+
+ - [Website](http://over.world/)
+ - [Discord](https://discord.gg/MEmQa7Wux4)
+ - [X/Twitter](https://x.com/overworld_ai)