---
title: World Model Demo
emoji: 🧠
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.9.1
app_file: app.py
pinned: false
tags:
- world-model
- reinforcement-learning
- planning
- ai-education
- model-based-rl
- muzero
- dreamer
---
# 🧠 World Model Demo
**An interactive visualization of model-based reinforcement learning concepts**
## What is a World Model?
A **world model** is an internal representation that an AI agent uses to *simulate* the environment without actually interacting with it. Think of it as the agent's "imagination": it can mentally rehearse actions and predict their outcomes before committing to them in the real world.
### The Key Insight
Instead of learning through pure trial-and-error (which is slow and potentially dangerous), an agent with a world model can:
1. **Imagine** possible futures by simulating "what if I do X?"
2. **Evaluate** which imagined future looks best
3. **Plan** a sequence of actions to reach that future
4. **Act** with confidence, having already "seen" the outcome
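The four steps above can be sketched as a one-step lookahead loop. This is a minimal illustration, not the demo's actual implementation: `world_model` and `value` here are hand-written stand-ins for what would be *learned* components in a real system.

```python
# One-step lookahead on a grid: "imagine" each action with the world model,
# "evaluate" each imagined outcome, then "act" on the best one.
# Both functions below are illustrative stand-ins for learned models.

def world_model(state, action):
    """Predict the next state without touching the real environment."""
    x, y = state
    dx, dy = {"up": (0, -1), "down": (0, 1),
              "left": (-1, 0), "right": (1, 0)}[action]
    return (x + dx, y + dy)

def value(state, goal):
    """Score an imagined state: closer to the goal (Manhattan) is better."""
    return -(abs(state[0] - goal[0]) + abs(state[1] - goal[1]))

def plan_one_step(state, goal, actions=("up", "down", "left", "right")):
    # Imagine every action, evaluate each imagined future, pick the best.
    imagined = {a: world_model(state, a) for a in actions}
    return max(imagined, key=lambda a: value(imagined[a], goal))

print(plan_one_step((0, 0), (3, 0)))  # prints "right"
```

The agent never moves during the imagine/evaluate steps; only the chosen action touches the real environment.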
## How This Differs from Language Models
| Aspect | Language Model (GPT, Claude) | World Model (MuZero, Dreamer) |
|--------|------------------------------|-------------------------------|
| **Primary function** | Predict next token in a sequence | Predict next *state* given an action |
| **Training signal** | Text prediction loss | Reward from environment |
| **"Imagination"** | Generates plausible text continuations | Simulates future environment states |
| **Planning** | Implicit (via chain-of-thought) | Explicit (via tree search or rollouts) |
| **Grounding** | Statistical patterns in text | Causal dynamics of an environment |
### A Concrete Example
**Language Model**: "If I push a ball off a table, it will..." → generates plausible text based on patterns
**World Model**: Given state (ball on table) + action (push) → predicts new state (ball falling, trajectory, landing position) with enough fidelity to *plan* around it
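To make the ball example concrete, here is a toy "physics" world model: given a ball at the table edge and a push, it rolls simple dynamics forward and predicts the landing point. All numbers (table height, push velocity, gravity, time step) are illustrative assumptions, not anything from the demo.

```python
# Toy world model for the ball example: predict the full trajectory of a
# pushed ball via Euler integration. Parameters are illustrative only.

def simulate_push(table_height=1.0, push_velocity=0.5, g=9.81, dt=0.01):
    x, y = 0.0, table_height          # ball starts at the table edge
    vx, vy = push_velocity, 0.0       # the push gives horizontal velocity
    trajectory = [(x, y)]
    while y > 0:                      # integrate until the ball hits the floor
        vy -= g * dt
        x += vx * dt
        y += vy * dt
        trajectory.append((x, y))
    return trajectory

landing_x = simulate_push()[-1][0]   # predicted landing position
```

The predicted landing position is exactly the kind of quantity a planner can reason about in advance: "if I push, the ball ends up *there*."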
## What You're Seeing in This Demo
This visualization shows a simplified world model operating on a grid navigation task:
### The Four Phases
1. **πŸ” Observe**: The agent perceives the current grid state (its position, goal location, obstacles)
2. **πŸ’­ Imagine**: The world model predicts what would happen for each possible action (up/down/left/right). You see this as the "mental simulation" exploring future states.
3. **🌳 Plan**: Using tree search (similar to how chess engines work), the agent evaluates sequences of actions by imagining multiple steps ahead. Better paths to the goal get higher scores.
4. **⚑ Act**: The agent executes the best action found during planning, then the cycle repeats.
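The Plan phase above can be sketched as a depth-limited search over imagined rollouts. This is a simplified illustration with a hand-coded grid dynamics function: real systems like MuZero use a *learned* model and Monte Carlo tree search rather than the exhaustive search shown here.

```python
# Depth-limited planning on a grid: imagine action sequences several steps
# ahead with the (here, hand-coded) dynamics model and return the first
# action of the best-scoring sequence. Illustrative sketch only.

ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}

def step(state, action, size, obstacles):
    """Imagined transition: move unless blocked by a wall or an obstacle."""
    x, y = state
    dx, dy = ACTIONS[action]
    nxt = (x + dx, y + dy)
    if 0 <= nxt[0] < size and 0 <= nxt[1] < size and nxt not in obstacles:
        return nxt
    return state

def search(state, goal, size, obstacles, depth):
    """Return (score, first_action) for the best imagined rollout."""
    if state == goal:
        return 0.0, None
    if depth == 0:  # leaf: score by negative Manhattan distance to goal
        return -(abs(state[0] - goal[0]) + abs(state[1] - goal[1])), None
    best_score, best_action = float("-inf"), None
    for a in ACTIONS:
        score, _ = search(step(state, a, size, obstacles), goal,
                          size, obstacles, depth - 1)
        score -= 1  # small per-step cost favours shorter plans
        if score > best_score:
            best_score, best_action = score, a
    return best_score, best_action

# Plan around an obstacle at (1, 1) on a 3x3 grid.
_, action = search((0, 0), (2, 2), size=3, obstacles={(1, 1)}, depth=4)
print(action)  # first move of the best plan found
```

Everything inside `search` happens in imagination; only the returned first action is executed, after which the observe/imagine/plan/act cycle repeats from the new state.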
### Why This Matters for AI Safety
World models are crucial for AI safety research because:
- **Predictability**: Agents that plan can be analyzed: we can inspect what futures they're considering
- **Corrigibility**: Planning agents can incorporate "don't do irreversible things" into their search
- **Interpretability**: The world model's predictions can be examined for accuracy and bias
- **Scalable oversight**: Humans can audit the agent's "reasoning" by inspecting its simulated futures
## Real-World Architectures
This demo is inspired by:
- **MuZero** (DeepMind): Learned world models that mastered Go, chess, and Atari without knowing the rules
- **Dreamer** (Hafner et al.): World models for continuous control from pixels
- **IRIS** (Micheli et al.): Transformer-based world models for Atari
- **Genie** (DeepMind): Generative world models from video
## Try It Yourself
1. Click **"Run World Model"** to watch the full planning cycle
2. Use **Step Mode** to see each phase individually
3. Adjust grid size and obstacles to see how planning adapts
4. Watch the **Imagined Futures** panel to see the agent's "thoughts"
---
*Created by [Anthony Maio](https://huggingface.co/anthonym21) as an educational resource for AI safety research*