Spaces: anthonym21/world-model-demo


anthonym21 committed 8 days ago

Commit c4e5414 (verified) · 1 parent: 3beae40

Upload README.md with huggingface_hub


Files changed (1)

README.md +16 -78


@@ -1,14 +1,12 @@
 ---
 title: World Model Demo
 emoji: 🧠
-colorFrom: green
-colorTo: purple
 sdk: gradio
-sdk_version: 4.44.0
 app_file: app.py
 pinned: false
-license: mit
-short_description: Interactive demo of how AI agents learn to dream and plan
 tags:
   - world-model
   - reinforcement-learning
@@ -17,80 +15,20 @@ tags:
   - cognitive-architecture
 ---
-# 🧠 World Model Demo
-An interactive visualization demonstrating how intelligent agents build internal models of their world to plan actions without trial-and-error in reality.
-## The Concept
-A **World Model** is an internal simulation that an agent uses to predict the outcomes of its actions. This is how both biological and artificial intelligence can "think ahead" - imagining consequences before committing to actions.
-### The Three Phases
-| Phase | What Happens | Real-World Analogy |
-|-------|--------------|-------------------|
-| 🔍 **Exploration** | Random movement to discover physics rules | A baby learning to crawl by bumping into things |
-| 💭 **Dreaming** | Planning using *only* the internal model | Mentally rehearsing a speech before giving it |
-| 🚀 **Execution** | Following the imagined plan | Actually performing the rehearsed speech |
-## How to Use
-1. **Configure** the grid size and obstacle pattern
-2. **Explore** - Watch the agent learn the world's physics through random movement
-3. **Dream** - See the agent plan a path using only its learned model (no real movement!)
-4. **Execute** - Watch the plan work in reality
-Or just click **Run All Phases** to see the complete demonstration!
-## Technical Details
-The World Model is implemented as a simple dictionary:
-```python
-transitions[(state, action)] = next_state
-```
-During the **Dreaming** phase, the agent uses BFS search through this dictionary - it never calls the real environment! This is the key insight: planning happens entirely in the agent's "imagination."
-## Why This Matters
-This demo illustrates the foundation of modern AI systems:
-- **MuZero** (DeepMind) - Learned world models for game playing
-- **Dreamer** - World models for robot control
-- **PlaNet** - Planning with learned dynamics
-- **Human cognition** - We constantly simulate futures in our minds
-## Architecture
-```
-┌─────────────────────────────────────────────────────────────┐
-│                    INTELLIGENT AGENT                        │
-│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐     │
-│  │ ENVIRONMENT │◄──►│ WORLD MODEL │◄──►│  PLANNING   │     │
-│  │  (Reality)  │    │  (Memory)   │    │  (Dreams)   │     │
-│  └─────────────┘    └─────────────┘    └─────────────┘     │
-│         │                  ▲                   │            │
-│         │    learn()       │      predict()    │            │
-│         └──────────────────┴───────────────────┘            │
-└─────────────────────────────────────────────────────────────┘
-```
-## Legend
-- 🤖 Agent position
-- ⭐ Goal location
-- 🏁 Start position
-- 🧱 Wall/obstacle
-- 🟢 Green border = States the model has learned
-- 🟣 Purple dashed = States searched during planning
-- 🔵 Cyan = Planned path
-## References
-- Ha, D., & Schmidhuber, J. (2018). World Models
-- Hafner, D., et al. (2019). Dream to Control: Learning Behaviors by Latent Imagination
-- Schrittwieser, J., et al. (2020). Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model
----
-*Built with ❤️ using Gradio*

 ---
 title: World Model Demo
 emoji: 🧠
+colorFrom: blue
+colorTo: green
 sdk: gradio
+sdk_version: 4.44.1
 app_file: app.py
 pinned: false
 tags:
   - world-model
   - reinforcement-learning
   - cognitive-architecture
 ---
+# World Model Demo
+Interactive demonstration of world model concepts in AI planning and decision-making.
+## Features
+- **Visual Grid Environment**: 8x8 grid with obstacles and goals
+- **Phase-Based Learning**: Observe → Plan → Act → Learn cycle
+- **Real-time Statistics**: Track predictions, errors, and model confidence
+- **Educational Overlays**: See how the agent predicts and plans
+## Concepts Demonstrated
+- Model-based reinforcement learning (MuZero, Dreamer)
+- World state representation and prediction
+- Planning with learned dynamics models
+- The imagination-execution loop
+Built for the Anthropic Research Fellowship application.
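
Note on the removed "Technical Details" section: it described the world model as a dictionary, `transitions[(state, action)] = next_state`, with BFS planning over that dictionary during the Dreaming phase. The following is a minimal sketch of that idea, not the code in `app.py`. The function names (`real_step`, `explore`, `dream`), the action set, and the grid layout are all hypothetical, and exploration here sweeps every cell exhaustively rather than moving randomly, so that the result is deterministic.

```python
from collections import deque

# Hypothetical 4-connected action set; not taken from app.py.
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def real_step(state, action, size, walls):
    """Ground-truth environment: stay in place if the move hits a wall or edge."""
    r, c = state
    dr, dc = ACTIONS[action]
    nxt = (r + dr, c + dc)
    in_bounds = 0 <= nxt[0] < size and 0 <= nxt[1] < size
    return nxt if in_bounds and nxt not in walls else state


def explore(size, walls):
    """Exploration: record transitions[(state, action)] = next_state.

    Assumption: an exhaustive sweep stands in for the random exploration
    the README describes.
    """
    transitions = {}
    for r in range(size):
        for c in range(size):
            if (r, c) in walls:
                continue
            for action in ACTIONS:
                transitions[((r, c), action)] = real_step((r, c), action, size, walls)
    return transitions


def dream(transitions, start, goal):
    """Dreaming: BFS over the learned dictionary only; real_step is never called."""
    parent = {start: None}  # state -> (previous_state, action) or None for start
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            plan = []
            while parent[state] is not None:
                state, action = parent[state]
                plan.append(action)
            return plan[::-1]
        for action in ACTIONS:
            nxt = transitions.get((state, action))
            if nxt is not None and nxt not in parent:
                parent[nxt] = (state, action)
                queue.append(nxt)
    return None  # goal not reachable within the learned model


# Execution: replay the imagined plan in the real environment.
SIZE, WALLS = 4, {(1, 1), (1, 2)}
model = explore(SIZE, WALLS)
plan = dream(model, start=(0, 0), goal=(3, 3))
position = (0, 0)
for step in plan:
    position = real_step(position, step, SIZE, WALLS)
```

Because BFS expands states in order of distance from the start, the plan it returns is a shortest path with respect to the learned transitions. If exploration missed some transitions, the plan can be longer than the true optimum, or `dream` can return `None`.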