anthonym21 committed on
Commit c4e5414 · verified · 1 Parent(s): 3beae40

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +16 -78
README.md CHANGED
@@ -1,14 +1,12 @@
  ---
  title: World Model Demo
  emoji: 🧠
- colorFrom: green
- colorTo: purple
  sdk: gradio
- sdk_version: 4.44.0
  app_file: app.py
  pinned: false
- license: mit
- short_description: Interactive demo of how AI agents learn to dream and plan
  tags:
  - world-model
  - reinforcement-learning
@@ -17,80 +15,20 @@ tags:
  - cognitive-architecture
  ---

- # 🧠 World Model Demo

- An interactive visualization demonstrating how intelligent agents build internal models of their world to plan actions without trial-and-error in reality.

- ## The Concept

- A **World Model** is an internal simulation that an agent uses to predict the outcomes of its actions. This is how both biological and artificial intelligence can "think ahead" - imagining consequences before committing to actions.

- ### The Three Phases
-
- | Phase | What Happens | Real-World Analogy |
- |-------|--------------|-------------------|
- | 🔍 **Exploration** | Random movement to discover physics rules | A baby learning to crawl by bumping into things |
- | 💭 **Dreaming** | Planning using *only* the internal model | Mentally rehearsing a speech before giving it |
- | 🚀 **Execution** | Following the imagined plan | Actually performing the rehearsed speech |
-
- ## How to Use
-
- 1. **Configure** the grid size and obstacle pattern
- 2. **Explore** - Watch the agent learn the world's physics through random movement
- 3. **Dream** - See the agent plan a path using only its learned model (no real movement!)
- 4. **Execute** - Watch the plan work in reality
-
- Or just click **Run All Phases** to see the complete demonstration!
-
- ## Technical Details
-
- The World Model is implemented as a simple dictionary:
- ```python
- transitions[(state, action)] = next_state
- ```
-
- During the **Dreaming** phase, the agent uses BFS search through this dictionary - it never calls the real environment! This is the key insight: planning happens entirely in the agent's "imagination."
-
- ## Why This Matters
-
- This demo illustrates the foundation of modern AI systems:
-
- - **MuZero** (DeepMind) - Learned world models for game playing
- - **Dreamer** - World models for robot control
- - **PlaNet** - Planning with learned dynamics
- - **Human cognition** - We constantly simulate futures in our minds
-
- ## Architecture
-
- ```
- ┌─────────────────────────────────────────────────────────────┐
- │                      INTELLIGENT AGENT                      │
- │    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    │
- │    │ ENVIRONMENT │◄──►│ WORLD MODEL │◄──►│  PLANNING   │    │
- │    │  (Reality)  │    │  (Memory)   │    │  (Dreams)   │    │
- │    └─────────────┘    └─────────────┘    └─────────────┘    │
- │           │                  ▲                  │           │
- │           │     learn()      │    predict()     │           │
- │           └──────────────────┴──────────────────┘           │
- └─────────────────────────────────────────────────────────────┘
- ```
-
- ## Legend
-
- - 🤖 Agent position
- - ⭐ Goal location
- - 🏁 Start position
- - 🧱 Wall/obstacle
- - 🟢 Green border = States the model has learned
- - 🟣 Purple dashed = States searched during planning
- - 🔵 Cyan = Planned path
-
- ## References
-
- - Ha, D., & Schmidhuber, J. (2018). World Models
- - Hafner, D., et al. (2019). Dream to Control: Learning Behaviors by Latent Imagination
- - Schrittwieser, J., et al. (2020). Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model
-
- ---
-
- *Built with ❤️ using Gradio*
  ---
  title: World Model Demo
  emoji: 🧠
+ colorFrom: blue
+ colorTo: green
  sdk: gradio
+ sdk_version: 4.44.1
  app_file: app.py
  pinned: false
  tags:
  - world-model
  - reinforcement-learning
 
  - cognitive-architecture
  ---

+ # World Model Demo

+ Interactive demonstration of world model concepts in AI planning and decision-making.

+ ## Features
+ - **Visual Grid Environment**: 8x8 grid with obstacles and goals
+ - **Phase-Based Learning**: Observe → Plan → Act → Learn cycle
+ - **Real-time Statistics**: Track predictions, errors, and model confidence
+ - **Educational Overlays**: See how the agent predicts and plans

+ ## Concepts Demonstrated
+ - Model-based reinforcement learning (MuZero, Dreamer)
+ - World state representation and prediction
+ - Planning with learned dynamics models
+ - The imagination-execution loop

+ Built for the Anthropic Research Fellowship application.
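
Note on the removed Technical Details section: it describes the world model as a dictionary keyed by `(state, action)` and the Dreaming phase as a BFS over that dictionary, with no calls to the real environment. A minimal sketch of that idea is below, for readers comparing the two README versions. The grid layout, the action set, and the names `real_step`, `explore`, and `dream` are assumptions made for illustration; they are not taken from the Space's `app.py`.

```python
from collections import deque
import random

# Hypothetical action set: name -> (row delta, column delta).
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def real_step(state, action, size, walls):
    """Hypothetical 'reality': move one cell unless a wall or the edge blocks it."""
    row, col = state
    d_row, d_col = ACTIONS[action]
    nxt = (row + d_row, col + d_col)
    in_bounds = 0 <= nxt[0] < size and 0 <= nxt[1] < size
    return nxt if in_bounds and nxt not in walls else state


def explore(start, size, walls, steps, seed=0):
    """Exploration: take random actions and record what actually happened."""
    rng = random.Random(seed)
    transitions = {}
    state = start
    for _ in range(steps):
        action = rng.choice(list(ACTIONS))
        next_state = real_step(state, action, size, walls)
        transitions[(state, action)] = next_state  # the learned world model
        state = next_state
    return transitions


def dream(transitions, start, goal):
    """Dreaming: BFS over the learned dictionary only; never calls real_step.

    Returns a list of actions, or None if the model knows no route to the goal.
    """
    parent = {start: None}  # state -> (previous state, action taken)
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            plan = []
            while parent[state] is not None:
                state, action = parent[state]
                plan.append(action)
            return plan[::-1]
        for action in ACTIONS:
            nxt = transitions.get((state, action))
            if nxt is not None and nxt not in parent:
                parent[nxt] = (state, action)
                queue.append(nxt)
    return None
```

Execution would then replay the returned actions through `real_step`. Because the plan is built only from recorded transitions, `dream` returns `None` whenever exploration has not yet covered a route to the goal, which is the trade-off of a purely tabular model.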