---
title: Reasoning Simulator
emoji: 🏆
colorFrom: gray
colorTo: blue
sdk: gradio
sdk_version: 5.36.2
app_file: app.py
pinned: false
license: apache-2.0
short_description: An interactive reasoning game simulator
---

# SPIRAL: Self-Play Reasoning Demo

**Demonstrating how strategic reasoning emerges from self-play in zero-sum games**

Based on: *"Self-Play in Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning"*

## 🎮 Interactive Demo

This simplified demo showcases the key concepts from the SPIRAL research through an interactive TicTacToe game. The AI selects its moves with minimax tree search and explains the reasoning behind each one.

## 🧠 Key Concepts Demonstrated

### Strategic Reasoning
- AI uses minimax tree search to evaluate all possible future moves
- Demonstrates how optimal strategies emerge from competitive gameplay
- Shows explicit reasoning explanations for each move

### Self-Play Learning Principles
- Zero-sum games create competitive pressure that incentivizes strategic thinking
- Multi-agent interactions naturally develop intelligent behavior
- Strategic patterns emerge from repeated competitive gameplay

### Tree Search & Planning
- Minimax algorithm demonstrates formalized strategic reasoning
- Look-ahead planning to evaluate future game states
- Optimal decision-making under competitive constraints
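
The look-ahead described above can be sketched in a few lines of Python. This is a minimal illustration, not the code in `app.py`: the board layout (a list of nine cells holding `"X"`, `"O"`, or `" "`), the scoring scale (+1 win, 0 draw, -1 loss), and the names `winner`, `minimax`, and `best_move` are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

# All eight winning lines on a 3x3 board indexed 0..8, row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board: List[str]) -> Optional[str]:
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def minimax(board: List[str], to_move: str, ai: str = "O") -> int:
    """Score a position from the AI's point of view: +1 win, 0 draw, -1 loss."""
    won = winner(board)
    if won is not None:
        return 1 if won == ai else -1
    if " " not in board:
        return 0  # board is full and nobody won
    other = "X" if to_move == "O" else "O"
    scores = []
    for i in range(9):
        if board[i] == " ":
            board[i] = to_move          # try the move
            scores.append(minimax(board, other, ai))
            board[i] = " "              # undo it
    # The AI picks the highest score, the opponent the lowest.
    return max(scores) if to_move == ai else min(scores)


def best_move(board: List[str], ai: str = "O") -> Tuple[int, int]:
    """Return (square, score) for the AI's best move; (-1, -2) if the board is full."""
    other = "X" if ai == "O" else "O"
    best = (-1, -2)
    for i in range(9):
        if board[i] == " ":
            board[i] = ai
            score = minimax(board, other, ai)
            board[i] = " "
            if score > best[1]:
                best = (i, score)
    return best
```

For example, `best_move(["X", "X", " ", " ", "O", " ", " ", " ", " "])` blocks the top row by choosing square 2. TicTacToe is small enough that this exhaustive search needs no depth limit or pruning.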

## 🚀 Running the Demo

### Local Setup
```bash
# Clone the repository
git clone https://huggingface.co/spaces/kaushikvr06/reasoning-simulator
cd reasoning-simulator

# Install dependencies
pip install -r requirements.txt

# Run the demo
python app.py
```

### Hugging Face Spaces
The demo is deployed and ready to use at:
[https://huggingface.co/spaces/kaushikvr06/reasoning-simulator](https://huggingface.co/spaces/kaushikvr06/reasoning-simulator)

## 📝 How It Works

1. **Human Move**: Click any square to make your move as X
2. **AI Analysis**: The AI analyzes the game tree using minimax search
3. **Strategic Reasoning**: Watch the AI explain its decision-making process
4. **Optimal Play**: The AI chooses the move with the best guaranteed outcome (a win if one can be forced, otherwise a draw), assuming you also play optimally
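
The four steps above can be sketched as a single round of play. This is a hedged illustration under stated assumptions, not the app's actual code: it uses a memoized negamax variant of minimax, represents the board as a tuple of nine cells, and the names `play_round`, `ai_reply`, and `value`, as well as the wording of the explanation string, are hypothetical.

```python
from functools import lru_cache
from typing import Optional, Tuple

Board = Tuple[str, ...]  # nine cells, each "X", "O", or " "
LINES = ((0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6))


def winner(board: Board) -> Optional[str]:
    """Return the mark that completed a line, or None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def place(board: Board, square: int, mark: str) -> Board:
    """Return a new board with `mark` written to `square`."""
    return board[:square] + (mark,) + board[square + 1:]


@lru_cache(maxsize=None)
def value(board: Board, to_move: str) -> int:
    """Negamax value for the player about to move: +1 win, 0 draw, -1 loss."""
    won = winner(board)
    if won is not None:
        return 1 if won == to_move else -1
    if " " not in board:
        return 0
    other = "X" if to_move == "O" else "O"
    return max(-value(place(board, i, to_move), other)
               for i in range(9) if board[i] == " ")


def ai_reply(board: Board) -> Tuple[int, str]:
    """Steps 2-4: rate every legal reply for O, pick the best, explain it."""
    ratings = {i: -value(place(board, i, "O"), "X")
               for i in range(9) if board[i] == " "}
    square = max(ratings, key=ratings.get)
    outcome = {1: "forces a win", 0: "holds at least a draw",
               -1: "loses against best play"}[ratings[square]]
    note = (f"I play square {square}: it {outcome}. "
            f"I compared {len(ratings)} legal moves.")
    return square, note


def play_round(board: Board, human_square: int) -> Tuple[Board, str]:
    """Step 1: apply the human's X move, then let the AI answer as O."""
    if not 0 <= human_square < 9 or board[human_square] != " ":
        raise ValueError(f"square {human_square} is not available")
    board = place(board, human_square, "X")
    if winner(board) is not None or " " not in board:
        return board, "Game over after your move."
    square, note = ai_reply(board)
    return place(board, square, "O"), note
```

Calling `play_round((" ",) * 9, 4)` plays X in the center and returns the board after the AI's reply together with its explanation. Memoizing `value` with `lru_cache` keeps repeated rounds fast because positions reached by different move orders are scored only once.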

## 🔬 Research Connection

This demo illustrates core findings from the SPIRAL methodology:

- **Zero-sum competitive environments** naturally incentivize strategic reasoning
- **Multi-turn planning** emerges from the need to anticipate opponent moves
- **Strategic reasoning capabilities** developed through self-play can transfer to general reasoning tasks
- **Tree search algorithms** formalize the strategic reasoning process

## 🎯 Educational Value

Perfect for:
- Understanding strategic AI decision-making
- Learning about game theory and minimax algorithms
- Exploring the connection between competition and intelligence
- Visualizing how reasoning emerges from strategic gameplay

## 📊 Technical Details

- **Game Environment**: Clean TicTacToe implementation with proper state management
- **AI Strategy**: Minimax algorithm with optimal move selection
- **Reasoning Display**: Generated explanations of AI strategic thinking
- **Interactive Interface**: Real-time game state updates and move explanations
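
One way to get the state management mentioned above is to treat each position as an immutable value and return a new state for every move. The sketch below assumes that design; the class name `GameState` and its methods are hypothetical and are not taken from `app.py`.

```python
from dataclasses import dataclass
from typing import List, Tuple

WIN_LINES = ((0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
             (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6))


@dataclass(frozen=True)
class GameState:
    """An immutable TicTacToe position: nine cells plus whose turn it is."""
    cells: Tuple[str, ...] = (" ",) * 9
    to_move: str = "X"

    def status(self) -> str:
        """Return "X wins", "O wins", "draw", or "in progress"."""
        for a, b, c in WIN_LINES:
            if self.cells[a] != " " and self.cells[a] == self.cells[b] == self.cells[c]:
                return f"{self.cells[a]} wins"
        return "draw" if " " not in self.cells else "in progress"

    def legal_moves(self) -> List[int]:
        """Empty squares, or nothing once the game has ended."""
        if self.status() != "in progress":
            return []
        return [i for i, cell in enumerate(self.cells) if cell == " "]

    def play(self, square: int) -> "GameState":
        """Return the state after the current player marks `square`."""
        if square not in self.legal_moves():
            raise ValueError(f"square {square} is not a legal move")
        cells = self.cells[:square] + (self.to_move,) + self.cells[square + 1:]
        return GameState(cells, "O" if self.to_move == "X" else "X")
```

Because `play` never mutates the existing state, the interface can keep earlier positions around (for example, to show move history) without defensive copying.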

---

*Experience firsthand how strategic reasoning emerges from competitive self-play!*