# SPIRAL: Interactive Reasoning Game Simulator
A practical, interactive tool based on the SPIRAL paper ("Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning"), deployed on Hugging Face Spaces.
## Overview
This tool demonstrates how self-play training on zero-sum games can improve AI reasoning capabilities. Users can:
- **Play Games**: Engage with AI in games like Kuhn Poker and TicTacToe
- **View Reasoning**: See step-by-step AI reasoning traces during gameplay
- **Test Transfer**: Evaluate how the AI's reasoning skills transfer to math problems and logic puzzles
- **Learn**: Understand AI decision-making through interactive visualizations
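To make "zero-sum" concrete, here is a minimal sketch of the terminal payoffs in standard Kuhn Poker (three cards, one-chip ante, one-chip bet). The function name `kuhn_payoffs` and the action encoding (`p` for check/fold, `b` for bet/call) are assumptions made for this illustration and are not taken from this project's code.

```python
from itertools import permutations

# Card ranks in standard three-card Kuhn Poker (higher beats lower).
CARDS = {"J": 0, "Q": 1, "K": 2}

# Terminal action sequences: 'p' = pass (check or fold), 'b' = bet (bet or call).
TERMINAL_HISTORIES = ("pp", "bp", "pbp", "bb", "pbb")


def kuhn_payoffs(card1, card2, history):
    """Return (player 1 payoff, player 2 payoff) for a finished hand."""
    showdown_winner = 1 if CARDS[card1] > CARDS[card2] else 2
    if history == "pp":              # both check: showdown for the antes
        winner, stake = showdown_winner, 1
    elif history == "bp":            # player 1 bets, player 2 folds
        winner, stake = 1, 1
    elif history == "pbp":           # player 1 checks, player 2 bets, player 1 folds
        winner, stake = 2, 1
    elif history in ("bb", "pbb"):   # a bet is called: showdown for ante plus bet
        winner, stake = showdown_winner, 2
    else:
        raise ValueError(f"not a terminal history: {history!r}")
    p1 = stake if winner == 1 else -stake
    return p1, -p1                   # zero-sum: one player's gain is the other's loss


if __name__ == "__main__":
    # Check the zero-sum property over every deal and every terminal history.
    for c1, c2 in permutations(CARDS, 2):
        for h in TERMINAL_HISTORIES:
            assert sum(kuhn_payoffs(c1, c2, h)) == 0
    print(kuhn_payoffs("K", "J", "bb"))
```

Because the two payoffs always sum to zero, any improvement by one player comes directly at the opponent's expense, which is the property self-play training relies on.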
## Features
### For Non-Technical Users
- Simple web interface for playing games
- Visual reasoning explanations
- Educational tutorials about AI thinking
- No setup required: runs in the browser
### For Technical Users
- Access to model weights and training scripts
- API endpoints for extending the system
- Custom game integration capabilities
- Fine-tuning examples and documentation
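As an illustration of what custom game integration could look like, the sketch below defines a small two-player game interface and a TicTacToe implementation of it. The `TwoPlayerGame` protocol, its method names, and the `play_out` helper are hypothetical; the actual interface under `src/games/` may differ.

```python
from typing import List, Optional, Protocol, Sequence


class TwoPlayerGame(Protocol):
    """Hypothetical interface a custom game might implement."""

    def legal_moves(self) -> List[int]: ...
    def play(self, move: int) -> None: ...
    def winner(self) -> Optional[int]: ...


# All rows, columns, and diagonals of a 3x3 board indexed 0..8.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


class TicTacToe:
    def __init__(self) -> None:
        self.board: List[Optional[int]] = [None] * 9  # None = empty cell
        self.current = 0                               # player 0 moves first

    def legal_moves(self) -> List[int]:
        if self.winner() is not None:
            return []                                  # game already decided
        return [i for i, cell in enumerate(self.board) if cell is None]

    def play(self, move: int) -> None:
        if move not in self.legal_moves():
            raise ValueError(f"illegal move: {move}")
        self.board[move] = self.current
        self.current = 1 - self.current                # switch turns

    def winner(self) -> Optional[int]:
        for a, b, c in WIN_LINES:
            if self.board[a] is not None and self.board[a] == self.board[b] == self.board[c]:
                return self.board[a]
        return None                                    # no winner (yet, or a draw)


def play_out(game: TwoPlayerGame, moves: Sequence[int]) -> Optional[int]:
    """Apply a fixed move sequence and report the winner, if any."""
    for move in moves:
        game.play(move)
    return game.winner()
```

Any class with the same three methods could be passed to `play_out`, so adding a new game under this assumed design means writing one class without touching the surrounding loop.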
## Project Structure
```
SPIRAL/
├── src/                 # Core implementation
│   ├── games/           # Game environments
│   ├── models/          # SPIRAL model implementation
│   ├── training/        # Self-play training logic
│   └── reasoning/       # Reasoning trace generation
├── models/              # Trained model weights
├── data/                # Game datasets and benchmarks
├── app/                 # Gradio web interface
├── tests/               # Unit and integration tests
└── docs/                # Documentation and tutorials
```
## Technology Stack
- **Backend**: Python 3.8+
- **ML Framework**: PyTorch, Transformers
- **RL Library**: Gymnasium, Stable Baselines3
- **Web Interface**: Gradio
- **Base Model**: Qwen3-4B from Hugging Face
- **Deployment**: Hugging Face Spaces
## Development Phases
1. **Research and Planning** ✅
2. **Implementation**
3. **Testing and Optimization**
4. **Deployment and Documentation**
5. **Maintenance and Iteration**
## Getting Started
### Prerequisites
- Python 3.8+
- PyTorch
- Hugging Face account (for model access)
### Installation
```bash
pip install -r requirements.txt
```
### Quick Start
```bash
python app/app.py
```
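For orientation, here is a minimal sketch of how a Gradio entry point of this kind might be organized. It is not the contents of `app/app.py`: `describe_move` is a hypothetical stand-in for the model call (it simply picks the first empty cell), and `build_demo` is an assumed helper name. Only `gr.Interface` and its `launch()` method come from Gradio itself.

```python
def describe_move(board: str) -> str:
    """Hypothetical placeholder for a model call.

    `board` is a 9-character TicTacToe string using 'X', 'O' and '.' (empty).
    A real app would query the trained model here; this stub picks the
    first empty cell so the interface has something to display.
    """
    cells = board.strip()
    if len(cells) != 9 or set(cells) - set("XO."):
        return "Expected 9 characters from X, O and '.'"
    if "." not in cells:
        return "Board is full"
    return f"Play cell {cells.index('.')}"


def build_demo():
    # Imported lazily so describe_move can be used and tested without Gradio installed.
    import gradio as gr

    return gr.Interface(
        fn=describe_move,
        inputs="text",
        outputs="text",
        title="SPIRAL demo (sketch)",
    )


# To serve the interface locally, call:
# build_demo().launch()
```

Keeping the game logic in a plain function, separate from the interface construction, lets it be unit-tested under `tests/` without starting a web server.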
## Citation
If you use this tool in your research, please cite the original SPIRAL paper:
```bibtex
@article{spiral2025,
  title={Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning},
  author={[Authors]},
  journal={[Journal]},
  year={2025}
}
```
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Contributing
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
## Support
For issues and questions, please use GitHub Issues or contact us via Hugging Face Spaces.