---
license: mit
---
```markdown
---
library_name: pytorch
tags:
- reinforcement-learning
- dueling-dqn
- super-mario-bros
- pytorch
- pyqt5
- tutorial
- educational
- interactive-learning
---
# PyQt5 Dueling DQN Mario Tutorial - Interactive Learning Application
## Model Overview
An interactive PyQt5 desktop application that provides a comprehensive tutorial for implementing Dueling Deep Q-Networks to play Super Mario Bros. This educational tool combines theoretical explanations with hands-on coding exercises to teach reinforcement learning concepts.
![Screenshot 2025-11-07 at 1.03.27 PM](https://cdn-uploads.huggingface.co/production/uploads/68401f649e3f451260c68974/NhMi-ZlYJr3gua4opTStP.png)
## 🎯 What is this?
This is not a traditional ML model, but an **interactive educational application** built with PyQt5 that teaches you how to implement Dueling DQN from scratch. It's designed for learners who want to understand reinforcement learning through practical implementation.
## ✨ Features
- **Interactive Tutorial Interface**: PyQt5 GUI with section navigation and progress tracking
- **Comprehensive Theory**: Detailed explanations of Dueling DQN architecture and mathematics
- **Hands-on Exercises**: 8 coding exercises covering all implementation aspects
- **Progress Tracking**: Visual progress indicators and completion metrics
- **Code Validation**: Interactive code execution and solution checking
- **Visual Learning**: Architecture diagrams and training visualizations
## 🏗️ Architecture
### Dueling DQN Components Covered:
1. **Environment Setup** - Super Mario Bros environment with preprocessing
2. **Replay Memory** - Experience replay buffer implementation
3. **Neural Network** - Dueling architecture with separate value/advantage streams
4. **Training Algorithm** - DQN with target networks and epsilon-greedy exploration
5. **Reward Shaping** - Advanced reward transformation techniques
6. **Model Persistence** - Checkpoint saving and loading
7. **Hyperparameter Tuning** - Configuration management system
8. **Evaluation Metrics** - Comprehensive training analysis
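As a rough illustration of the replay memory component listed above, the sketch below uses a fixed-size `collections.deque` with uniform random sampling. The class name `ReplayMemory`, the `Transition` tuple, and the method names are assumptions made for this example, not the tutorial's actual code.

```python
import random
from collections import deque, namedtuple

# Hypothetical transition record; the field names are illustrative.
Transition = namedtuple("Transition", ["state", "action", "reward", "next_state", "done"])


class ReplayMemory:
    """Fixed-size buffer that drops the oldest transition once full."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size=32):
        # Uniform sampling breaks the correlation between consecutive frames.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


# Push 150 dummy transitions into a buffer that holds only 100.
memory = ReplayMemory(capacity=100)
for step in range(150):
    memory.push(state=step, action=step % 12, reward=1.0, next_state=step + 1, done=False)

batch = memory.sample(batch_size=32)
```

Because the deque has a `maxlen`, the first 50 transitions are evicted automatically, so every sampled state comes from the most recent 100 steps.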
### Network Architecture:
```python
DuelingDQN(
(conv1): Conv2d(4, 32, kernel_size=8, stride=4)
(conv2): Conv2d(32, 64, kernel_size=3, stride=1)
(fc_adv): Linear(20736, 512) # Advantage stream
(fc_val): Linear(20736, 512) # Value stream
(advantage): Linear(512, n_actions)
(value): Linear(512, 1)
)
```
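The `20736` input size of the two fully connected streams can be checked by hand. The sketch below assumes 84x84 input frames, no padding, and default dilation, which is consistent with the module printout above; the helper name `conv_out` is made up for this example.

```python
def conv_out(size, kernel, stride, padding=0):
    """Output length of one spatial dimension after a convolution."""
    return (size + 2 * padding - kernel) // stride + 1


side = 84                                   # input frames are 84x84
side = conv_out(side, kernel=8, stride=4)   # conv1 -> 20
side = conv_out(side, kernel=3, stride=1)   # conv2 -> 18
flattened = 64 * side * side                # 64 output channels from conv2

print(side, flattened)  # 18 20736
```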
## 🚀 Quick Start
### Installation
```bash
# Clone the repository
git clone https://github.com/TroglodyteDerivations/dueling-dqn-mario-tutorial.git
cd dueling-dqn-mario-tutorial
# Install dependencies
pip install -r requirements.txt
# Run the application
python duel_dqn_tutorial.py
```
### Requirements
```txt
torch>=1.9.0
gym-super-mario-bros>=7.3.0
nes-py>=8.1.0
PyQt5>=5.15.0
numpy>=1.21.0
opencv-python>=4.5.0
matplotlib>=3.5.0
```
## 📚 Tutorial Structure
### 8 Comprehensive Sections:
1. **Introduction** - Overview and setup
2. **Dueling DQN Theory** - Mathematical foundations
3. **Environment Setup** - Super Mario Bros configuration
4. **Replay Memory** - Experience buffer implementation
5. **Neural Network** - Dueling architecture build
6. **Training Algorithm** - DQN training loop
7. **Complete Implementation** - Full system integration
8. **Exercises** - Hands-on coding challenges
### 8 Interactive Exercises:
1. Replay Memory Implementation
2. Dueling DQN Model Architecture
3. Environment Wrapper
4. Training Loop with Epsilon-Greedy
5. Reward Shaping Functions
6. Model Saving/Loading System
7. Hyperparameter Configuration
8. Evaluation Metrics System
## 🎮 Environment Details
- **Game**: Super Mario Bros (NES)
- **Action Space**: 12 complex movements
- **Observation**: 4 stacked frames (84x84 grayscale)
- **Reward Structure**: Distance, coins, enemies, level completion
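The reward components listed above could be combined along the following lines. This is only a sketch: the function name `shape_reward`, the weights, and the previous-step bookkeeping are hypothetical, the enemy term is left out, and the keys `x_pos`, `coins`, and `flag_get` are assumed to come from the `info` dictionary that `gym-super-mario-bros` returns on each step.

```python
def shape_reward(info, prev_info, env_reward=0.0):
    """Combine the environment reward with hypothetical bonus terms."""
    progress = info["x_pos"] - prev_info["x_pos"]   # distance travelled this step
    coins = info["coins"] - prev_info["coins"]      # coins collected this step
    bonus = 0.1 * progress + 1.0 * coins            # illustrative weights
    if info["flag_get"]:                            # level completed
        bonus += 50.0
    return env_reward + bonus


# Two consecutive (made-up) info dictionaries.
prev = {"x_pos": 40, "coins": 0, "flag_get": False}
curr = {"x_pos": 50, "coins": 1, "flag_get": False}
reward = shape_reward(curr, prev, env_reward=1.0)
```

The weights here are placeholders; in practice they would need tuning so that no single bonus dominates the environment's own reward.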
### Action Space (COMPLEX_MOVEMENT):
```python
['NOOP', 'RIGHT', 'RIGHT+A', 'RIGHT+B', 'RIGHT+A+B',
'A', 'LEFT', 'LEFT+A', 'LEFT+B', 'LEFT+A+B',
'DOWN', 'UP']
```
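Choosing among these 12 actions with epsilon-greedy exploration might look like the sketch below. The function name `select_action` and the plain-list Q-values are assumptions for illustration; a real agent would take the Q-values from the network's output.

```python
import random

N_ACTIONS = 12  # size of COMPLEX_MOVEMENT


def select_action(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


q_values = [0.0] * N_ACTIONS
q_values[2] = 1.5  # pretend the network prefers index 2 ('RIGHT+A')

greedy = select_action(q_values, epsilon=0.0)    # always exploits
explore = select_action(q_values, epsilon=1.0)   # always explores
```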
## 🧠 Dueling DQN Theory
### Key Innovation:
```python
Q(s,a) = V(s) + A(s,a) - mean(A(s,·))
```
**Benefits over Standard DQN**:
- Better action generalization
- More stable learning
- Faster convergence
- Separate state value and action advantage learning
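To make the aggregation formula concrete, here is a small worked example in plain Python. The function name `dueling_q` and the numbers are made up for illustration; in the network the same arithmetic runs on the outputs of the value and advantage heads.

```python
def dueling_q(value, advantages):
    """Combine V(s) and A(s, a) into Q(s, a) with mean-advantage subtraction."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


value = 2.0
advantages = [1.0, 0.0, -1.0, 4.0]   # mean is 1.0
q = dueling_q(value, advantages)
print(q)  # [2.0, 1.0, 0.0, 5.0]
```

Subtracting the mean advantage forces the Q-values to average out to `V(s)`, which keeps the value and advantage streams from absorbing arbitrary offsets from each other.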
## ⚙️ Training Configuration
```python
# Default Hyperparameters
learning_rate = 0.0001
gamma = 0.99
batch_size = 32
buffer_size = 10000
epsilon_start = 1.0
epsilon_end = 0.01
epsilon_decay = 0.995
target_update = 1000
```
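To get a feel for the exploration schedule, the sketch below counts how many decay steps it takes for epsilon to fall from `epsilon_start` to `epsilon_end`. It assumes the decay is multiplicative and applied once per episode, which the configuration above does not state explicitly.

```python
epsilon_start = 1.0
epsilon_end = 0.01
epsilon_decay = 0.995

epsilon = epsilon_start
episodes = 0
while epsilon > epsilon_end:
    epsilon *= epsilon_decay   # assumed: one decay step per episode
    episodes += 1

print(episodes)  # 919
```

Under that assumption the agent explores for roughly the first 900 episodes; if the decay were applied per environment step instead, epsilon would hit its floor much earlier.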
## 📊 Performance
### Expected Learning Progress:
- **Episodes 0-1000**: Basic movement learning
- **Episodes 1000-5000**: Enemy avoidance and coin collection
- **Episodes 5000+**: Level navigation and completion
### Sample Training Output:
```
cuda | Episode: 100 | Score: 256.8 | Loss: 1.23 | Stage: 1-1
cuda | Episode: 500 | Score: 512.1 | Loss: 0.87 | Stage: 1-2
cuda | Episode: 1000 | Score: 890.4 | Loss: 0.45 | Stage: 2-1
```
## 🛠️ Usage Examples
### Running the Tutorial:
```python
from duel_dqn_tutorial import DuelingDQNTutorialApp
import sys
from PyQt5.QtWidgets import QApplication
app = QApplication(sys.argv)
window = DuelingDQNTutorialApp()
window.show()
sys.exit(app.exec_())
```
### Training a Model:
```python
from mario_dqn import MarioDQNAgent
agent = MarioDQNAgent()
scores = agent.train(episodes=10000)
agent.save_model('mario_dqn_final.pth')
```
## 🎯 Educational Value
This tutorial helps you understand:
- **Reinforcement Learning Fundamentals**: MDP, Q-learning, policy optimization
- **Deep Q-Networks**: Value approximation with neural networks
- **Dueling Architecture**: Value/advantage decomposition theory
- **Experience Replay**: Importance of uncorrelated training samples
- **Target Networks**: Stabilizing training with delayed updates
- **Reward Engineering**: Shaping rewards for better learning
- **Hyperparameter Tuning**: Systematic configuration optimization
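The role of the target network in the list above can be illustrated with the one-step TD target. The sketch assumes the standard DQN target with `gamma = 0.99`; the function name `td_target` and the sample Q-values are hypothetical.

```python
def td_target(reward, next_q_target, done, gamma=0.99):
    """One-step target: r + gamma * max_a' Q_target(s', a'), or just r at episode end."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


# Q-values for the next state, as a frozen target network might produce them.
next_q_target = [0.5, 2.0, 1.0]

y_mid = td_target(reward=1.0, next_q_target=next_q_target, done=False)
y_end = td_target(reward=1.0, next_q_target=next_q_target, done=True)
```

Because `next_q_target` comes from a copy of the network that is refreshed only occasionally, the regression target does not shift on every gradient step, which is where the stabilizing effect comes from.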
## 📁 Project Structure
```
dueling-dqn-mario-tutorial/
├── duel_dqn_tutorial.py    # Main PyQt5 application
├── mario_dqn.py            # DQN implementation
├── wrappers.py             # Environment wrappers
├── models/                 # Saved model checkpoints
├── exercises/              # Exercise solutions
├── requirements.txt        # Dependencies
└── README.md               # This file
```
## 🤝 Contributing
We welcome contributions! Areas for improvement:
- Additional exercise variations
- More visualization tools
- Performance optimizations
- Additional game environments
- Multi-agent implementations
## 📜 Citation
If you use this tutorial in your research or teaching, please cite:
```bibtex
@software{dueling_dqn_mario_tutorial,
title = {PyQt5 Dueling DQN Mario Tutorial},
author = {Martin Rivera},
year = {2025},
url = {https://huggingface.co/TroglodyteDerivations/Interactive_Dueling_DQN_Mario_Tutorial}
}
```
## 📄 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## 🙏 Acknowledgments
- Nintendo for Super Mario Bros
- OpenAI Gym for the reinforcement learning framework
- PyTorch team for the deep learning framework
- PyQt5 team for the GUI framework
- Flux.1-krea.dev for architecture visualizations
---
**Happy Learning!** 🎮✨
*Master reinforcement learning by building an AI that can play Super Mario Bros!*
```
## Additional Files for Your Repository:
### requirements.txt
```txt
torch>=1.9.0
gym-super-mario-bros>=7.3.0
nes-py>=8.1.0
PyQt5>=5.15.0
numpy>=1.21.0
opencv-python>=4.5.0
matplotlib>=3.5.0
Pillow>=8.3.0
pygame>=2.0.0
```