---
license: mit
library_name: pytorch
tags:
- reinforcement-learning
- dueling-dqn
- super-mario-bros
- pytorch
- pyqt5
- tutorial
- educational
- interactive-learning
---

# PyQt5 Dueling DQN Mario Tutorial - Interactive Learning Application

## Model Overview

An interactive PyQt5 desktop application that provides a comprehensive tutorial for implementing Dueling Deep Q-Networks to play Super Mario Bros. This educational tool combines theoretical explanations with hands-on coding exercises to teach reinforcement learning concepts.


![Screenshot 2025-11-07 at 1.03.27 PM](https://cdn-uploads.huggingface.co/production/uploads/68401f649e3f451260c68974/NhMi-ZlYJr3gua4opTStP.png)

## 🎯 What is this?

This is not a traditional ML model, but an **interactive educational application** built with PyQt5 that teaches you how to implement Dueling DQN from scratch. It's designed for learners who want to understand reinforcement learning through practical implementation.

## ✨ Features

- **Interactive Tutorial Interface**: Beautiful PyQt5 GUI with navigation and progress tracking
- **Comprehensive Theory**: Detailed explanations of Dueling DQN architecture and mathematics
- **Hands-on Exercises**: 8 coding exercises covering all implementation aspects
- **Progress Tracking**: Visual progress indicators and completion metrics
- **Code Validation**: Interactive code execution and solution checking
- **Visual Learning**: Architecture diagrams and training visualizations

## ๐Ÿ—๏ธ Architecture

### Dueling DQN Components Covered:

1. **Environment Setup** - Super Mario Bros environment with preprocessing
2. **Replay Memory** - Experience replay buffer implementation
3. **Neural Network** - Dueling architecture with separate value/advantage streams
4. **Training Algorithm** - DQN with target networks and epsilon-greedy exploration
5. **Reward Shaping** - Advanced reward transformation techniques
6. **Model Persistence** - Checkpoint saving and loading
7. **Hyperparameter Tuning** - Configuration management system
8. **Evaluation Metrics** - Comprehensive training analysis

### Network Architecture:
```python
DuelingDQN(
  (conv1): Conv2d(4, 32, kernel_size=8, stride=4)
  (conv2): Conv2d(32, 64, kernel_size=3, stride=1)
  (fc_adv): Linear(20736, 512)  # Advantage stream
  (fc_val): Linear(20736, 512)  # Value stream
  (advantage): Linear(512, n_actions)
  (value): Linear(512, 1)
)
```

## 🚀 Quick Start

### Installation

```bash
# Clone the repository
git clone https://github.com/TroglodyteDerivations/dueling-dqn-mario-tutorial.git
cd dueling-dqn-mario-tutorial

# Install dependencies
pip install -r requirements.txt

# Run the application
python duel_dqn_tutorial.py
```

### Requirements

```txt
torch>=1.9.0
gym-super-mario-bros>=7.3.0
nes-py>=8.1.0
PyQt5>=5.15.0
numpy>=1.21.0
opencv-python>=4.5.0
matplotlib>=3.5.0
Pillow>=8.3.0
pygame>=2.0.0
```

## 📚 Tutorial Structure

### 8 Comprehensive Sections:

1. **Introduction** - Overview and setup
2. **Dueling DQN Theory** - Mathematical foundations
3. **Environment Setup** - Super Mario Bros configuration
4. **Replay Memory** - Experience buffer implementation
5. **Neural Network** - Dueling architecture build
6. **Training Algorithm** - DQN training loop
7. **Complete Implementation** - Full system integration
8. **Exercises** - Hands-on coding challenges

### 8 Interactive Exercises:

1. Replay Memory Implementation
2. Dueling DQN Model Architecture
3. Environment Wrapper
4. Training Loop with Epsilon-Greedy
5. Reward Shaping Functions
6. Model Saving/Loading System
7. Hyperparameter Configuration
8. Evaluation Metrics System
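To give a flavour of Exercise 1, here is a minimal replay-memory sketch (the class name and method signatures are illustrative, not the tutorial's exact code), assuming a deque-backed buffer with uniform random sampling:

```python
import random
from collections import deque

class ReplayMemory:
    """Fixed-size experience buffer with uniform random sampling (sketch)."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest transitions are evicted automatically

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling breaks the temporal correlation between consecutive transitions
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

memory = ReplayMemory(capacity=3)
for t in range(5):
    memory.push(t, 0, 1.0, t + 1, False)
print(len(memory))  # 3 — the capacity caps the buffer
```

The `deque(maxlen=...)` trick keeps the buffer bounded without any manual index bookkeeping.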

## 🎮 Environment Details

**Game**: Super Mario Bros (NES)  
**Action Space**: 12 complex movements  
**Observation**: 4 stacked frames (84x84 grayscale)  
**Reward Structure**: Distance, coins, enemies, level completion
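The 4×84×84 observation above comes from stacking the most recent preprocessed frames. A minimal NumPy sketch (hypothetical helper; grayscale conversion and resizing are omitted):

```python
import numpy as np
from collections import deque

class FrameStack:
    """Keep the last k preprocessed frames as one (k, 84, 84) observation (sketch)."""

    def __init__(self, k=4):
        self.k = k
        self.frames = deque(maxlen=k)

    def reset(self, frame):
        # On reset, fill the stack with copies of the first frame
        for _ in range(self.k):
            self.frames.append(frame)
        return self.observation()

    def step(self, frame):
        self.frames.append(frame)
        return self.observation()

    def observation(self):
        return np.stack(self.frames, axis=0)

stack = FrameStack()
obs = stack.reset(np.zeros((84, 84), dtype=np.uint8))
print(obs.shape)  # (4, 84, 84)
```

Stacking frames gives the network short-term motion information (e.g. whether Mario is rising or falling) that a single frame cannot convey.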

### Action Space (COMPLEX_MOVEMENT):
```python
['NOOP', 'RIGHT', 'RIGHT+A', 'RIGHT+B', 'RIGHT+A+B', 
 'A', 'LEFT', 'LEFT+A', 'LEFT+B', 'LEFT+A+B', 
 'DOWN', 'UP']
```
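Reward shaping (Exercise 5) transforms the raw environment reward before it is stored in the replay buffer. A hypothetical sketch in the common style of rewarding forward progress and penalizing deaths (the coefficients and the `shape_reward` helper are illustrative assumptions, not the tutorial's exact values):

```python
def shape_reward(raw_reward, info, prev_x, died, level_complete):
    """Hypothetical reward shaping: scale progress, bonus/penalty for terminal events."""
    reward = raw_reward
    reward += 0.1 * (info.get("x_pos", prev_x) - prev_x)  # reward rightward progress
    if died:
        reward -= 50.0   # strong penalty for losing a life
    if level_complete:
        reward += 100.0  # large bonus for reaching the flag
    return max(min(reward, 15.0), -15.0)  # clip to keep TD targets bounded

print(shape_reward(1.0, {"x_pos": 120}, prev_x=100, died=False, level_complete=False))  # 3.0
```

Clipping the shaped reward keeps the TD targets on a consistent scale, which tends to stabilize Q-learning.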

## 🧠 Dueling DQN Theory

### Key Innovation:
```python
Q(s,a) = V(s) + A(s,a) - mean(A(s,·))
```
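A quick NumPy check of the aggregation: subtracting the mean advantage makes the decomposition identifiable, because any constant offset added to the advantage stream cancels out of Q:

```python
import numpy as np

def dueling_q(value, advantages):
    """Combine scalar state value V(s) with per-action advantages A(s, a)."""
    return value + advantages - advantages.mean()

adv = np.array([1.0, 3.0, 2.0])
q = dueling_q(5.0, adv)
print(q)  # [4. 6. 5.]

# A constant offset in the advantage stream cancels: Q is unchanged.
q_shifted = dueling_q(5.0, adv + 10.0)
print(np.allclose(q, q_shifted))  # True
```

Without the mean subtraction, V and A could trade a constant back and forth while producing identical Q-values, which makes training unstable.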

**Benefits over Standard DQN**:
- Better action generalization
- More stable learning
- Faster convergence
- Separate state value and action advantage learning

## โš™๏ธ Training Configuration

```python
# Default Hyperparameters
learning_rate = 0.0001
gamma = 0.99
batch_size = 32
buffer_size = 10000
epsilon_start = 1.0
epsilon_end = 0.01
epsilon_decay = 0.995
target_update = 1000
```
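With these defaults, the exploration rate decays multiplicatively each episode but never drops below `epsilon_end`. A small sketch of the resulting schedule (the generator is illustrative; the tutorial may apply the decay per step instead of per episode):

```python
def epsilon_schedule(episodes, start=1.0, end=0.01, decay=0.995):
    """Yield the epsilon-greedy exploration rate for each episode (sketch)."""
    eps = start
    for _ in range(episodes):
        yield eps
        eps = max(end, eps * decay)  # the floor keeps a little exploration forever

eps = list(epsilon_schedule(1000))
print(round(eps[0], 3), round(eps[100], 3), round(eps[999], 3))  # 1.0 0.606 0.01
```

With `decay = 0.995`, epsilon reaches the 0.01 floor after roughly 920 episodes, after which the agent acts greedily 99% of the time.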

## 📊 Performance

### Expected Learning Progress:
- **Episodes 0-1000**: Basic movement learning
- **Episodes 1000-5000**: Enemy avoidance and coin collection
- **Episodes 5000+**: Level navigation and completion

### Sample Training Output:
```
cuda | Episode: 100 | Score: 256.8 | Loss: 1.23 | Stage: 1-1
cuda | Episode: 500 | Score: 512.1 | Loss: 0.87 | Stage: 1-2
cuda | Episode: 1000 | Score: 890.4 | Loss: 0.45 | Stage: 2-1
```
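Per-episode scores like these are noisy, so learning progress is usually tracked with a moving average over recent episodes (the spirit of Exercise 8). A minimal sketch with a hypothetical `RunningScore` helper:

```python
from collections import deque

class RunningScore:
    """Track the moving average of the last `window` episode scores (sketch)."""

    def __init__(self, window=100):
        self.scores = deque(maxlen=window)

    def add(self, score):
        self.scores.append(score)

    def average(self):
        return sum(self.scores) / len(self.scores) if self.scores else 0.0

tracker = RunningScore(window=3)
for s in [100.0, 200.0, 300.0, 400.0]:
    tracker.add(s)
print(tracker.average())  # 300.0 — average of the last 3 scores
```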

## ๐Ÿ› ๏ธ Usage Examples

### Running the Tutorial:
```python
from duel_dqn_tutorial import DuelingDQNTutorialApp
import sys
from PyQt5.QtWidgets import QApplication

app = QApplication(sys.argv)
window = DuelingDQNTutorialApp()
window.show()
sys.exit(app.exec_())
```

### Training a Model:
```python
from mario_dqn import MarioDQNAgent

agent = MarioDQNAgent()
scores = agent.train(episodes=10000)
agent.save_model('mario_dqn_final.pth')
```

## 🎯 Educational Value

This tutorial helps you understand:

- **Reinforcement Learning Fundamentals**: MDP, Q-learning, policy optimization
- **Deep Q-Networks**: Value approximation with neural networks
- **Dueling Architecture**: Value/advantage decomposition theory
- **Experience Replay**: Importance of uncorrelated training samples
- **Target Networks**: Stabilizing training with delayed updates
- **Reward Engineering**: Shaping rewards for better learning
- **Hyperparameter Tuning**: Systematic configuration optimization

## ๐Ÿ“ Project Structure

```
dueling-dqn-mario-tutorial/
├── duel_dqn_tutorial.py     # Main PyQt5 application
├── mario_dqn.py             # DQN implementation
├── wrappers.py              # Environment wrappers
├── models/                  # Saved model checkpoints
├── exercises/               # Exercise solutions
├── requirements.txt         # Dependencies
└── README.md                # This file
```

## ๐Ÿค Contributing

We welcome contributions! Areas for improvement:

- Additional exercise variations
- More visualization tools
- Performance optimizations
- Additional game environments
- Multi-agent implementations

## 📜 Citation

If you use this tutorial in your research or teaching, please cite:

```bibtex
@software{dueling_dqn_mario_tutorial,
  title = {PyQt5 Dueling DQN Mario Tutorial},
  author = {Martin Rivera},
  year = {2025},
  url = {https://huggingface.co/TroglodyteDerivations/Interactive_Dueling_DQN_Mario_Tutorial}
}
```

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## ๐Ÿ™ Acknowledgments

- Nintendo for Super Mario Bros
- OpenAI Gym for the reinforcement learning framework
- PyTorch team for the deep learning framework
- PyQt5 team for the GUI framework
- Flux.1-krea.dev for architecture visualizations

---

**Happy Learning!** 🎮✨

*Master reinforcement learning by building an AI that can play Super Mario Bros!*