TroglodyteDerivations committed on
Commit c7ef353 · verified · 1 Parent(s): da941be

Update README.md

  1. README.md +302 -3
README.md CHANGED
@@ -1,3 +1,302 @@
1
- ---
2
- license: mit
3
- ---
1
+ ---
2
+ license: mit
3
+ ---
5
+
6
+ ```markdown
7
+ ---
8
+ library_name: pytorch
9
+ tags:
10
+ - reinforcement-learning
11
+ - dueling-dqn
12
+ - super-mario-bros
13
+ - pytorch
14
+ - pyqt5
15
+ - tutorial
16
+ - educational
17
+ - interactive-learning
18
+ ---
19
+
20
+ # PyQt5 Dueling DQN Mario Tutorial - Interactive Learning Application
21
+
22
+ ## Model Overview
23
+
24
+ An interactive PyQt5 desktop application that provides a comprehensive tutorial for implementing Dueling Deep Q-Networks to play Super Mario Bros. This educational tool combines theoretical explanations with hands-on coding exercises to teach reinforcement learning concepts.
25
+
26
+
27
+ ![Screenshot 2025-11-07 at 1.03.27 PM](https://cdn-uploads.huggingface.co/production/uploads/68401f649e3f451260c68974/NhMi-ZlYJr3gua4opTStP.png)
28
+
29
+ ## 🎯 What is this?
30
+
31
+ This is not a traditional ML model, but an **interactive educational application** built with PyQt5 that teaches you how to implement Dueling DQN from scratch. It's designed for learners who want to understand reinforcement learning through practical implementation.
32
+
33
+ ## ✨ Features
34
+
35
+ - **Interactive Tutorial Interface**: Beautiful PyQt5 GUI with navigation and progress tracking
36
+ - **Comprehensive Theory**: Detailed explanations of Dueling DQN architecture and mathematics
37
+ - **Hands-on Exercises**: 8 coding exercises covering all implementation aspects
38
+ - **Progress Tracking**: Visual progress indicators and completion metrics
39
+ - **Code Validation**: Interactive code execution and solution checking
40
+ - **Visual Learning**: Architecture diagrams and training visualizations
41
+
42
+ ## 🏗️ Architecture
43
+
44
+ ### Dueling DQN Components Covered:
45
+
46
+ 1. **Environment Setup** - Super Mario Bros environment with preprocessing
47
+ 2. **Replay Memory** - Experience replay buffer implementation
48
+ 3. **Neural Network** - Dueling architecture with separate value/advantage streams
49
+ 4. **Training Algorithm** - DQN with target networks and epsilon-greedy exploration
50
+ 5. **Reward Shaping** - Advanced reward transformation techniques
51
+ 6. **Model Persistence** - Checkpoint saving and loading
52
+ 7. **Hyperparameter Tuning** - Configuration management system
53
+ 8. **Evaluation Metrics** - Comprehensive training analysis
54
+
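The replay memory listed as item 2 can be sketched with the standard library alone. This is a minimal illustration, not the tutorial's own code: the `Transition` and `ReplayMemory` names and the `push`/`sample` methods are assumptions, and the default capacity simply mirrors the `buffer_size = 10000` value from the Training Configuration section.

```python
import random
from collections import deque, namedtuple

# Hypothetical record type; the field names are illustrative.
Transition = namedtuple(
    "Transition", ["state", "action", "reward", "next_state", "done"]
)


class ReplayMemory:
    """Fixed-size FIFO buffer with uniform random sampling."""

    def __init__(self, capacity=10000):
        # deque(maxlen=...) drops the oldest item once the buffer is full.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Sampling without replacement breaks the temporal correlation
        # between consecutive frames.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


memory = ReplayMemory(capacity=3)
for step in range(5):
    memory.push(state=step, action=0, reward=1.0, next_state=step + 1, done=False)

print(len(memory))  # 3: the two oldest transitions were evicted
print(sorted(t.state for t in memory.sample(3)))  # [2, 3, 4]
```

In a real agent the `state` fields would hold stacked frame arrays rather than integers; the integers here only make the eviction order visible.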
55
+ ### Network Architecture:
56
+ ```python
57
+ DuelingDQN(
58
+ (conv1): Conv2d(4, 32, kernel_size=8, stride=4)
59
+ (conv2): Conv2d(32, 64, kernel_size=3, stride=1)
60
+ (fc_adv): Linear(20736, 512) # Advantage stream
61
+ (fc_val): Linear(20736, 512) # Value stream
62
+ (advantage): Linear(512, n_actions)
63
+ (value): Linear(512, 1)
64
+ )
65
+ ```
66
+
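The `20736` input size of the two fully connected streams follows from the convolution arithmetic. The sketch below reproduces it, assuming the 84x84 input stated under Environment Details and no padding or dilation (neither appears in the layer listing). The `conv2d_out` helper is a hypothetical name used only for this illustration.

```python
def conv2d_out(size: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Output width/height of a square convolution with dilation 1."""
    return (size + 2 * padding - kernel) // stride + 1


side = 84                                    # 84x84 grayscale frames
side = conv2d_out(side, kernel=8, stride=4)  # conv1: 84 -> 20
side = conv2d_out(side, kernel=3, stride=1)  # conv2: 20 -> 18

out_channels = 64                            # channels produced by conv2
flat_features = out_channels * side * side
print(flat_features)  # 20736
```

If the frame size or either kernel/stride pair changes, the `Linear(20736, 512)` layers need the recomputed value.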
67
+ ## 🚀 Quick Start
68
+
69
+ ### Installation
70
+
71
+ ```bash
72
+ # Clone the repository
73
+ git clone https://github.com/TroglodyteDerivations/dueling-dqn-mario-tutorial.git
74
+ cd dueling-dqn-mario-tutorial
75
+
76
+ # Install dependencies
77
+ pip install -r requirements.txt
78
+
79
+ # Run the application
80
+ python duel_dqn_tutorial.py
81
+ ```
82
+
83
+ ### Requirements
84
+
85
+ ```txt
86
+ torch>=1.9.0
87
+ gym-super-mario-bros>=7.3.0
88
+ nes-py>=8.1.0
89
+ PyQt5>=5.15.0
90
+ numpy>=1.21.0
91
+ opencv-python>=4.5.0
92
+ matplotlib>=3.5.0
93
+ ```
94
+
95
+ ## 📚 Tutorial Structure
96
+
97
+ ### 8 Comprehensive Sections:
98
+
99
+ 1. **Introduction** - Overview and setup
100
+ 2. **Dueling DQN Theory** - Mathematical foundations
101
+ 3. **Environment Setup** - Super Mario Bros configuration
102
+ 4. **Replay Memory** - Experience buffer implementation
103
+ 5. **Neural Network** - Dueling architecture build
104
+ 6. **Training Algorithm** - DQN training loop
105
+ 7. **Complete Implementation** - Full system integration
106
+ 8. **Exercises** - Hands-on coding challenges
107
+
108
+ ### 8 Interactive Exercises:
109
+
110
+ 1. Replay Memory Implementation
111
+ 2. Dueling DQN Model Architecture
112
+ 3. Environment Wrapper
113
+ 4. Training Loop with Epsilon-Greedy
114
+ 5. Reward Shaping Functions
115
+ 6. Model Saving/Loading System
116
+ 7. Hyperparameter Configuration
117
+ 8. Evaluation Metrics System
118
+
119
+ ## 🎮 Environment Details
120
+
121
+ **Game**: Super Mario Bros (NES)
122
+ **Action Space**: 12 discrete actions (`COMPLEX_MOVEMENT`)
123
+ **Observation**: 4 stacked frames (84x84 grayscale)
124
+ **Reward Structure**: Distance, coins, enemies, level completion
125
+
126
+ ### Action Space (COMPLEX_MOVEMENT):
127
+ ```python
128
+ ['NOOP', 'RIGHT', 'RIGHT+A', 'RIGHT+B', 'RIGHT+A+B',
129
+ 'A', 'LEFT', 'LEFT+A', 'LEFT+B', 'LEFT+A+B',
130
+ 'DOWN', 'UP']
131
+ ```
132
+
133
+ ## 🧠 Dueling DQN Theory
134
+
135
+ ### Key Innovation:
136
+ ```python
137
+ Q(s,a) = V(s) + A(s,a) - mean(A(s,·))
138
+ ```
139
+
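As a worked instance of the aggregation formula above, the following plain-Python sketch combines one state value with a vector of advantages. The function name `dueling_q_values` is hypothetical. A real implementation would apply the same arithmetic to batched tensors produced by the `value` and `advantage` layers.

```python
from typing import List, Sequence


def dueling_q_values(value: float, advantages: Sequence[float]) -> List[float]:
    """Q(s,a) = V(s) + A(s,a) - mean(A(s,.)) for a single state."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


q = dueling_q_values(value=2.0, advantages=[1.0, 0.0, -1.0])
print(q)  # [3.0, 2.0, 1.0]

# Adding a constant to every advantage leaves Q unchanged, which is why
# the mean is subtracted: it makes the V/A split identifiable.
shifted = dueling_q_values(value=2.0, advantages=[11.0, 10.0, 9.0])
print(shifted == q)  # True
```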
140
+ **Benefits over Standard DQN**:
141
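+ - Better generalization across actions: V(s) is updated on every transition, whichever action was taken
142
+ - Often more stable learning when many actions have similar values
143
+ - Often faster convergence than a single-stream DQN on the same task
144
+ - State value and action advantages are learned in separate streams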
145
+
146
+ ## ⚙️ Training Configuration
147
+
148
+ ```python
149
+ # Default Hyperparameters
150
+ learning_rate = 0.0001
151
+ gamma = 0.99
152
+ batch_size = 32
153
+ buffer_size = 10000
154
+ epsilon_start = 1.0
155
+ epsilon_end = 0.01
156
+ epsilon_decay = 0.995
157
+ target_update = 1000
158
+ ```
159
+
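To see what the exploration settings imply, the sketch below evaluates the epsilon schedule. It assumes that `epsilon_decay` is applied multiplicatively once per episode and floored at `epsilon_end`. The configuration block does not state this, so treat it as one plausible reading. Both function names are hypothetical.

```python
def epsilon_at(episode: int, start: float = 1.0, end: float = 0.01,
               decay: float = 0.995) -> float:
    """Exploration rate after `episode` multiplicative decays, floored at `end`."""
    return max(end, start * decay ** episode)


def episodes_until_floor(start: float = 1.0, end: float = 0.01,
                         decay: float = 0.995) -> int:
    """Number of decays needed before epsilon reaches the floor."""
    eps, n = start, 0
    while eps > end:
        eps *= decay
        n += 1
    return n


for episode in (0, 100, 500, 1000):
    print(episode, round(epsilon_at(episode), 4))

print(episodes_until_floor())  # 919
```

Under this per-episode assumption the agent explores heavily for the first few hundred episodes and acts almost greedily after roughly 900. If the decay were instead applied per environment step, the floor would be reached within the first few episodes.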
160
+ ## 📊 Performance
161
+
162
+ ### Expected Learning Progress:
163
+ - **Episodes 0-1000**: Basic movement learning
164
+ - **Episodes 1000-5000**: Enemy avoidance and coin collection
165
+ - **Episodes 5000+**: Level navigation and completion
166
+
167
+ ### Sample Training Output:
168
+ ```
169
+ cuda | Episode: 100 | Score: 256.8 | Loss: 1.23 | Stage: 1-1
170
+ cuda | Episode: 500 | Score: 512.1 | Loss: 0.87 | Stage: 1-2
171
+ cuda | Episode: 1000 | Score: 890.4 | Loss: 0.45 | Stage: 2-1
172
+ ```
173
+
174
+ ## 🛠️ Usage Examples
175
+
176
+ ### Running the Tutorial:
177
+ ```python
178
+ from duel_dqn_tutorial import DuelingDQNTutorialApp
179
+ import sys
180
+ from PyQt5.QtWidgets import QApplication
181
+
182
+ app = QApplication(sys.argv)
183
+ window = DuelingDQNTutorialApp()
184
+ window.show()
185
+ sys.exit(app.exec_())
186
+ ```
187
+
188
+ ### Training a Model:
189
+ ```python
190
+ from mario_dqn import MarioDQNAgent
191
+
192
+ agent = MarioDQNAgent()
193
+ scores = agent.train(episodes=10000)
194
+ agent.save_model('mario_dqn_final.pth')
195
+ ```
196
+
197
+ ## 🎯 Educational Value
198
+
199
+ This tutorial helps you understand:
200
+
201
+ - **Reinforcement Learning Fundamentals**: MDP, Q-learning, policy optimization
202
+ - **Deep Q-Networks**: Value approximation with neural networks
203
+ - **Dueling Architecture**: Value/advantage decomposition theory
204
+ - **Experience Replay**: Importance of uncorrelated training samples
205
+ - **Target Networks**: Stabilizing training with delayed updates
206
+ - **Reward Engineering**: Shaping rewards for better learning
207
+ - **Hyperparameter Tuning**: Systematic configuration optimization
208
+
209
+ ## 📁 Project Structure
210
+
211
+ ```
212
+ dueling-dqn-mario-tutorial/
213
+ ├── duel_dqn_tutorial.py # Main PyQt5 application
214
+ ├── mario_dqn.py # DQN implementation
215
+ ├── wrappers.py # Environment wrappers
216
+ ├── models/ # Saved model checkpoints
217
+ ├── exercises/ # Exercise solutions
218
+ ├── requirements.txt # Dependencies
219
+ └── README.md # This file
220
+ ```
221
+
222
+ ## 🤝 Contributing
223
+
224
+ We welcome contributions! Areas for improvement:
225
+
226
+ - Additional exercise variations
227
+ - More visualization tools
228
+ - Performance optimizations
229
+ - Additional game environments
230
+ - Multi-agent implementations
231
+
232
+ ## 📜 Citation
233
+
234
+ If you use this tutorial in your research or teaching, please cite:
235
+
236
+ ```bibtex
237
+ @software{dueling_dqn_mario_tutorial,
238
+ title = {PyQt5 Dueling DQN Mario Tutorial},
239
+ author = {Martin Rivera},
240
+ year = {2025},
241
+ url = {https://huggingface.co/TroglodyteDerivations/Interactive_Dueling_DQN_Mario_Tutorial}
242
+ }
243
+ ```
244
+
245
+ ## 📄 License
246
+
247
+ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
248
+
249
+ ## 🙏 Acknowledgments
250
+
251
+ - Nintendo for Super Mario Bros
252
+ - OpenAI Gym for the reinforcement learning framework
253
+ - PyTorch team for the deep learning framework
254
+ - PyQt5 team for the GUI framework
255
+ - Flux.1-krea.dev for architecture visualizations
256
+
257
+ ---
258
+
259
+ **Happy Learning!** 🎮✨
260
+
261
+ *Master reinforcement learning by building an AI that can play Super Mario Bros!*
262
+ ```
263
+
264
+ ## Additional Files for Your Repository:
265
+
266
+ ### requirements.txt
267
+ ```txt
268
+ torch>=1.9.0
269
+ gym-super-mario-bros>=7.3.0
270
+ nes-py>=8.1.0
271
+ PyQt5>=5.15.0
272
+ numpy>=1.21.0
273
+ opencv-python>=4.5.0
274
+ matplotlib>=3.5.0
275
+ Pillow>=8.3.0
276
+ pygame>=2.0.0
277
+ ```
278
+
279
+ ### README.md (Simplified version)
280
+ ```markdown
281
+ # PyQt5 Dueling DQN Mario Tutorial
282
+
283
+ An interactive desktop application that teaches Dueling Deep Q-Networks through Super Mario Bros implementation.
284
+
285
+ ## Quick Start
286
+ ```bash
287
+ pip install -r requirements.txt
288
+ python duel_dqn_tutorial.py
289
+ ```
290
+
291
+ ## Features
292
+ - Interactive PyQt5 GUI
293
+ - 8 comprehensive tutorial sections
294
+ - Hands-on coding exercises
295
+ - Progress tracking
296
+ - Visual learning aids
297
+
298
+ ## License
299
+ MIT
300
+ ```
301
+