Mini RL Game — DQN (Vector + Pixels)

A simple Pygame environment with a DQN agent that learns two scenarios. It is an educational RL example, intended for quick experimentation with DQN on a minimal game.

  • Eat: Catch falling objects.
  • Avoid: Dodge falling objects as long as possible.

See the GitHub link for the full project; the trained models can be downloaded here.

Observation Types

  • Vector (MLP): Compact state per enemy with normalized deltas.
  • Pixels (CNN): Raw frames (84×84 grayscale) stacked over 4 frames.
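
For concreteness, the vector observation could be assembled as in the minimal sketch below (function and argument names are illustrative, not the project's actual API); the pixel preprocessing is sketched under Environment further down.

```python
import numpy as np

def vector_obs(player_x, player_y, enemies, width, height):
    """Build the MLP observation: one normalized (dx, dy) pair per enemy."""
    features = []
    for ex, ey in enemies:                           # enemies: list of (x, y) positions
        features.append((ex - player_x) / width)    # Δx / width
        features.append((ey - player_y) / height)   # Δy / height
    return np.array(features, dtype=np.float32)     # shape: (2 * N_enemies,)
```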

✅ Checkpoints

Vector (MLP)

| Scenario | Episodes | Enemies | File                   |
|----------|----------|---------|------------------------|
| Eat      | 1000     | 4       | model_vector_eat.h5    |
| Avoid    | 3000     | 8       | model_vector_avoid.h5  |

Pixels (CNN)

| Scenario | Episodes | Enemies | File                   |
|----------|----------|---------|------------------------|
| Eat      | 1000     | 4       | model_pixels_eat.h5    |
| Avoid    | 3000     | 8       | model_pixels_avoid.h5  |

🧠 Model Architecture

Vector (MLP) DQN

  • Input: 2 * N_enemies features (per enemy: Δx/width, Δy/height).
  • Network:
    Dense(128, relu) → Dense(128, relu) → Dense(3, linear)
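
A minimal Keras sketch of this network (the repository's actual code may differ in details):

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_mlp_dqn(n_enemies: int, n_actions: int = 3) -> tf.keras.Model:
    """Dense(128, relu) -> Dense(128, relu) -> Dense(3, linear)."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(2 * n_enemies,)),        # (Δx, Δy) per enemy
        layers.Dense(128, activation="relu"),
        layers.Dense(128, activation="relu"),
        layers.Dense(n_actions, activation="linear"),  # one Q-value per action
    ])
```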

Pixels (CNN) DQN

  • Input: (84, 84, 4) stacked grayscale frames.
  • Network:
    Conv(32, 8×8, s=4, relu) → Conv(64, 4×4, s=2, relu) → Conv(64, 3×3, s=1, relu) → Dense(512, relu) → Dense(3, linear)
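
A corresponding Keras sketch of the CNN (a Flatten layer is implied between the convolutional stack and the dense head):

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_cnn_dqn(n_actions: int = 3) -> tf.keras.Model:
    """Conv stack over 4 stacked 84×84 grayscale frames, then a dense head."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(84, 84, 4)),
        layers.Conv2D(32, 8, strides=4, activation="relu"),
        layers.Conv2D(64, 4, strides=2, activation="relu"),
        layers.Conv2D(64, 3, strides=1, activation="relu"),
        layers.Flatten(),
        layers.Dense(512, activation="relu"),
        layers.Dense(n_actions, activation="linear"),
    ])
```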

⚙️ Training Setup

  • Algorithm: DQN with a target network
  • Loss: Huber
  • Optimizer: Adam (lr=1e-3 for MLP, lr=2.5e-4 for CNN)
  • Target updates: soft update with τ=0.005
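
The loss, optimizer, and soft target update could be wired up as follows (a sketch; `soft_update` is an illustrative helper, not a function from the repository):

```python
import tensorflow as tf

huber = tf.keras.losses.Huber()
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)  # 2.5e-4 for the CNN variant

def soft_update(target_model, online_model, tau=0.005):
    """Polyak-average the online weights into the target network."""
    blended = [
        tau * w + (1.0 - tau) * tw
        for w, tw in zip(online_model.get_weights(), target_model.get_weights())
    ]
    target_model.set_weights(blended)
```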

Replay

  • Buffer size: 50k (MLP) / 100k (CNN)
  • Warm-up (train_start): 2000 (MLP) / 5000 (CNN)
  • Updates per env step: 2
  • Batch size: 64–128 (typical)
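
A uniform replay buffer matching these settings might look like the sketch below (class and method names are illustrative):

```python
import random
from collections import deque

import numpy as np

class ReplayBuffer:
    def __init__(self, capacity=50_000):   # 100k for the CNN variant
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=64):
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = map(np.array, zip(*batch))
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)
```

Training would begin only once the buffer holds at least `train_start` transitions (2000 for the MLP, 5000 for the CNN), after which two gradient updates are performed per environment step.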

Exploration

  • Linear epsilon decay per episode: 1.0 → 0.05 over ~750 episodes.
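
Per this schedule, the epsilon value for a given episode could be computed as (a sketch):

```python
def epsilon_for_episode(episode, eps_start=1.0, eps_end=0.05, decay_episodes=750):
    """Linear decay from 1.0 to 0.05 over ~750 episodes, then held at 0.05."""
    frac = min(episode / decay_episodes, 1.0)
    return eps_start + frac * (eps_end - eps_start)
```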

Rewards (scaled small for stability)

  • Eat: step −0.01, catch +1.0, miss −1.0
  • Avoid: survival +0.001 per step, near-miss up to −0.25, collision −5.0
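
As a hedged illustration, the two reward functions might be shaped like this; in particular, `near_miss_factor` is a hypothetical closeness measure in [0, 1], and the project's actual shaping may differ:

```python
def eat_reward(caught: bool, missed: bool) -> float:
    """Eat scenario: small per-step penalty, ±1.0 on catch/miss."""
    reward = -0.01
    if caught:
        reward += 1.0
    if missed:
        reward -= 1.0
    return reward

def avoid_reward(collided: bool, near_miss_factor: float) -> float:
    """Avoid scenario: survival bonus, proximity penalty, large collision penalty."""
    if collided:
        return -5.0
    return 0.001 - 0.25 * near_miss_factor  # near_miss_factor is hypothetical
```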

Environment

  • Pygame; the player moves along the bottom edge while multiple enemies fall from above.
  • Dependencies
    • Python 3.8
    • TensorFlow 2.x (e.g., 2.9)
    • NumPy
    • scikit-image (for pixel preprocessing)
    • Pygame
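
The pixel observations could be produced with scikit-image along these lines (a sketch; the repository may resize or stack frames differently):

```python
from collections import deque

import numpy as np
from skimage.color import rgb2gray
from skimage.transform import resize

def preprocess_frame(frame):
    """Convert an RGB frame (e.g., from pygame.surfarray) to 84×84 grayscale."""
    gray = rgb2gray(frame)                                  # → H×W floats in [0, 1]
    return resize(gray, (84, 84), anti_aliasing=True).astype(np.float32)

class FrameStack:
    """Keep the last 4 preprocessed frames as an (84, 84, 4) observation."""
    def __init__(self, k=4):
        self.frames = deque(maxlen=k)

    def reset(self, frame):
        f = preprocess_frame(frame)
        for _ in range(self.frames.maxlen):
            self.frames.append(f)           # fill the stack with the first frame
        return np.stack(self.frames, axis=-1)

    def step(self, frame):
        self.frames.append(preprocess_frame(frame))
        return np.stack(self.frames, axis=-1)
```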