Car-Racing-Agent / game /README.md
nirmalpratheep's picture
Upload 7 files
de9fc8c verified

A newer version of the Gradio SDK is available: 6.14.0

Upgrade

Game

Pygame-based curriculum car racer used as the simulation backend for RL training.

Running

# From project root
uv run python main.py          # start at track 1
uv run python main.py 5        # start at track 5

# As a module
uv run python -m game.curriculum_game 5

Controls

Key Action
Arrow keys Drive (up=throttle, down=brake, left/right=steer)
N / P Next / previous track
1 – 9 Jump to track number
R Restart (counts as an attempt)
ESC Quit

Tracks

16 tracks across 4 difficulty tiers. Each level narrows the road, tightens turns, or increases speed cap.

Tier Levels Shape Description
A β€” Easy 1–4 Full ellipses Wide to narrow ovals, superspeedway
B β€” Medium-Easy 5–8 Rounded rectangles Stadium oval, tight rectangle
C β€” Medium-Hard 9–12 Two-arc layouts Hairpin, chicane, double-hairpin, asymmetric
D β€” Hard 13–16 Polygon circuits L-shape, T-notch, complex circuit, master challenge

Game Rules

  • Complete one full lap without touching the white fence border.
  • Touching the fence or pressing R = restart from start, attempt counter +1.
  • Cross the start/finish line cleanly to finish. A summary screen shows stats.

HUD

A single top bar shows: track name Β· speed Β· attempt count Β· lap time Β· total time Β· distance Β· max speed. Timer starts on first key press, not on load.

File Structure

game/
  oval_racer.py        Original single-oval game. Exports draw_headlights, draw_car,
                       SCREEN_W, SCREEN_H used by curriculum_game and env/.
  tracks.py            16 TrackDef objects. Each knows its waypoints, road width,
                       start position/angle, speed cap, and on-track mask.
  curriculum_game.py   Main playable game. RaceState drives the lap logic,
                       reset-on-crash, finish detection, and HUD rendering.
  rl_splits.py         CarEnv (gym-style wrapper), CurriculumSampler, Evaluator,
                       and TRAIN / VAL / TEST splits for RL training.
  test_tracks.py       Headless test: builds all 16 tracks and simulates 150 steps
                       each. Run with: uv run python -m game.test_tracks

Physics Constants

Constant Value Effect
ACCEL 0.13 Throttle acceleration per frame
BRAKE_DECEL 0.22 Braking deceleration per frame
FRICTION 0.038 Passive speed decay per frame
STEER_DEG 2.7 Degrees rotated per steer step
max_speed 3.0–4.5 Per-track speed cap (px/frame)

Speed is in px/frame. Multiply by FPS (60) to get px/s shown in HUD.

Finish Line Detection

Two-phase gate crossing to handle fast cars reliably:

  1. Arm β€” wait until gate_side > 50 px ahead (car is clearly past the gate going forward).
  2. Trigger β€” detect prev_side < 0 and curr_side >= 0 with speed > 0.3.

This prevents the car from triggering on spawn or when reversing back over the line.

Track Metadata (used by reward)

Each TrackDef computes three values at construction time:

Field Formula Purpose
optimal_dist Waypoint polygon perimeter (px) Theoretical shortest lap path
par_time_steps optimal_dist / (max_speed Γ— 0.7) Expected lap frames at 70% speed
complexity (115 / width) Γ— (max_speed / 3.0) Difficulty multiplier (1.0 β†’ 3.45)

RL Interface (rl_splits.py)

CarEnv exposes a gym-style API:

from game.rl_splits import make_env, TRAIN

env = make_env(TRAIN[0])
obs = env.reset()        # [x/W, y/H, sin, cos, speed/max, on_track, gate_side]
obs, reward, done, info = env.step([accel, steer])
# info keys: lap, on_track, step, crashes, lap_dist, out_of_bounds

Reward Function

Rewards are not scaled by complexity β€” all values are fixed and comparable across every track. Complexity only scales the curriculum threshold.

Term Trigger Value Purpose
Forward pulse Every step +speed/max_speed Γ— 0.01 Prevent stalling
Off-track Every step off road βˆ’0.5 Stay on road
Crash event onβ†’off transition βˆ’5.0 Penalise each boundary hit
Lap completion Gate crossed cleanly +50 Γ— time_ratio Γ— dist_ratio Fast + efficient path
Out of bounds Terminal βˆ’100 Don't leave screen

Lap completion breakdown:

time_ratio = clamp(par_time_steps / actual_lap_steps,  0.5, 2.0)
dist_ratio = clamp(optimal_dist   / actual_lap_dist,   0.5, 1.0)
  • dist_ratio capped at 1.0 β€” no bonus for paths shorter than the centreline (any such path involves off-track corner cutting). lap_dist is only accumulated while on_track=True, closing the corner-cutting exploit.
  • Best lap: 50 Γ— 2.0 Γ— 1.0 = 100
  • Worst completed lap: 50 Γ— 0.5 Γ— 0.5 = 12.5

Curriculum threshold scales with complexity, rewards do not:

effective_threshold = base_threshold Γ— track.complexity
Track C Effective threshold (base=30)
1 β€” Wide Oval 1.00 30
8 β€” Small Oval 2.03 61
14 β€” T-Notch 2.66 80
16 β€” Master Challenge 3.45 104