Spaces: nirmalpratheep/Car-Racing-Agent

nirmalpratheep committed on Apr 22

Commit de9fc8c (verified)

1 Parent(s): 41a9651

Upload 7 files

Files changed (7)

game/README.md +144 -0
game/__init__.py +1 -0
game/curriculum_game.py +405 -0
game/oval_racer.py +246 -0
game/rl_splits.py +625 -0
game/test_tracks.py +83 -0
game/tracks.py +397 -0

game/README.md ADDED

	@@ -0,0 +1,144 @@

+# Game
+Pygame-based curriculum car racer used as the simulation backend for RL training.
+## Running
+```bash
+# From project root
+uv run python main.py          # start at track 1
+uv run python main.py 5        # start at track 5
+# As a module
+uv run python -m game.curriculum_game 5
+```
+## Controls
+| Key | Action |
+|-----|--------|
+| Arrow keys | Drive (up=throttle, down=brake, left/right=steer) |
+| N / P | Next / previous track |
+| 1 – 9 | Jump to track number |
+| R | Restart (counts as an attempt) |
+| V | Toggle the raycast overlay |
+| ESC | Quit |
+## Tracks
+16 tracks across 4 difficulty tiers. Each level narrows the road, tightens turns, or increases speed cap.
+| Tier | Levels | Shape | Description |
+|------|--------|-------|-------------|
+| A — Easy | 1–4 | Full ellipses | Wide to narrow ovals, superspeedway |
+| B — Medium-Easy | 5–8 | Rounded rectangles | Stadium oval, tight rectangle |
+| C — Medium-Hard | 9–12 | Two-arc layouts | Hairpin, chicane, double-hairpin, asymmetric |
+| D — Hard | 13–16 | Polygon circuits | L-shape, T-notch, complex circuit, master challenge |
+## Game Rules
+- Complete **one full lap** without touching the white fence border.
+- Touching the fence **or** pressing R = restart from start, attempt counter +1.
+- Cross the start/finish line cleanly to finish. A summary screen shows stats.
+## HUD
+A single top bar shows: track name · speed · attempt count · lap time · total time · distance · max speed. Timer starts on first key press, not on load.
+## File Structure
+```
+game/
+  oval_racer.py        Original single-oval game. Exports draw_headlights, draw_car,
+                       SCREEN_W, SCREEN_H used by curriculum_game and env/.
+  tracks.py            16 TrackDef objects. Each knows its waypoints, road width,
+                       start position/angle, speed cap, and on-track mask.
+  curriculum_game.py   Main playable game. RaceState drives the lap logic,
+                       reset-on-crash, finish detection, and HUD rendering.
+  rl_splits.py         CarEnv (gym-style wrapper), CurriculumSampler, Evaluator,
+                       and TRAIN / VAL / TEST splits for RL training.
+  test_tracks.py       Headless test: builds all 16 tracks and simulates 150 steps
+                       each. Run with: uv run python -m game.test_tracks
+```
+## Physics Constants
+| Constant | Value | Effect |
+|----------|-------|--------|
+| `ACCEL` | 0.13 | Throttle acceleration per frame |
+| `BRAKE_DECEL` | 0.22 | Braking deceleration per frame |
+| `FRICTION` | 0.038 | Passive speed decay per frame |
+| `STEER_DEG` | 2.7 | Degrees rotated per steer step |
+| `max_speed` | 3.0–4.5 | Per-track speed cap (px/frame) |
+Speed is in px/frame. Multiply by FPS (60) to get px/s shown in HUD.
+## Finish Line Detection
+Two-phase gate crossing to handle fast cars reliably:
+1. **Arm** — wait until `gate_side > 50 px` ahead (car is clearly past the gate going forward).
+2. **Trigger** — detect `prev_side < 0` and `curr_side >= 0` with `speed > 0.3`.
+This prevents the car from triggering on spawn or when reversing back over the line.
+## Track Metadata (used by reward)
+Each `TrackDef` computes three values at construction time:
+| Field | Formula | Purpose |
+|-------|---------|---------|
+| `optimal_dist` | Waypoint polygon perimeter (px) | Theoretical shortest lap path |
+| `par_time_steps` | `optimal_dist / (max_speed × 0.7)` | Expected lap frames at 70% speed |
+| `complexity` | `(115 / width) × (max_speed / 3.0)` | Difficulty multiplier (1.0 → 3.45) |
+## RL Interface (`rl_splits.py`)
+`CarEnv` exposes a gym-style API:
+```python
+from game.rl_splits import make_env, TRAIN
+env = make_env(TRAIN[0])
+obs = env.reset()        # 7 floats: [angular_velocity, speed/max_speed, ray×5]
+obs, reward, done, info = env.step([accel, steer])
+# info keys: lap, on_track, step, crashes, lap_dist, out_of_bounds
+```
+### Reward Function
+Rewards are **not** scaled by complexity — all values are fixed and comparable
+across every track. Complexity only scales the curriculum `threshold`.
+| Term | Trigger | Value | Purpose |
+|------|---------|-------|---------|
+| Forward pulse | Every step | `+speed/max_speed × 0.01` | Prevent stalling |
+| Off-track | Every step off road | `−0.5` | Stay on road |
+| Crash event | on→off transition | `−5.0` | Penalise each boundary hit |
+| Lap completion | Gate crossed cleanly | `+50 × time_ratio × dist_ratio` | Fast + efficient path |
+| Out of bounds | Terminal | `−100` | Don't leave screen |
+**Lap completion breakdown:**
+```
+time_ratio = clamp(par_time_steps / actual_lap_steps,  0.5, 2.0)
+dist_ratio = clamp(optimal_dist   / actual_lap_dist,   0.5, 1.0)
+```
+- `dist_ratio` capped at **1.0** — no bonus for paths shorter than the centreline
+  (any such path involves off-track corner cutting). `lap_dist` is only
+  accumulated while `on_track=True`, closing the corner-cutting exploit.
+- Best lap: `50 × 2.0 × 1.0 = 100`
+- Worst completed lap: `50 × 0.5 × 0.5 = 12.5`
+**Curriculum threshold scales with complexity, rewards do not:**
+```
+effective_threshold = base_threshold × track.complexity
+```
+| Track | C | Effective threshold (base=30) |
+|-------|---|-------------------------------|
+| 1 — Wide Oval | 1.00 | 30 |
+| 8 — Small Oval | 2.03 | 61 |
+| 14 — T-Notch | 2.66 | 80 |
+| 16 — Master Challenge | 3.45 | 104 |
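
The lap-completion formula and the threshold scaling described in the README can be checked with a few lines of Python. This is only a sketch: `clamp`, `lap_reward`, and `effective_threshold` are illustrative names rather than functions taken from `rl_splits.py`, and the lap numbers are made up.

```python
def clamp(value, lo, hi):
    """Limit value to the closed interval [lo, hi]."""
    return max(lo, min(value, hi))


def lap_reward(par_time_steps, actual_lap_steps, optimal_dist, actual_lap_dist):
    """Lap bonus as the README states it: 50 x time_ratio x dist_ratio."""
    time_ratio = clamp(par_time_steps / actual_lap_steps, 0.5, 2.0)
    dist_ratio = clamp(optimal_dist / actual_lap_dist, 0.5, 1.0)
    return 50.0 * time_ratio * dist_ratio


def effective_threshold(base_threshold, complexity):
    """Curriculum threshold scaled by track complexity (rewards stay unscaled)."""
    return base_threshold * complexity


# Hypothetical laps (all numbers are assumptions for illustration).
print(lap_reward(1000, 1000, 2000.0, 2500.0))    # par time, path 25% too long
print(lap_reward(1000, 400, 2000.0, 1500.0))     # both ratios hit their upper caps
print(lap_reward(1000, 5000, 2000.0, 10000.0))   # both ratios hit their lower caps
print(round(effective_threshold(30, 2.03)))      # the Small Oval row of the table
```

Because both ratios are clamped, a completed lap always earns between 12.5 and 100, whichever track it was driven on.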

game/__init__.py ADDED

	@@ -0,0 +1 @@


+from .oval_racer import SCREEN_W, SCREEN_H, draw_headlights, draw_car

game/curriculum_game.py ADDED

	@@ -0,0 +1,405 @@

+"""
+Curriculum Car Racer — One-lap challenge.
+Rules:
+  * Complete one full lap without touching the fence (white border).
+  * Touching the fence OR pressing R = OUT -> restart from start, attempt +1.
+  * Trace path and distance covered reset on every restart.
+  * Game ends when the finish line (= start line) is crossed cleanly.
+Controls:
+  Arrow keys  drive
+  N / P       next / prev track
+  1-9         jump to track 1-9
+  R           manual restart (counts as an attempt)
+  ESC         quit
+"""
+import math
+import pygame
+from .oval_racer import SCREEN_W, SCREEN_H, draw_headlights, draw_car
+from .tracks import TRACKS
+FPS = 60
+ACCEL       = 0.13
+BRAKE_DECEL = 0.22
+FRICTION    = 0.038
+STEER_DEG   = 2.7
+C_YELLOW = (255, 215,   0)
+C_HUD    = (230, 230, 230)
+C_GREEN  = ( 50, 220,  80)
+C_BLUE   = ( 60, 140, 255)
+RACING = "racing"
+DONE   = "done"
+PATH_SAMPLE_EVERY = 2   # record a path point every N frames
+# ── Car ──────────────────────────────────────────────────────────────────────
+class Car:
+    def __init__(self, track):
+        self.track = track
+        self.reset()
+    def reset(self):
+        self.x     = float(self.track.start_pos[0])
+        self.y     = float(self.track.start_pos[1])
+        self.angle = float(self.track.start_angle)
+        self.speed = 0.0
+    def update(self, accel, steer):
+        ms    = self.track.max_speed
+        ratio = min(abs(self.speed) / ms, 1.0) if ms > 0 else 0.0
+        self.angle += steer * STEER_DEG * max(0.3, ratio)
+        if accel > 0:
+            self.speed = min(self.speed + ACCEL, ms)
+        elif accel < 0:
+            self.speed = max(self.speed - BRAKE_DECEL, -ms * 0.4)
+        if self.speed > 0:
+            self.speed = max(0.0, self.speed - FRICTION)
+        elif self.speed < 0:
+            self.speed = min(0.0, self.speed + FRICTION)
+        rad = math.radians(self.angle)
+        self.x += self.speed * math.cos(rad)
+        self.y += self.speed * math.sin(rad)
+# ── Drawing ───────────────────────────────────────────────────────────────────
+_RAY_ANGLES = [-90, -45, 0, 45, 90]
+_RAY_MAX    = 120
+_RAY_STEP   = 2
+# Colour gradient: red (close) → yellow → green (far)
+def _ray_colour(ratio):
+    r = int(255 * (1 - ratio))
+    g = int(255 * ratio)
+    return (r, g, 0)
+def draw_raycasts(surf, track, car):
+    """
+    Draw the 5 RL observation rays from the car.
+    Each ray is coloured green (far) → red (close) to show clearance.
+    Press V in game to toggle.
+    """
+    overlay = pygame.Surface((SCREEN_W, SCREEN_H), pygame.SRCALPHA)
+    for rel_deg in _RAY_ANGLES:
+        abs_rad = math.radians(car.angle + rel_deg)
+        dx = math.cos(abs_rad) * _RAY_STEP
+        dy = math.sin(abs_rad) * _RAY_STEP
+        px, py = car.x, car.y
+        dist = 0.0
+        while dist < _RAY_MAX:
+            px += dx
+            py += dy
+            dist += _RAY_STEP
+            if not track.on_track(px, py):
+                break
+        ratio  = dist / _RAY_MAX
+        colour = _ray_colour(ratio) + (180,)
+        end_x  = car.x + math.cos(abs_rad) * dist
+        end_y  = car.y + math.sin(abs_rad) * dist
+        pygame.draw.line(overlay, colour, (int(car.x), int(car.y)),
+                         (int(end_x), int(end_y)), 1)   # 1px line — subtle
+        pygame.draw.circle(overlay, colour, (int(end_x), int(end_y)), 2)  # 2px dot
+    surf.blit(overlay, (0, 0))
+def _draw_path(surf, pts, color, width=2):
+    if len(pts) >= 2:
+        ipts = [(int(x), int(y)) for x, y in pts]
+        pygame.draw.lines(surf, color, False, ipts, width)
+def draw_hud(surf, track, car, race, fonts):
+    _, small = fonts
+    lt = race.lap_elapsed()
+    text = (
+        f"Lv{track.level}: {track.name}"
+        f"   Spd {abs(car.speed)*FPS:4.1f}"
+        f"   Attempt {race.attempts}"
+        f"   Lap {lt:.2f}s"
+        f"   Total {race.total_elapsed():.2f}s"
+        f"   Dist {race.current_distance:.0f}px"
+        f"   Max {race._max_spd:.1f}"
+        f"   |  Arrows=drive  N/P=track  1-9=jump  R=restart  V=rays  ESC=quit"
+    )
+    rendered = small.render(text, True, C_HUD)
+    bar_h = rendered.get_height() + 4
+    bar = pygame.Surface((SCREEN_W, bar_h), pygame.SRCALPHA)
+    bar.fill((0, 0, 0, 200))
+    surf.blit(bar, (0, 0))
+    surf.blit(rendered, (6, 2))
+def draw_summary(surf, race, fonts):
+    """Summary overlay, redrawn every frame once the finish line has been crossed."""
+    font, small = fonts
+    big = pygame.font.SysFont("consolas", 38, bold=True)
+    med = pygame.font.SysFont("consolas", 22, bold=True)
+    overlay = pygame.Surface((SCREEN_W, SCREEN_H), pygame.SRCALPHA)
+    overlay.fill((0, 0, 0, 170))
+    surf.blit(overlay, (0, 0))
+    cx = SCREEN_W // 2
+    def centre(text, color, fnt, y):
+        s = fnt.render(text, True, color)
+        surf.blit(s, (cx - s.get_width() // 2, y))
+    centre("FINISH!", C_GREEN, big, 130)
+    pygame.draw.line(surf, (100, 100, 100), (cx - 220, 183), (cx + 220, 183), 1)
+    rows = [
+        ("Lap time",    f"{race.lap_time:.2f} s"),
+        ("Total time",  f"{race.total_time:.2f} s"),
+        ("Distance",    f"{race.lap_dist:.0f} px"),
+        ("Max speed",   f"{race.lap_max_spd:.1f} px/s"),
+        ("Avg speed",   f"{race.lap_avg_spd:.1f} px/s"),
+        ("Attempts",    str(race.attempts)),
+    ]
+    label_x = cx - 130
+    value_x = cx + 140
+    for i, (label, value) in enumerate(rows):
+        y = 196 + i * 28
+        surf.blit(med.render(label, True, (170, 170, 170)), (label_x, y))
+        surf.blit(med.render(value, True, C_HUD),
+                  (value_x - med.size(value)[0], y))
+    pygame.draw.line(surf, (100, 100, 100), (cx - 220, 368), (cx + 220, 368), 1)
+    verdict = "Perfect run - no restarts!" if race.attempts == 1 else \
+              f"Finished in {race.attempts} attempts"
+    centre(verdict, C_YELLOW, font, 378)
+    centre("R = retry   N/P = change track   ESC = quit",
+           (150, 150, 150), small, 418)
+# ── Race state ────────────────────────────────────────────────────────────────
+class RaceState:
+    def __init__(self, track):
+        self.track       = track
+        self.car         = Car(track)
+        self.state       = RACING
+        self.attempts    = 1
+        self.show_rays   = False   # toggled with V
+        self._lap_timer_started   = False  # starts on first key press per attempt
+        self._total_timer_started = False  # starts on first key press ever
+        self.total_start = None
+        self.lap_start   = None
+        self.lap_time    = 0.0   # locked on finish
+        self.lap_dist    = 0.0   # locked on finish
+        self.total_time  = 0.0   # locked on finish
+        self.lap_max_spd = 0.0   # locked on finish (px/s)
+        self.lap_avg_spd = 0.0   # locked on finish (px/s)
+        self.prev_side   = track.gate_side(self.car.x, self.car.y)
+        self._lap_armed  = False  # True once car is clearly past the gate
+        # Path trace + speed — cleared on every reset
+        self.current_path     = []
+        self.current_distance = 0.0   # px covered this attempt
+        self._max_spd         = 0.0   # peak speed this attempt (px/s)
+        self._spd_sum         = 0.0   # running sum for this attempt's mean speed
+        self._spd_count       = 0
+        self._frame           = 0
+        self._prev_x          = self.car.x
+        self._prev_y          = self.car.y
+    # ── helpers ──────────────────────────────────────────────────────────────
+    def lap_elapsed(self):
+        if not self._lap_timer_started:
+            return 0.0
+        return (pygame.time.get_ticks() - self.lap_start) / 1000.0
+    def total_elapsed(self):
+        if self.state == DONE:
+            return self.total_time
+        if not self._total_timer_started:
+            return 0.0
+        return (pygame.time.get_ticks() - self.total_start) / 1000.0
+    def _record(self):
+        """Accumulate distance + speed stats every frame; record path every N frames."""
+        dx = self.car.x - self._prev_x
+        dy = self.car.y - self._prev_y
+        self.current_distance += math.hypot(dx, dy)
+        self._prev_x, self._prev_y = self.car.x, self.car.y
+        pps = abs(self.car.speed) * FPS
+        if pps > self._max_spd:
+            self._max_spd = pps
+        self._spd_sum   += pps
+        self._spd_count += 1
+        self._frame += 1
+        if self._frame % PATH_SAMPLE_EVERY == 0:
+            self.current_path.append((self.car.x, self.car.y))
+    def _reset_attempt(self):
+        """Clear trace, distance, speed stats, and reset car to start."""
+        self.current_path     = []
+        self.current_distance = 0.0
+        self._max_spd         = 0.0
+        self._spd_sum         = 0.0
+        self._spd_count       = 0
+        self._frame           = 0
+        self.car.reset()
+        self._prev_x  = self.car.x
+        self._prev_y  = self.car.y
+        self.attempts   += 1
+        self.lap_start          = None
+        self._lap_timer_started = False
+        self.prev_side   = self.track.gate_side(self.car.x, self.car.y)
+        self._lap_armed  = False
+    def manual_reset(self):
+        self._reset_attempt()
+    # ── main step ─────────────────────────────────────────────────────────────
+    def step(self, accel, steer):
+        if self.state == DONE:
+            return
+        if (accel != 0 or steer != 0) and not self._lap_timer_started:
+            now = pygame.time.get_ticks()
+            self.lap_start = now
+            self._lap_timer_started = True
+            if not self._total_timer_started:
+                self.total_start = now
+                self._total_timer_started = True
+        self.car.update(accel, steer)
+        self._record()
+        # Fence hit → OUT, clear trace
+        if not self.track.on_track(self.car.x, self.car.y):
+            self._reset_attempt()
+            return
+        # Finish line detection (same line as start).
+        # Phase 1 — arm: wait until car is 50 px ahead of gate going forward.
+        # Phase 2 — trigger: detect when gate_side crosses from negative → positive.
+        # This avoids the < -5 threshold bug where fast cars skip the window.
+        curr_side = self.track.gate_side(self.car.x, self.car.y)
+        if not self._lap_armed and curr_side > 50:
+            self._lap_armed = True
+        if self._lap_armed and self.prev_side < 0 and curr_side >= 0 and self.car.speed > 0.3:
+            self.lap_time    = self.lap_elapsed()
+            self.lap_dist    = self.current_distance
+            self.total_time  = self.total_elapsed()
+            self.lap_max_spd = self._max_spd
+            self.lap_avg_spd = (self._spd_sum / self._spd_count
+                                if self._spd_count else 0.0)
+            self.state       = DONE
+        self.prev_side = curr_side
+    # ── draw ──────────────────────────────────────────────────────────────────
+    def draw(self, surf, fonts):
+        surf.blit(self.track.surface, (0, 0))
+        # Current attempt path in blue (cleared after every reset)
+        _draw_path(surf, self.current_path, C_BLUE, width=2)
+        draw_headlights(surf, self.car.x, self.car.y, self.car.angle)
+        if self.show_rays:
+            draw_raycasts(surf, self.track, self.car)
+        draw_car(surf, self.car.x, self.car.y, self.car.angle)
+        if self.state == RACING:
+            draw_hud(surf, self.track, self.car, self, fonts)
+        else:
+            draw_summary(surf, self, fonts)
+# ── Main loop ─────────────────────────────────────────────────────────────────
+def run(start_track=1):
+    pygame.init()
+    screen = pygame.display.set_mode((SCREEN_W, SCREEN_H))
+    clock  = pygame.time.Clock()
+    fonts  = (pygame.font.SysFont("consolas", 20, bold=True),
+              pygame.font.SysFont("consolas", 14))
+    track_idx = max(0, min(start_track - 1, len(TRACKS) - 1))
+    def new_race(idx):
+        t = TRACKS[idx]
+        t.build()
+        pygame.display.set_caption(f"Curriculum Racer  Lv{t.level}: {t.name}")
+        return RaceState(t)
+    race = new_race(track_idx)
+    running = True
+    while running:
+        clock.tick(FPS)
+        for event in pygame.event.get():
+            if event.type == pygame.QUIT:
+                running = False
+            if event.type == pygame.KEYDOWN:
+                if event.key == pygame.K_ESCAPE:
+                    running = False
+                elif event.key == pygame.K_r:
+                    # After finish: full retry (attempt counter resets to 1)
+                    # While racing: counts as an attempt
+                    if race.state == DONE:
+                        race = new_race(track_idx)
+                    else:
+                        race.manual_reset()
+                elif event.key == pygame.K_v:
+                    race.show_rays = not race.show_rays
+                elif event.key == pygame.K_n:
+                    track_idx = (track_idx + 1) % len(TRACKS)
+                    race = new_race(track_idx)
+                elif event.key == pygame.K_p:
+                    track_idx = (track_idx - 1) % len(TRACKS)
+                    race = new_race(track_idx)
+                else:
+                    for ki, key in enumerate([
+                        pygame.K_1, pygame.K_2, pygame.K_3, pygame.K_4,
+                        pygame.K_5, pygame.K_6, pygame.K_7, pygame.K_8, pygame.K_9
+                    ]):
+                        if event.key == key and ki < len(TRACKS):
+                            track_idx = ki
+                            race = new_race(track_idx)
+                            break
+        if race.state == RACING:
+            keys  = pygame.key.get_pressed()
+            accel = (1 if keys[pygame.K_UP]    else 0) - (1 if keys[pygame.K_DOWN]  else 0)
+            steer = (1 if keys[pygame.K_RIGHT]  else 0) - (1 if keys[pygame.K_LEFT]  else 0)
+            race.step(accel, steer)
+        race.draw(screen, fonts)
+        pygame.display.flip()
+    pygame.quit()
+if __name__ == "__main__":
+    import sys
+    level = int(sys.argv[1]) if len(sys.argv) > 1 else 1
+    run(start_track=level)
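
One consequence of `Car.update` is easy to miss: friction is subtracted after the throttle result is capped, so a car held at full throttle settles at `max_speed - FRICTION`, not at `max_speed`. The sketch below reproduces only the speed part of the update to show this. `step_speed` and `settle_speed` are hypothetical helper names, and steering and position are left out.

```python
ACCEL = 0.13
BRAKE_DECEL = 0.22
FRICTION = 0.038


def step_speed(speed, accel, max_speed):
    """One frame of the speed update: throttle or brake, then friction toward zero."""
    if accel > 0:
        speed = min(speed + ACCEL, max_speed)
    elif accel < 0:
        speed = max(speed - BRAKE_DECEL, -max_speed * 0.4)
    if speed > 0:
        speed = max(0.0, speed - FRICTION)
    elif speed < 0:
        speed = min(0.0, speed + FRICTION)
    return speed


def settle_speed(max_speed, accel=1, max_frames=10_000):
    """Hold one input from rest until the speed stops changing.

    Returns (steady_speed, frames_taken); max_frames is a safety bound.
    """
    speed = 0.0
    for frame in range(max_frames):
        nxt = step_speed(speed, accel, max_speed)
        if nxt == speed:
            return speed, frame
        speed = nxt
    return speed, max_frames


top, frames = settle_speed(3.0)
print(f"full throttle settles at {top:.3f} px/frame after {frames} frames")
reverse, _ = settle_speed(3.0, accel=-1)
print(f"full brake from rest settles at {reverse:.3f} px/frame")
```

The same reasoning applies in reverse: the steady reversing speed sits `FRICTION` short of the `-0.4 * max_speed` limit.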

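The two-phase finish detection in `RaceState.step` can be isolated from pygame and exercised with a plain sequence of gate readings. This is a minimal sketch, assuming `gate_side` returns a signed distance in pixels that is positive ahead of the line. The `LapGate` class and the sample readings are hypothetical and not part of the commit.

```python
class LapGate:
    """Two-phase finish detector: arm once clearly past the gate, then trigger."""

    ARM_DISTANCE = 50.0   # px ahead of the gate before a crossing can count
    MIN_SPEED = 0.3       # px/frame; rejects reversing or near-stationary crossings

    def __init__(self, start_side):
        self.prev_side = start_side
        self.armed = False

    def update(self, curr_side, speed):
        """Feed one frame; return True on a valid finish-line crossing."""
        if not self.armed and curr_side > self.ARM_DISTANCE:
            self.armed = True
        crossed = self.prev_side < 0 <= curr_side
        finished = self.armed and crossed and speed > self.MIN_SPEED
        self.prev_side = curr_side
        return finished


# Made-up readings for one lap: pull away, go round, return from behind the line.
sides = [3.0, 30.0, 60.0, 200.0, -150.0, -40.0, -4.0, 5.0]
gate = LapGate(start_side=0.0)
results = [gate.update(side, speed=3.0) for side in sides]
print(results)

# A car spawned just behind the line crosses it at once, but is not yet armed.
spawn_gate = LapGate(start_side=-1.0)
print(spawn_gate.update(2.0, speed=3.0))
```

Only the final reading of the lap triggers a finish. The spawn crossing is ignored because the gate has not been armed.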
game/oval_racer.py ADDED

	@@ -0,0 +1,246 @@

+"""
+Oval Car Racer
+Controls: Arrow keys to drive, R to reset, ESC to quit.
+"""
+import math
+import pygame
+# ── Screen ──────────────────────────────────────────────────────────────────
+SCREEN_W, SCREEN_H = 900, 600
+FPS = 60
+# ── Oval geometry ────────────────────────────────────────────────────────────
+CX, CY   = SCREEN_W // 2, SCREEN_H // 2
+OUTER_RX, OUTER_RY = 380, 240
+INNER_RX, INNER_RY = 290, 155
+MID_RY   = (OUTER_RY + INNER_RY) // 2   # 197
+START_X  = float(CX)
+START_Y  = float(CY + MID_RY)
+# ── Colours ─────────────────────────────────────────────────────────────────
+C_GRASS  = ( 45, 110,  45)
+C_TRACK  = ( 52,  52,  52)
+C_WHITE  = (255, 255, 255)
+C_YELLOW = (255, 215,   0)
+C_CAR    = (220,  50,  50)
+C_WIND   = (160, 210, 255)
+C_HUD    = (230, 230, 230)
+C_WARN   = (255,  70,  70)
+# ── Car physics ──────────────────────────────────────────────────────────────
+MAX_SPEED   = 4.5
+ACCEL       = 0.13
+BRAKE_DECEL = 0.22
+FRICTION    = 0.038
+STEER_DEG   = 2.7
+# ────────────────────────────────────────────────────────────────────────────
+# Track geometry
+# ────────────────────────────────────────────────────────────────────────────
+def _in_ellipse(x, y, cx, cy, rx, ry):
+    return ((x - cx) / rx) ** 2 + ((y - cy) / ry) ** 2 <= 1.0
+def on_track(x, y):
+    return (_in_ellipse(x, y, CX, CY, OUTER_RX, OUTER_RY) and
+            not _in_ellipse(x, y, CX, CY, INNER_RX, INNER_RY))
+def _rect(cx, cy, rx, ry):
+    return pygame.Rect(cx - rx, cy - ry, rx * 2, ry * 2)
+# ────────────────────────────────────────────────────────────────────────────
+# Drawing
+# ────────────────────────────────────────────────────────────────────────────
+def build_track_surface():
+    surf = pygame.Surface((SCREEN_W, SCREEN_H))
+    surf.fill(C_GRASS)
+    # Tarmac
+    pygame.draw.ellipse(surf, C_TRACK, _rect(CX, CY, OUTER_RX, OUTER_RY))
+    pygame.draw.ellipse(surf, C_GRASS, _rect(CX, CY, INNER_RX, INNER_RY))
+    # White borders
+    bw = 3
+    pygame.draw.ellipse(surf, C_WHITE, _rect(CX, CY, OUTER_RX, OUTER_RY), bw)
+    pygame.draw.ellipse(surf, C_WHITE, _rect(CX, CY, INNER_RX, INNER_RY), bw)
+    # Finish line — vertical white line at bottom of track
+    line_y = CY + MID_RY
+    track_w = OUTER_RY - INNER_RY   # road width at the bottom of the oval, where the line is drawn
+    line_x  = CX
+    pygame.draw.line(surf, C_WHITE, (line_x, line_y - track_w // 2), (line_x, line_y + track_w // 2), 3)
+    return surf
+def draw_headlights(surf, x, y, angle_deg):
+    CONE_LEN  = 60           # pixels ahead
+    HALF_ANG  = 30           # half of 60-degree spread
+    STEPS     = 12           # arc smoothness
+    # Build cone polygon: origin + arc points
+    pts = [(x, y)]
+    for i in range(STEPS + 1):
+        a = math.radians(angle_deg - HALF_ANG + (2 * HALF_ANG) * i / STEPS)
+        pts.append((x + math.cos(a) * CONE_LEN,
+                    y + math.sin(a) * CONE_LEN))
+    # Draw on alpha surface so it blends with track
+    cone = pygame.Surface((SCREEN_W, SCREEN_H), pygame.SRCALPHA)
+    pygame.draw.polygon(cone, (255, 255, 180, 160), pts)  # yellow fill — visible to CNN
+    pygame.draw.lines(cone, (255, 255, 200, 220), False, pts[1:], 2)  # bright edge
+    surf.blit(cone, (0, 0))
+def draw_car(surf, x, y, angle_deg):
+    w, h = 26, 12
+    img  = pygame.Surface((w, h), pygame.SRCALPHA)
+    pygame.draw.rect(img, C_CAR,  (0, 0, w, h),         border_radius=3)
+    pygame.draw.rect(img, C_WIND, (w - 9, 2, 7, h - 4), border_radius=2)
+    pygame.draw.rect(img, (255, 200, 0), (w - 3, 3, 3, h - 6))
+    rot = pygame.transform.rotate(img, -angle_deg)
+    surf.blit(rot, rot.get_rect(center=(int(x), int(y))))
+def draw_hud(surf, speed, lap, best, last, off_track, lap_done):
+    font  = pygame.font.SysFont("consolas", 20, bold=True)
+    small = pygame.font.SysFont("consolas", 15)
+    panel = pygame.Surface((240, 85), pygame.SRCALPHA)
+    panel.fill((0, 0, 0, 170))
+    surf.blit(panel, (10, 10))
+    def put(text, color, row):
+        surf.blit(font.render(text, True, color), (18, 16 + row * 28))
+    put(f"Speed : {abs(speed) * 65:5.1f} km/h", C_HUD, 0)
+    put(f"Lap   : {lap}", C_HUD, 1)
+    best_s = f"{best:.2f}s" if best < 1e8 else "--"
+    last_s = f"{last:.2f}s" if last < 1e8 else "--"
+    put(f"Last:{last_s}  Best:{best_s}", C_HUD, 2)
+    if off_track:
+        msg = font.render("! OFF TRACK !", True, C_WARN)
+        surf.blit(msg, (SCREEN_W // 2 - msg.get_width() // 2, 12))
+    if lap_done:
+        msg = font.render("LAP COMPLETE!", True, C_YELLOW)
+        surf.blit(msg, (SCREEN_W // 2 - msg.get_width() // 2, 52))
+    hint = small.render("Arrows = drive    R = reset    ESC = quit", True, (150, 150, 150))
+    surf.blit(hint, (SCREEN_W // 2 - hint.get_width() // 2, SCREEN_H - 22))
+# ────────────────────────────────────────────────────────────────────────────
+# Car
+# ────────────────────────────────────────────────────────────────────────────
+class Car:
+    def __init__(self):
+        self.reset()
+    def reset(self):
+        self.x     = START_X
+        self.y     = START_Y
+        self.angle = 180.0
+        self.speed = 0.0
+    def update(self, accel, steer):
+        speed_ratio = min(abs(self.speed) / MAX_SPEED, 1.0)
+        self.angle += steer * STEER_DEG * max(0.3, speed_ratio)
+        if accel > 0:
+            self.speed = min(self.speed + ACCEL, MAX_SPEED)
+        elif accel < 0:
+            self.speed = max(self.speed - BRAKE_DECEL, -MAX_SPEED * 0.4)
+        if self.speed > 0:
+            self.speed = max(0.0, self.speed - FRICTION)
+        elif self.speed < 0:
+            self.speed = min(0.0, self.speed + FRICTION)
+        rad = math.radians(self.angle)
+        self.x += self.speed * math.cos(rad)
+        self.y += self.speed * math.sin(rad)
+        if not on_track(self.x, self.y):
+            self.speed *= 0.80
+# ────────────────────────────────────────────────────────────────────────────
+# Main
+# ────────────────────────────────────────────────────────────────────────────
+def main():
+    pygame.init()
+    screen = pygame.display.set_mode((SCREEN_W, SCREEN_H))
+    pygame.display.set_caption("Oval Car Racer")
+    clock  = pygame.time.Clock()
+    track_surf = build_track_surface()
+    car        = Car()
+    lap       = 0
+    best_time = float("inf")
+    last_time = float("inf")
+    lap_start = pygame.time.get_ticks()
+    flash     = 0
+    prev_y    = car.y
+    running = True
+    while running:
+        clock.tick(FPS)
+        for event in pygame.event.get():
+            if event.type == pygame.QUIT:
+                running = False
+            if event.type == pygame.KEYDOWN:
+                if event.key == pygame.K_ESCAPE:
+                    running = False
+                if event.key == pygame.K_r:
+                    car.reset()
+                    prev_y    = car.y
+                    lap_start = pygame.time.get_ticks()
+                    flash     = 0
+        keys  = pygame.key.get_pressed()
+        accel = (1 if keys[pygame.K_UP]    else 0) - (1 if keys[pygame.K_DOWN]  else 0)
+        steer = (1 if keys[pygame.K_RIGHT] else 0) - (1 if keys[pygame.K_LEFT] else 0)
+        car.update(accel, steer)
+        # Lap: counted when the car's y crosses START_Y going downward while x is near CX
+        near_x    = abs(car.x - CX) < (OUTER_RX - INNER_RX) // 2 + 10
+        crossed   = prev_y < START_Y <= car.y   # crossed going downward
+        if near_x and crossed and car.speed > 0.5:
+            lap      += 1
+            elapsed   = (pygame.time.get_ticks() - lap_start) / 1000.0
+            last_time = elapsed
+            best_time = min(best_time, elapsed)
+            lap_start = pygame.time.get_ticks()
+            flash     = FPS * 2
+        prev_y = car.y
+        if flash > 0:
+            flash -= 1
+        screen.blit(track_surf, (0, 0))
+        draw_headlights(screen, car.x, car.y, car.angle)
+        draw_car(screen, car.x, car.y, car.angle)
+        draw_hud(screen, car.speed, lap, best_time, last_time,
+                 not on_track(car.x, car.y), flash > 0)
+        pygame.display.flip()
+    pygame.quit()
+if __name__ == "__main__":
+    main()
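
The `on_track` test in this file is a point-in-annulus check: inside the outer ellipse and outside the inner one. The sketch below repeats it without pygame and probes a few points. The constants are copied from the file, while the probe coordinates and their labels are assumptions chosen for illustration.

```python
CX, CY = 450, 300                      # centre of the 900x600 screen
OUTER_RX, OUTER_RY = 380, 240
INNER_RX, INNER_RY = 290, 155


def in_ellipse(x, y, cx, cy, rx, ry):
    """True if (x, y) lies inside or on the axis-aligned ellipse."""
    return ((x - cx) / rx) ** 2 + ((y - cy) / ry) ** 2 <= 1.0


def on_track(x, y):
    """On the road means inside the outer ellipse and outside the inner one."""
    return (in_ellipse(x, y, CX, CY, OUTER_RX, OUTER_RY)
            and not in_ellipse(x, y, CX, CY, INNER_RX, INNER_RY))


probes = {
    "start position": (450, 497),
    "screen centre (infield)": (450, 300),
    "right-hand straight": (785, 300),
    "grass beyond the right edge": (840, 300),
}
for label, (x, y) in probes.items():
    print(f"{label:30s} on_track={on_track(x, y)}")

# The road is not uniformly wide: the radii differ per axis.
print("width at left/right:", OUTER_RX - INNER_RX, "px")
print("width at top/bottom:", OUTER_RY - INNER_RY, "px")
```

Since the horizontal and vertical radii differ, any geometry drawn across the road at the top or bottom of the oval should use the vertical radii.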

game/rl_splits.py ADDED

	@@ -0,0 +1,625 @@

+"""
+rl_splits.py — Curriculum tracks for RL training.
+10 tracks across 3 difficulty groups (all used for training):
+  Group A — Easy ovals         : tracks 1-4
+  Group B — Rectangular shapes : tracks 5-8
+  Group C — Hairpins & chicanes: tracks 9-10
+  TRAIN (10) : [1,2,3,4, 5,6,7,8, 9,10]  — curriculum progression easy→hard
+  VAL   (0)  : []
+  TEST  (0)  : []
+Training stops when the agent passes greedy eval on all 10 tracks simultaneously.
+Usage
+-----
+    from game.rl_splits import TRAIN, make_env, CurriculumSampler
+    sampler = CurriculumSampler(TRAIN)
+    while True:
+        env = make_env(sampler.sample())
+        reward = run_episode(env, agent)
+        sampler.record(reward)
+        if sampler.should_advance():
+            sampler.advance()
+"""
+import os
+import math
+import random
+import statistics
+from collections import deque
+import numpy as np
+# ── Lazy pygame initialisation (avoids import-time display requirement) ──────
+_pygame_ready = False
+def _ensure_pygame():
+    global _pygame_ready
+    if not _pygame_ready:
+        import pygame
+        if not pygame.get_init():
+            pygame.init()
+        _pygame_ready = True
+# ── Track splits ─────────────────────────────────────────────────────────────
+def _get_splits():
+    from .tracks import TRACKS          # TRACKS is 0-indexed, levels are 1-indexed
+    by_level = {t.level: t for t in TRACKS}
+    train_levels = [1, 2, 3, 4,  5, 6, 7, 8,  9, 10]   # all 10, easy→hard
+    val_levels   = []
+    test_levels  = []
+    train = [by_level[lvl] for lvl in train_levels]
+    val   = [by_level[lvl] for lvl in val_levels]
+    test  = [by_level[lvl] for lvl in test_levels]
+    return train, val, test
+TRAIN, VAL, TEST = _get_splits()
+# Convenience: all tracks in curriculum order (for inspection / logging)
+ALL_ORDERED = sorted(TRAIN + VAL + TEST, key=lambda t: t.level)
+# ── Difficulty metadata ───────────────────────────────────────────────────────
+DIFFICULTY = {
+    "A-easy":        {"tracks": [1, 2, 3, 4], "description": "Full ovals"},
+    "B-medium-easy": {"tracks": [5, 6, 7, 8], "description": "Rectangular shapes"},
+    "C-medium-hard": {"tracks": [9, 10],       "description": "Hairpins & chicanes"},
+}
+def difficulty_of(track):
+    """Return the difficulty tier label for a track."""
+    for tier, info in DIFFICULTY.items():
+        if track.level in info["tracks"]:
+            return tier
+    return "unknown"
+# ── Environment factory ───────────────────────────────────────────────────────
+class CarEnv:
+    """
+    Minimal gym-style wrapper around TrackDef + Car physics.
+    Observation  (7 floats):
+        [angular_velocity, speed/max_speed, ray×5]
+        All from real sensors: gyroscope, speedometer, 5 proximity rays, camera image.
+        No map or waypoint information in the observation.
+    Action  (2 floats, each clamped to [-1, 1]):
+        [accel, steer]
+          accel  > 0 → accelerate,  < 0 → brake
+          steer  > 0 → right,        < 0 → left
+    Reward:
+        Per step
+          - 0.1                   base step penalty (efficiency pressure)
+          + (1+wp_cos)/2 * 2.0    dense heading alignment reward every step
+                                  (+2 when aimed straight, +1 when perpendicular, 0 when facing away)
+          + (1+wp_cos)/2 * 20     bonus heading reward when advancing waypoints
+          - 10                    distance penalty when moving backward through
+                                  waypoints (moving away from target)
+        Terminal (episode ends immediately)
+          - 300   off track → done  (high penalty to strongly deter leaving track)
+          - 300   car leaves screen bounds
+          + 200   lap completed (target reached)
+        Complexity (track.complexity) scales the curriculum threshold only.
+    Done conditions:
+        * car leaves screen
+        * max_steps exceeded
+        * laps_target laps completed
+    """
+    # Physics (same as curriculum_game.py)
+    ACCEL       = 0.13
+    BRAKE_DECEL = 0.22
+    FRICTION    = 0.038
+    STEER_DEG   = 2.7
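+    # Dense progress scale: one full lap of forward waypoint advances ≈ +15 total.
+    # Note: step() applies a flat ±0.25 per waypoint and does not read
+    # _progress_per_wp, which is derived from this constant in __init__.
+    PROGRESS_SCALE = 15.0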
+    def __init__(self, track, max_steps=3000, laps_target=3):
+        _ensure_pygame()
+        self.track = track
+        self.max_steps   = max_steps
+        self.laps_target = laps_target
+        track.build()
+        # Pre-compute waypoint arrays (numpy) for fast nearest-wp lookup.
+        # Waypoints are centreline points generated by TrackDef.build().
+        # Used only for the internal progress reward — NOT exposed in observations.
+        wps = track.waypoints
+        self._n_wps = len(wps)
+        self._wp_x = np.array([w[0] for w in wps], dtype=np.float32)
+        self._wp_y = np.array([w[1] for w in wps], dtype=np.float32)
+        self._progress_per_wp = self.PROGRESS_SCALE / self._n_wps
+        self._x = self._y = self._angle = self._speed = 0.0
+        self._prev_side   = 0.0
+        self._gate_armed  = False  # True once car is 50px past start line
+        self._laps        = 0
+        self._step        = 0
+        self._angle_delta = 0.0
+        self._wp_idx      = 0      # nearest centreline waypoint index
+        self._lap_dist    = 0.0
+        self._lap_prev_x  = 0.0
+        self._lap_prev_y  = 0.0
+        self._crash_count  = 0
+    # ── Public API ──────────────────────────────────────────────────────────
+    @property
+    def obs_size(self):
+        # angular_velocity, speed, ray×5, wp_sin, wp_cos (must match len(_obs()))
+        return 9
+    @property
+    def action_size(self):
+        return 2
+    @property
+    def laps(self):
+        return self._laps
+    def reset(self):
+        self._x     = float(self.track.start_pos[0])
+        self._y     = float(self.track.start_pos[1])
+        self._angle = float(self.track.start_angle)
+        self._speed = self.track.max_speed * 0.2
+        self._angle_delta  = 0.0
+        self._prev_side    = self.track.gate_side(self._x, self._y)
+        self._gate_armed   = False
+        self._laps         = 0
+        self._step         = 0
+        self._wp_idx       = self._nearest_wp(self._x, self._y)
+        self._lap_dist     = 0.0
+        self._lap_prev_x   = self._x
+        self._lap_prev_y   = self._y
+        self._crash_count  = 0
+        return self._obs()
+    def step(self, action):
+        accel = float(max(-1.0, min(1.0, action[0])))
+        steer = float(max(-1.0, min(1.0, action[1])))
+        prev_angle = self._angle
+        self._update_physics(accel, steer)
+        self._angle_delta = self._angle - prev_angle
+        self._step += 1
+        on        = self.track.on_track(self._x, self._y)
+        curr_side = self.track.gate_side(self._x, self._y)
+        # Lap distance accumulation
+        dx = self._x - self._lap_prev_x
+        dy = self._y - self._lap_prev_y
+        self._lap_dist   += math.hypot(dx, dy)
+        self._lap_prev_x  = self._x
+        self._lap_prev_y  = self._y
+        # ── Reward ───────────────────────────────────────────────────────────
+        #
+        # Principle: reward what we actually want — going forward along the track.
+        #
+        #   reward = -0.005                  step penalty
+        #   crash  → -15, done               off-track penalty
+        #   forward speed                    speed_norm * 0.10  (up to +0.1/step)
+        #   reversing                        speed_norm * 0.10  (negative, up to -0.04/step)
+        #   waypoint advance (forward)       +0.25 per waypoint crossed
+        #   waypoint regress (backward)      -0.25 per waypoint lost
+        #   lap completed                    +10
+        #
+        # All constants are 1/20 of the original scale to keep value targets
+        # in [-15, +10] range. This prevents value_loss explosion and allows
+        # log_std (policy exploration) to receive meaningful gradients.
+        #
+        reward = -0.005
+        obs_now = self._obs()
+        # Off-track: terminal penalty
+        if not on:
+            self._crash_count += 1
+            return obs_now, -15.0, True, {
+                "lap":           self._laps,
+                "on_track":      False,
+                "step":          self._step,
+                "crashes":       self._crash_count,
+                "lap_dist":      self._lap_dist,
+                "out_of_bounds": False,
+            }
+        # Forward speed reward — primary learning signal.
+        # Positive when moving forward, negative when reversing.
+        # This alone is enough to stop the spinning: spinning gives speed ≈ 0 → reward ≈ 0.
+        speed_norm = self._speed / self.track.max_speed   # [-0.4, 1.0]
+        reward += speed_norm * 0.10
+        # Waypoint progress: flat bonus/penalty per waypoint crossed.
+        # Drives the policy to steer toward the track rather than drive in a
+        # straight line off it — steering toward wp is the only way to advance.
+        new_wp = self._nearest_wp(self._x, self._y)
+        diff = new_wp - self._wp_idx
+        n = self._n_wps
+        if diff > n // 2:
+            diff -= n
+        elif diff < -n // 2:
+            diff += n
+        if diff > 0:
+            reward += 0.25 * diff    # +0.25 per waypoint advanced forward
+        elif diff < 0:
+            reward -= 0.25 * abs(diff)   # -0.25 per waypoint lost going backward
+        self._wp_idx = new_wp
+        # Lap completion — two-phase arm/trigger to reliably detect crossings.
+        # Phase 1 (arm): car must travel 50px past the gate going forward.
+        # Phase 2 (trigger): car crosses back through the gate (prev<0 → curr>=0).
+        # Anti-shortcut gate: must have traveled 80% of optimal lap distance.
+        if not self._gate_armed and curr_side > 50.0:
+            self._gate_armed = True
+        lap_done = (self._gate_armed
+                    and self._prev_side < 0.0 and curr_side >= 0.0
+                    and self._speed > 0.3
+                    and self._lap_dist >= self.track.optimal_dist * 0.8)
+        if lap_done:
+            self._laps       += 1
+            self._gate_armed  = False   # re-arm for next lap
+            reward           += 10.0    # lap bonus
+            self._lap_dist    = 0.0
+            self._lap_prev_x  = self._x
+            self._lap_prev_y  = self._y
+        self._prev_side = curr_side
+        out_of_bounds = not (0 <= self._x < 900 and 0 <= self._y < 600)
+        if out_of_bounds:
+            reward = -15.0
+        done = (out_of_bounds
+                or self._laps >= self.laps_target
+                or self._step >= self.max_steps)
+        return self._obs(), reward, done, {
+            "lap":           self._laps,
+            "on_track":      True,
+            "step":          self._step,
+            "crashes":       self._crash_count,
+            "lap_dist":      self._lap_dist,
+            "out_of_bounds": out_of_bounds,
+        }
+    # ── Internal ─────────────────────────────────────────────────────────────
+    def _nearest_wp(self, x, y):
+        """Return index of the nearest centreline waypoint to (x, y)."""
+        dx = self._wp_x - x
+        dy = self._wp_y - y
+        return int(np.argmin(dx * dx + dy * dy))
+    def _update_physics(self, accel, steer):
+        ms = self.track.max_speed
+        ratio = min(abs(self._speed) / ms, 1.0) if ms > 0 else 1.0
+        self._angle += steer * self.STEER_DEG * max(0.3, ratio)
+        if accel > 0:
+            self._speed = min(self._speed + self.ACCEL * accel, ms)
+        elif accel < 0:
+            self._speed = max(self._speed + self.BRAKE_DECEL * accel,
+                              -ms * 0.4)
+        if self._speed > 0:
+            self._speed = max(0.0, self._speed - self.FRICTION)
+        elif self._speed < 0:
+            self._speed = min(0.0, self._speed + self.FRICTION)
+        if not self.track.on_track(self._x, self._y):
+            self._speed *= 0.80
+        rad = math.radians(self._angle)
+        self._x += self._speed * math.cos(rad)
+        self._y += self._speed * math.sin(rad)
+    # Ray angles relative to heading (degrees). Covers lateral + diagonal + forward.
+    _RAY_ANGLES = [-90, -45, 0, 45, 90]
+    _RAY_MAX    = 120   # max ray length in px (normalise distances to 0..1)
+    _RAY_STEP   = 2     # step size in px
+    def _raycast(self):
+        """
+        Cast 5 rays from the car at fixed angles relative to heading.
+        Returns list of 5 floats in [0, 1]:
+            1.0 = boundary is MAX px away (clear road)
+            0.0 = boundary is right at the car (on the edge / off track)
+        Left/right rays give lateral clearance; diagonal/front give lookahead.
+        """
+        results = []
+        for rel_deg in self._RAY_ANGLES:
+            abs_rad = math.radians(self._angle + rel_deg)
+            dx = math.cos(abs_rad) * self._RAY_STEP
+            dy = math.sin(abs_rad) * self._RAY_STEP
+            px, py = self._x, self._y
+            dist = 0.0
+            while dist < self._RAY_MAX:
+                px += dx
+                py += dy
+                dist += self._RAY_STEP
+                if not self.track.on_track(px, py):
+                    break
+            results.append(dist / self._RAY_MAX)
+        return results
+    def _obs(self):
+        t    = self.track
+        rays = self._raycast()   # 5 floats: left, front-left, front, front-right, right
+        ang_vel = self._angle_delta / self.STEER_DEG   # ≈ [-1, 1]
+        # GPS: direction to the NEXT waypoint relative to the car's current heading.
+        # sin < 0 → waypoint is to the left  (steer left)
+        # sin > 0 → waypoint is to the right (steer right)
+        # cos ≈ 1 → waypoint is straight ahead (keep going)
+        next_idx = (self._wp_idx + 10) % self._n_wps
+        dx = self._wp_x[next_idx] - self._x
+        dy = self._wp_y[next_idx] - self._y
+        world_angle_rad = math.atan2(dy, dx)
+        rel_angle_rad   = world_angle_rad - math.radians(self._angle)
+        wp_sin = math.sin(rel_angle_rad)
+        wp_cos = math.cos(rel_angle_rad)
+        return [
+            ang_vel,
+            self._speed / t.max_speed,
+            *rays,
+            wp_sin,   # GPS direction sin component
+            wp_cos,   # GPS direction cos component
+        ]
+def make_env(track, **kwargs):
+    """Factory: return a fresh CarEnv for the given TrackDef."""
+    return CarEnv(track, **kwargs)
+# ── Curriculum sampler ────────────────────────────────────────────────────────
+class CurriculumSampler:
+    """
+    Manages which train track to sample next.
+    Strategy: performance-gated with anti-forgetting replay.
+      * 70% of episodes → current frontier track
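+      * 30% of episodes → already-mastered tracks, visited round-robin
+      (the 70/30 split is the default replay_frac=0.3)
+    Advance to the next track when each of the last `window` frontier
+    episodes completed at least one lap with zero crashes (see should_advance()).
+    Parameters
+    ----------
+    tracks      : ordered list of TrackDef (easy → hard)
+    threshold   : reward threshold shown by status(), scaled by track.complexity;
+                  should_advance() does not use it
+    window      : rolling window size, and the number of consecutive frontier
+                  episodes that must succeed before advancing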
+    replay_frac : fraction of episodes sampled from mastered tracks
+    """
+    def __init__(self, tracks, threshold=30.0, window=50, replay_frac=0.3):
+        self.tracks       = tracks
+        self.threshold    = threshold
+        self.window       = window
+        self.replay_frac  = replay_frac
+        self._idx            = 0              # current frontier index
+        self._replay_counter = 0              # round-robin index into mastered tracks
+        self._rewards     = deque(maxlen=window)
+        self._crashes     = deque(maxlen=window)   # crashes per episode (all)
+        self._laps        = deque(maxlen=window)   # laps completed per episode (all)
+        self._is_frontier = deque(maxlen=window)   # True when episode was on frontier track
+        # Dedicated frontier-only deques so replay episodes never take up slots.
+        self._frontier_crashes = deque(maxlen=window)
+        self._frontier_laps    = deque(maxlen=window)
+    @property
+    def current_level(self):
+        return self._idx                   # 0-based index into self.tracks
+    @property
+    def current_track(self):
+        return self.tracks[self._idx]
+    @property
+    def mastered(self):
+        return self.tracks[:self._idx]
+    @property
+    def frontier_track(self):
+        return self.tracks[self._idx]
+    def sample(self):
+        """Return the TrackDef to use for the next episode.
+        Replay uses round-robin so every mastered track gets equal coverage,
+        preventing early tracks from being starved as the curriculum grows.
+        """
+        if self._idx > 0 and random.random() < self.replay_frac:
+            track = self.mastered[self._replay_counter % self._idx]
+            self._replay_counter += 1
+            return track
+        return self.frontier_track
+    def record(self, episode_reward, episode_crashes=0, episode_laps=0, is_frontier=True):
+        """Call after each episode with the total reward, crash count, and lap count."""
+        self._rewards.append(episode_reward)
+        self._crashes.append(episode_crashes)
+        self._laps.append(episode_laps)
+        self._is_frontier.append(is_frontier)
+        if is_frontier:
+            self._frontier_crashes.append(episode_crashes)
+            self._frontier_laps.append(episode_laps)
+    def should_advance(self):
+        """
+        True when every episode in the frontier window (last `window` frontier
+        episodes) completed a lap with zero crashes.  Replay episodes are not
+        stored in the frontier deques, so they never displace frontier entries.
+        """
+        if self._idx >= len(self.tracks) - 1:
+            return False
+        if len(self._frontier_crashes) < self.window:
+            return False
+        return all(l >= 1 and c == 0
+                   for l, c in zip(self._frontier_laps, self._frontier_crashes))
+    def advance(self):
+        """Move to the next track. Clears all rolling buffers."""
+        if self._idx < len(self.tracks) - 1:
+            self._idx += 1
+            self._rewards.clear()
+            self._crashes.clear()
+            self._laps.clear()
+            self._is_frontier.clear()
+            self._frontier_crashes.clear()
+            self._frontier_laps.clear()
+            return True
+        return False
+    @property
+    def rolling_crashes(self):
+        """Mean crashes per episode over the current window."""
+        return statistics.mean(self._crashes) if self._crashes else float("nan")
+    @property
+    def rolling_laps(self):
+        """Mean laps per episode over the current window."""
+        return statistics.mean(self._laps) if self._laps else float("nan")
+    def status(self):
+        mean     = statistics.mean(self._rewards) if self._rewards else float("nan")
+        crashes  = statistics.mean(self._crashes) if self._crashes else float("nan")
+        t        = self.frontier_track
+        effective = self.threshold * t.complexity
+        crash_free = all(c == 0 for c in self._crashes) if self._crashes else False
+        return (f"Frontier: track {t.level} '{t.name}'  "
+                f"[{self._idx+1}/{len(self.tracks)}]  "
+                f"rolling_mean={mean:.2f}  threshold={effective:.2f}  "
+                f"crashes/ep={crashes:.2f}  crash_free={crash_free}")
+# ── Evaluator ─────────────────────────────────────────────────────────────────
+class Evaluator:
+    """
+    Runs a fixed number of greedy episodes on a list of tracks
+    and returns per-track and aggregate metrics.
+    agent_fn : callable(obs) → action   (e.g. your policy's greedy forward pass)
+    """
+    def __init__(self, n_episodes=20, max_steps=3000, laps_target=3):
+        self.n_episodes  = n_episodes
+        self.max_steps   = max_steps
+        self.laps_target = laps_target
+    def run(self, agent_fn, tracks):
+        """
+        Returns dict:
+            {
+              "per_track": [ { "level", "name", "tier", "mean_reward",
+                               "std_reward", "mean_laps",
+                               "completion_rate" }, ... ],
+              "mean_reward":      float,
+              "mean_laps":        float,
+              "completion_rate":  float,   # fraction of episodes with ≥1 lap
+            }
+        """
+        per_track = []
+        all_rewards, all_laps, all_complete = [], [], []
+        for track in tracks:
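+            ep_rewards, ep_laps = [], []
+            # One env per track: reset() restores all episode state, so the
+            # track is built once here instead of once per episode.
+            env = make_env(track, max_steps=self.max_steps,
+                           laps_target=self.laps_target)
+            for _ in range(self.n_episodes):
+                obs  = env.reset()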
+                done = False
+                total_r = 0.0
+                while not done:
+                    action = agent_fn(obs)
+                    obs, r, done, _ = env.step(action)
+                    total_r += r
+                ep_rewards.append(total_r)
+                ep_laps.append(env.laps)
+            completion = sum(1 for l in ep_laps if l >= 1) / self.n_episodes
+            per_track.append({
+                "level":           track.level,
+                "name":            track.name,
+                "tier":            difficulty_of(track),
+                "mean_reward":     statistics.mean(ep_rewards),
+                "std_reward":      statistics.stdev(ep_rewards) if len(ep_rewards) > 1 else 0.0,
+                "mean_laps":       statistics.mean(ep_laps),
+                "completion_rate": completion,
+            })
+            all_rewards.extend(ep_rewards)
+            all_laps.extend(ep_laps)
+            all_complete.extend([l >= 1 for l in ep_laps])
+        return {
+            "per_track":       per_track,
+            "mean_reward":     statistics.mean(all_rewards),
+            "mean_laps":       statistics.mean(all_laps),
+            "completion_rate": sum(all_complete) / len(all_complete),
+        }
+    @staticmethod
+    def print_report(metrics, title="Evaluation"):
+        print(f"\n{'='*60}")
+        print(f"  {title}")
+        print(f"{'='*60}")
+        print(f"  {'Lvl':<4} {'Name':<24} {'Tier':<16} "
+              f"{'Reward':>8} {'Laps':>6} {'Done%':>6}")
+        print(f"  {'-'*66}")
+        for r in metrics["per_track"]:
+            print(f"  {r['level']:<4} {r['name']:<24} {r['tier']:<16} "
+                  f"{r['mean_reward']:>8.1f} {r['mean_laps']:>6.2f} "
+                  f"{r['completion_rate']*100:>5.0f}%")
+        print(f"  {'-'*66}")
+        print(f"  {'AGGREGATE':<44} "
+              f"{metrics['mean_reward']:>8.1f} {metrics['mean_laps']:>6.2f} "
+              f"{metrics['completion_rate']*100:>5.0f}%")
+        print(f"{'='*60}\n")
+# ── Split summary (run as script) ─────────────────────────────────────────────
+if __name__ == "__main__":
+    print(f"\n{len(ALL_ORDERED)}-Track Curriculum Splits")
+    print("=" * 60)
+    for split_name, split_tracks in [("TRAIN", TRAIN), ("VAL", VAL), ("TEST", TEST)]:
+        print(f"\n{split_name}  ({len(split_tracks)} tracks)")
+        print(f"  {'Lvl':<4} {'Name':<24} {'Tier':<16} {'Width':>6} {'MaxSpd':>7}")
+        print(f"  {'-'*58}")
+        for t in split_tracks:
+            print(f"  {t.level:<4} {t.name:<24} {difficulty_of(t):<16} "
+                  f"{t.width:>6} {t.max_speed:>7.1f}")
+    print("\nSplit rationale:")
+    print("  TRAIN  - 2 tracks per difficulty tier, ordered easy->hard for curriculum")
+    print("  VAL    - 1 track per tier (within-tier generalisation check)")
+    print("  TEST   - 1 track per tier (held out entirely; final evaluation only)")
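The advance rule in `CurriculumSampler.should_advance()` can be read in isolation from the rest of the sampler. The following is a minimal sketch of that rule only, assuming the same inputs (laps and crashes per episode, plus an `is_frontier` flag). `FrontierGate` is a hypothetical name used for illustration and is not part of `rl_splits.py`.

```python
from collections import deque


class FrontierGate:
    """Hypothetical helper that mirrors the sampler's advance rule."""

    def __init__(self, window=50):
        self.window = window
        # Bounded buffers: the oldest frontier episode drops out automatically.
        self._laps = deque(maxlen=window)
        self._crashes = deque(maxlen=window)

    def record(self, laps, crashes, is_frontier=True):
        # Replay episodes are skipped so they never take a frontier slot.
        if is_frontier:
            self._laps.append(laps)
            self._crashes.append(crashes)

    def should_advance(self):
        # A full window is required before the gate can open.
        if len(self._laps) < self.window:
            return False
        # Every episode needs at least one lap and zero crashes.
        return all(l >= 1 and c == 0
                   for l, c in zip(self._laps, self._crashes))


if __name__ == "__main__":
    gate = FrontierGate(window=3)
    for laps, crashes in [(1, 0), (2, 0), (1, 0)]:
        gate.record(laps, crashes)
    print(gate.should_advance())
```

Because the buffers are bounded, a single crash keeps the gate closed until `window` further clean frontier episodes have pushed it out, which makes this rule stricter than a rolling-mean threshold.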

game/test_tracks.py ADDED

	@@ -0,0 +1,83 @@

+"""
+Headless automated test for all 16 tracks.
+Exit 0 if all pass, 1 if any fail.
+"""
+import os
+os.environ['SDL_VIDEODRIVER'] = 'dummy'
+os.environ['SDL_AUDIODRIVER'] = 'dummy'
+import sys
+import math
+import pygame
+pygame.init()
+pygame.display.set_mode((1, 1))
+from game.tracks import TRACKS, SCREEN_W, SCREEN_H
+ACCEL     = 0.13
+STEER_DEG = 2.7
+all_pass = True
+for track in TRACKS:
+    name = f"Lv{track.level}: {track.name}"
+    try:
+        # 1. Build must not raise
+        track.build()
+        # 2. surface not None, correct size
+        assert track.surface is not None, "surface is None"
+        assert track.surface.get_size() == (SCREEN_W, SCREEN_H), \
+            f"surface size {track.surface.get_size()} != ({SCREEN_W},{SCREEN_H})"
+        # 3. mask not None
+        assert track.mask is not None, "mask is None"
+        # 4. start_pos is on track
+        sx, sy = track.start_pos
+        assert track.on_track(sx, sy), \
+            f"start_pos {track.start_pos} not on track"
+        # 5. gate_side at start_pos ≈ 0
+        gs = track.gate_side(sx, sy)
+        assert abs(gs) < 2.0, \
+            f"gate_side at start_pos = {gs:.4f}, expected |gate_side| < 2.0"
+        # 6. Simulate 150 steps
+        x     = float(sx)
+        y     = float(sy)
+        angle = float(track.start_angle)
+        speed = 0.0
+        max_speed = track.max_speed
+        for step in range(150):
+            accel = 1   # constant throttle
+            steer = math.sin(step * 0.15) * 0.5  # gentle sinusoidal steer
+            speed_ratio = min(abs(speed) / max_speed, 1.0) if max_speed > 0 else 0
+            angle += steer * STEER_DEG * max(0.3, speed_ratio)
+            speed  = min(speed + ACCEL, max_speed)
+            speed  = max(0.0, speed - 0.038)  # friction
+            rad = math.radians(angle)
+            x  += speed * math.cos(rad)
+            y  += speed * math.sin(rad)
+            # Smoke test only: on_track must not raise for any visited position.
+            # Leaving the track during the drive does not fail the test.
+            _ = track.on_track(x, y)
+        print(f"PASS  {name}")
+    except Exception as e:
+        print(f"FAIL  {name}: {e}")
+        all_pass = False
+print()
+if all_pass:
+    print(f"All {len(TRACKS)} tracks PASSED.")
+    sys.exit(0)
+else:
+    print("Some tracks FAILED.")
+    sys.exit(1)
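The 150-step drive in step 6 of the test above does not need pygame, so it can be factored into a pure function and checked on its own. This is a rough sketch that reuses the test's constants. The function name `rollout` and the start values in the usage lines are assumptions for illustration and are not taken from `game/tracks.py`.

```python
import math

ACCEL = 0.13       # same values as in the test script
FRICTION = 0.038
STEER_DEG = 2.7


def rollout(start_pos, start_angle, max_speed, steps=150):
    """Replay the constant-throttle, sinusoidal-steer drive.

    Returns a list of (x, y, speed) tuples, one per step.
    """
    x, y = float(start_pos[0]), float(start_pos[1])
    angle = float(start_angle)
    speed = 0.0
    path = []
    for step in range(steps):
        steer = math.sin(step * 0.15) * 0.5      # gentle sinusoidal steer
        ratio = min(abs(speed) / max_speed, 1.0) if max_speed > 0 else 0.0
        angle += steer * STEER_DEG * max(0.3, ratio)
        speed = min(speed + ACCEL, max_speed)    # full throttle, capped
        speed = max(0.0, speed - FRICTION)       # friction never reverses the car
        rad = math.radians(angle)
        x += speed * math.cos(rad)
        y += speed * math.sin(rad)
        path.append((x, y, speed))
    return path


if __name__ == "__main__":
    # Hypothetical start pose and speed cap, chosen only for the demo.
    path = rollout((450.0, 515.0), 180.0, 3.0)
    print(len(path), max(s for _, _, s in path) <= 3.0)
```

Returning the whole path makes it possible to assert on the physics (for example that speed never exceeds `max_speed`) separately from the `on_track` lookups, which still need a built track.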

game/tracks.py ADDED

	@@ -0,0 +1,397 @@

+"""
+tracks.py — Track definitions for the curriculum car racer.
+Angle convention (pygame y-down):
+  0°  = right (+x)
+  90° = down  (+y)
+  180° = left  (-x)
+  270° = up    (-y)
+"""
+import math
+import pygame
+SCREEN_W, SCREEN_H = 900, 600
+# Colours
+C_GRASS = (45, 110, 45)
+C_TRACK = (52, 52, 52)
+C_WHITE = (255, 255, 255)
+# ────────────────────────────────────────────────────────────────────────────
+# Geometry helpers
+# ────────────────────────────────────────────────────────────────────────────
+def _arc(cx, cy, rx, ry, a0_deg, a1_deg, n=24):
+    """Return n+1 points along an elliptical arc from a0_deg to a1_deg."""
+    pts = []
+    for i in range(n + 1):
+        t = a0_deg + (a1_deg - a0_deg) * i / n
+        rad = math.radians(t)
+        x = cx + rx * math.cos(rad)
+        y = cy + ry * math.sin(rad)
+        pts.append((x, y))
+    return pts
+def _full_ellipse(cx, cy, rx, ry, n=80, start_deg=90):
+    """Return n+1 points of a full ellipse starting at start_deg.
+    The last point lies at start_deg + 360, so it repeats the first.
+    """
+    return _arc(cx, cy, rx, ry, start_deg, start_deg + 360, n)
+def _dense_poly(corners, step=20, segment_widths=None):
+    """
+    Sample a closed straight-segment polygon at ~step-px intervals.
+    Analogous to _arc() for polygon tracks: produces dense waypoints so the
+    +10 lookahead in CarEnv._obs() gives meaningful corner anticipation.
+    If segment_widths (one value per corner segment) is provided, returns
+    (waypoints, expanded_widths) with widths broadcast to the dense point list.
+    Otherwise returns just the waypoints list.
+    """
+    result = []
+    expanded_sw = [] if segment_widths is not None else None
+    n = len(corners)
+    for i in range(n):
+        x0, y0 = corners[i]
+        x1, y1 = corners[(i + 1) % n]
+        seg_len = math.hypot(x1 - x0, y1 - y0)
+        n_pts = max(2, int(seg_len / step))
+        for k in range(n_pts):
+            t = k / n_pts
+            result.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
+        if expanded_sw is not None:
+            expanded_sw.extend([segment_widths[i]] * n_pts)
+    if segment_widths is not None:
+        return result, expanded_sw
+    return result
+def _ipts(pts):
+    """Convert float point list to integer tuples."""
+    return [(int(round(x)), int(round(y))) for x, y in pts]
+# ────────────────────────────────────────────────────────────────────────────
+# TrackDef
+# ────────────────────────────────────────────────────────────────────────────
+class TrackDef:
+    def __init__(self, level, name, waypoints, width, start_pos, start_angle, max_speed,
+                 segment_widths=None):
+        self.level = level
+        self.name = name
+        self.waypoints = waypoints      # list of (x,y) floats
+        self.width = width
+        # Per-segment widths for variable-width tracks (one value per waypoint,
+        # applied to the segment FROM that waypoint TO the next).
+        # None = uniform self.width everywhere.
+        self.segment_widths = segment_widths
+        self.start_pos = start_pos      # (x, y) floats
+        self.start_angle = start_angle  # degrees
+        self.max_speed = max_speed
+        self.surface = None
+        self.mask = None
+        self.hud_corner = (8, 8)  # default; updated after build()
+        # Unit vector in start_angle direction (for gate_side)
+        rad = math.radians(start_angle)
+        self._gate_dx = math.cos(rad)
+        self._gate_dy = math.sin(rad)
+        # ── Reward metadata (computed once here, used by CarEnv) ─────────────
+        # Perimeter of the waypoint polygon = approximate track centerline length
+        self.optimal_dist = sum(
+            math.hypot(waypoints[(i + 1) % len(waypoints)][0] - waypoints[i][0],
+                       waypoints[(i + 1) % len(waypoints)][1] - waypoints[i][1])
+            for i in range(len(waypoints))
+        )
+        # Expected lap time (frames) at 70 % of max speed — accounts for corners
+        self.par_time_steps = self.optimal_dist / (max_speed * 0.70)
+        # Difficulty multiplier: narrow + fast = harder.
+        # For variable-width tracks the *choke* (minimum segment) sets difficulty.
+        _BASE_WIDTH = 115.0
+        _BASE_SPEED = 3.0
+        eff_w = min(segment_widths) if segment_widths else width
+        self.complexity = (_BASE_WIDTH / eff_w) * (max_speed / _BASE_SPEED)
+        # Road width at the start/finish line (for checkered flag rendering).
+        # For variable-width tracks, find the width of the segment nearest to start.
+        if segment_widths is not None:
+            sx, sy = start_pos
+            nearest = min(range(len(waypoints)),
+                          key=lambda i: math.hypot(waypoints[i][0] - sx,
+                                                   waypoints[i][1] - sy))
+            self._start_road_width = segment_widths[nearest]
+        else:
+            self._start_road_width = width
+    def _best_hud_corner(self, panel_w, panel_h, margin=8):
+        """Return (x, y) of the screen corner with fewest track pixels under the HUD panel."""
+        corners = [
+            (margin, margin),
+            (SCREEN_W - panel_w - margin, margin),
+            (margin, SCREEN_H - panel_h - margin),
+            (SCREEN_W - panel_w - margin, SCREEN_H - panel_h - margin),
+        ]
+        best_pos, best_count = corners[0], float('inf')
+        for cx, cy in corners:
+            count = sum(
+                1
+                for px in range(cx, cx + panel_w, 6)
+                for py in range(cy, cy + panel_h, 6)
+                if self.mask.get_at((px, py))[0] > 128
+            )
+            if count < best_count:
+                best_count, best_pos = count, (cx, cy)
+        return best_pos
+    def build(self):
+        """Draw the track onto self.surface and build self.mask."""
+        BORDER = 6   # white border thickness on each edge (pixels)
+        surf = pygame.Surface((SCREEN_W, SCREEN_H))
+        surf.fill(C_GRASS)
+        ipts_list = _ipts(self.waypoints)
+        n = len(ipts_list)
+        if self.segment_widths is None:
+            # ── Uniform-width path (original behaviour) ──────────────────────
+            r      = self.width // 2
+            r_out  = r + BORDER
+            pygame.draw.lines(surf, C_WHITE, True, ipts_list, self.width + BORDER * 2)
+            for pt in ipts_list:
+                pygame.draw.circle(surf, C_WHITE, pt, r_out)
+            pygame.draw.lines(surf, C_TRACK, True, ipts_list, self.width)
+            for pt in ipts_list:
+                pygame.draw.circle(surf, C_TRACK, pt, r)
+        else:
+            # ── Variable-width path ───────────────────────────────────────────
+            # At each waypoint junction the circle radius is the max of the
+            # incoming and outgoing segment widths, ensuring no gaps at
+            # wide→narrow or narrow→wide transitions.
+            sw = self.segment_widths
+            # Pass 1: white outer strip
+            for i in range(n):
+                j   = (i + 1) % n
+                w   = sw[i] + BORDER * 2
+                w_p = sw[(i - 1) % n] + BORDER * 2
+                pygame.draw.line(surf, C_WHITE, ipts_list[i], ipts_list[j], w)
+                pygame.draw.circle(surf, C_WHITE, ipts_list[i], max(w, w_p) // 2)
+            # Pass 2: grey tarmac
+            for i in range(n):
+                j   = (i + 1) % n
+                w   = sw[i]
+                w_p = sw[(i - 1) % n]
+                pygame.draw.line(surf, C_TRACK, ipts_list[i], ipts_list[j], w)
+                pygame.draw.circle(surf, C_TRACK, ipts_list[i], max(w, w_p) // 2)
+        # Checkered start / finish line across the full road width
+        self._draw_start_finish(surf)
+        self.surface = surf
+        # Mask: covers the full road width (including border) so on_track
+        # returns True all the way to the white edge lines.
+        mask_surf = pygame.Surface((SCREEN_W, SCREEN_H))
+        mask_surf.fill((0, 0, 0))
+        if self.segment_widths is None:
+            r_out = self.width // 2 + BORDER
+            pygame.draw.lines(mask_surf, C_WHITE, True, ipts_list,
+                               self.width + BORDER * 2)
+            for pt in ipts_list:
+                pygame.draw.circle(mask_surf, C_WHITE, pt, r_out)
+        else:
+            sw = self.segment_widths
+            for i in range(n):
+                j   = (i + 1) % n
+                w   = sw[i] + BORDER * 2
+                w_p = sw[(i - 1) % n] + BORDER * 2
+                pygame.draw.line(mask_surf, C_WHITE, ipts_list[i], ipts_list[j], w)
+                pygame.draw.circle(mask_surf, C_WHITE, ipts_list[i], max(w, w_p) // 2)
+        self.mask = mask_surf
+        self.hud_corner = self._best_hud_corner(330, 175)
+    def _draw_start_finish(self, surf):
+        """
+        Checkered black/white flag pattern across the track at start_pos,
+        perpendicular to the driving direction.  2 rows × N columns of 10 px cells.
+        """
+        CELL = 10
+        ROWS = 2
+        sx, sy = self.start_pos
+        # Unit vectors: across the track (perp) and along the track (along)
+        perp_rad  = math.radians(self.start_angle + 90)
+        along_rad = math.radians(self.start_angle)
+        perp  = (math.cos(perp_rad),  math.sin(perp_rad))
+        along = (math.cos(along_rad), math.sin(along_rad))
+        n_cols = self._start_road_width // CELL + 4   # slightly wider than road
+        half   = n_cols / 2.0
+        for row in range(ROWS):
+            v = (row - ROWS / 2.0 + 0.5) * CELL   # offset along driving dir
+            for col in range(-int(half) - 1, int(half) + 2):
+                u = col * CELL                      # offset across track
+                color = (255, 255, 255) if (row + col) % 2 == 0 else (0, 0, 0)
+                # Four corners of this cell in screen space
+                pts = []
+                for du, dv in [(-CELL/2, -CELL/2), (CELL/2, -CELL/2),
+                                (CELL/2,  CELL/2),  (-CELL/2, CELL/2)]:
+                    px = sx + (u + du) * perp[0] + (v + dv) * along[0]
+                    py = sy + (u + du) * perp[1] + (v + dv) * along[1]
+                    pts.append((int(px), int(py)))
+                pygame.draw.polygon(surf, color, pts)
+    def on_track(self, x, y):
+        """Return True if pixel (x, y) is on the track mask."""
+        if self.mask is None:
+            return False
+        ix, iy = int(round(x)), int(round(y))
+        if ix < 0 or iy < 0 or ix >= SCREEN_W or iy >= SCREEN_H:
+            return False
+        color = self.mask.get_at((ix, iy))
+        # White = on track
+        return color[0] > 128
+    def gate_side(self, x, y):
+        """
+        Dot product of (pos - start_pos) with start direction unit vector.
+        Positive = ahead of gate, negative = behind gate.
+        """
+        dx = x - self.start_pos[0]
+        dy = y - self.start_pos[1]
+        return dx * self._gate_dx + dy * self._gate_dy
+# ────────────────────────────────────────────────────────────────────────────
+# Track builders
+# ────────────────────────────────────────────────────────────────────────────
+def _build_all_tracks():
+    """Build and return the list of TrackDef objects in level order (levels 1-10)."""
+    tracks = []
+    # ── GROUP 1: Full ellipses ───────────────────────────────────────────────
+    # 1. Wide Oval
+    wp = _full_ellipse(450, 300, 370, 215, n=80, start_deg=90)
+    tracks.append(TrackDef(
+        level=1, name="Wide Oval",
+        waypoints=wp, width=115,
+        start_pos=(450, 515), start_angle=180, max_speed=3.0
+    ))
+    # 2. Standard Oval
+    wp = _full_ellipse(450, 300, 330, 195, n=80, start_deg=90)
+    tracks.append(TrackDef(
+        level=2, name="Standard Oval",
+        waypoints=wp, width=85,
+        start_pos=(450, 495), start_angle=180, max_speed=3.5
+    ))
+    # 3. Narrow Oval
+    wp = _full_ellipse(450, 300, 320, 185, n=80, start_deg=90)
+    tracks.append(TrackDef(
+        level=3, name="Narrow Oval",
+        waypoints=wp, width=58,
+        start_pos=(450, 485), start_angle=180, max_speed=3.5
+    ))
+    # 4. Superspeedway
+    wp = _full_ellipse(450, 300, 395, 160, n=80, start_deg=90)
+    tracks.append(TrackDef(
+        level=4, name="Superspeedway",
+        waypoints=wp, width=85,
+        start_pos=(450, 460), start_angle=180, max_speed=4.5
+    ))
+    # ── GROUP 2: Rounded rectangles ─────────────────────────────────────────
+    # 5. Rounded Rectangle
+    # Corner-arc centres: TL (250,230), TR (650,230), BR (650,370), BL (250,370); r=130.
+    # With r=130 the BR and BL arcs bottom out at y = 370 + 130 = 500.
+    # Arc sweeps: TL 180→270, TR 270→360, BR 0→90, BL 90→180.
+    # TL example: 180° → (250-130, 230) = (120,230); 270° → (250, 230-130) = (250,100).
+    tl_arc = _arc(250, 230, 130, 130, 180, 270, 24)  # (120,230)→(250,100)
+    tr_arc = _arc(650, 230, 130, 130, 270, 360, 24)  # (650,100)→(780,230)
+    br_arc = _arc(650, 370, 130, 130, 0, 90, 24)    # (780,370)→(650,500)
+    bl_arc = _arc(250, 370, 130, 130, 90, 180, 24)  # (250,500)→(120,370)
+    wp = tl_arc + tr_arc + br_arc + bl_arc
+    tracks.append(TrackDef(
+        level=5, name="Rounded Rectangle",
+        waypoints=wp, width=90,
+        start_pos=(450, 500), start_angle=180, max_speed=3.5
+    ))
+    # 6. Stadium Oval
+    left_arc = _arc(200, 300, 120, 120, 90, 270, 24)   # (200,420)→(200,180)
+    right_arc = _arc(700, 300, 120, 120, 270, 450, 24)  # (700,180)→(700,420)
+    wp = left_arc + right_arc
+    tracks.append(TrackDef(
+        level=6, name="Stadium Oval",
+        waypoints=wp, width=80,
+        start_pos=(450, 420), start_angle=180, max_speed=4.0
+    ))
+    # 7. Tight Rectangle
+    # TL=(185,195), TR=(715,195), BR=(715,405), BL=(185,405), r=65
+    tl_arc = _arc(185, 195, 65, 65, 180, 270, 24)
+    tr_arc = _arc(715, 195, 65, 65, 270, 360, 24)
+    br_arc = _arc(715, 405, 65, 65, 0, 90, 24)
+    bl_arc = _arc(185, 405, 65, 65, 90, 180, 24)
+    wp = tl_arc + tr_arc + br_arc + bl_arc
+    tracks.append(TrackDef(
+        level=7, name="Tight Rectangle",
+        waypoints=wp, width=65,
+        start_pos=(450, 470), start_angle=180, max_speed=3.5
+    ))
+    # 8. Small Oval (a full ellipse, despite the group heading)
+    wp = _full_ellipse(450, 300, 265, 165, n=80, start_deg=90)
+    tracks.append(TrackDef(
+        level=8, name="Small Oval",
+        waypoints=wp, width=60,
+        start_pos=(450, 465), start_angle=180, max_speed=3.2
+    ))
+    # ── GROUP 3: Two half-arcs ───────────────────────────────────────────────
+    # 9. Hairpin Track
+    # Counter-clockwise to match all other tracks (start_angle=180°, facing left).
+    # left_hairpin: tight left arc (220,440)→(140,300)→(220,160)
+    left_hairpin = _arc(220, 300, 80, 140, 90, 270, 24)
+    # right_sweep: gentler right arc (700,160)→(820,300)→(700,440)
+    right_sweep = _arc(700, 300, 120, 140, -90, 90, 24)
+    wp = left_hairpin + right_sweep
+    tracks.append(TrackDef(
+        level=9, name="Hairpin Track",
+        waypoints=wp, width=75,
+        start_pos=(460, 440), start_angle=180.0, max_speed=3.5
+    ))
+    # 10. Chicane Track
+    # Rounded rect with chicane on bottom
+    tl_arc = _arc(250, 240, 100, 100, 180, 270, 24)
+    tr_arc = _arc(650, 240, 100, 100, 270, 360, 24)
+    br_arc = _arc(650, 360, 100, 100, 0, 90, 24)    # ends at (650,460)
+    bl_arc = _arc(250, 360, 100, 100, 90, 180, 24)  # starts at (250,460)
+    # Chicane inserted between br_arc end and bl_arc start
+    chicane = [(650, 460), (575, 460), (545, 498), (450, 498), (355, 498), (325, 460), (250, 460)]
+    wp = tl_arc + tr_arc + br_arc + chicane + bl_arc
+    tracks.append(TrackDef(
+        level=10, name="Chicane Track",
+        waypoints=wp, width=70,
+        start_pos=(450, 498), start_angle=180, max_speed=3.5
+    ))
+    return tracks
+TRACKS = _build_all_tracks()
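The endpoint comments on the arc calls in `_build_all_tracks` can be checked with a small stand-in. This is only a sketch: it assumes `_arc(cx, cy, rx, ry, start_deg, end_deg, n)` samples `n` points with x = cx + rx·cos θ and y = cy + ry·sin θ in screen coordinates and includes both endpoints, which is what the inline comments imply. `arc_points` is a hypothetical helper written for illustration, not the module's `_arc`.

```python
import math


def arc_points(cx, cy, rx, ry, start_deg, end_deg, n):
    """Hypothetical stand-in for _arc: n points from start_deg to end_deg inclusive."""
    pts = []
    for k in range(n):
        # Interpolate the angle linearly; k / (n - 1) reaches exactly 1.0 at the end.
        t = math.radians(start_deg + (end_deg - start_deg) * k / (n - 1))
        pts.append((round(cx + rx * math.cos(t)), round(cy + ry * math.sin(t))))
    return pts


# Level 5 "Rounded Rectangle": corner-arc centres with r=130
tl = arc_points(250, 230, 130, 130, 180, 270, 24)
br = arc_points(650, 370, 130, 130, 0, 90, 24)
bl = arc_points(250, 370, 130, 130, 90, 180, 24)

print(tl[0], tl[-1])   # endpoints of the top-left corner arc
print(br[-1], bl[0])   # the two ends of the bottom straight
```

Under these assumptions the bottom straight runs between the last point of the BR arc and the first point of the BL arc at y = 500, which is where level 5 places `start_pos=(450, 500)`.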