Commit History

docs: drop R4-continued retrospective paragraph - it reintroduced the wrong shaping=0.5 value
8eeeb67

Lee93whut committed on

docs: polish pass - README wording, experiment_log structure + honest retrospective, LaTeX micro-style
e1ecae1

Lee93whut committed on

chore: codebase hygiene pass - untrack weights, migrate to logging, tidy comments
17bc537

Lee93whut committed on

docs: clean up R3/R4 record and consolidate technical narrative
92423f0

Lee93whut committed on

docs: finalize R4 documentation - Dueling 84% Holdout, full ablation record
acbd4c5

Lee93whut committed on

refactor(model): update architecture docs and set dueling as default algorithm
34ad2cc

Lee93whut committed on

fix: eliminate infinite-loop risk in maze start/goal sampling
10926f0

Lee93whut committed on

feat(round4): four-algorithm ablation - Dueling best at 84% Holdout
44cfe4c

Lee93whut committed on

docs(round4): finalize R4 Double DQN results - 78% Holdout, Grid-SPL clarification
b14b412

Lee93whut committed on

chore: add MIT LICENSE, ignore .claude/ directory
18e19aa

Lee93whut committed on

chore(weights): upgrade vanilla/dueling/double_dueling to 4-channel R4 weights
3379ed4

Lee93whut committed on

fix(demo): strengthen anti-loop by penalizing moves toward high-frequency cells
a888a00

Lee93whut committed on

chore(weights): update double DQN weight to R4-A3 4-channel (78% holdout)
f3ed6b3

Lee93whut committed on

fix(demo): auto-infer input_channels from checkpoint weight shape
4f4fb4a

Lee93whut committed on

fix(demo): stop animation on any user interaction
3b43b04

Lee93whut committed on

fix(demo): re-enable inference-side anti-loop Q-penalty
c8377dc

Lee93whut committed on

add model weights via Git LFS, fix HF Space build
006f45e

Lee93whut committed on

docs: README - results table, architecture, quickstart, references
385cc9f

Lee93whut committed on

feat(demo): Streamlit web demo - Plotly heatmap, anti-loop inference
a264030

Lee93whut committed on

docs(round4): complete experiment record - A1/A2/A3 full EVAL data and conclusions
a91b194

Lee93whut committed on

style(train): remove forward-reference quotes from type hints (Python 3.10+)
274376b

Lee93whut committed on

fix(train): use terminated-only mask for TD bootstrap (Gymnasium v0.26)
670449d

Lee93whut committed on

fix(train): guarantee BFS-connected start/goal, bounded retry with fallback
92a3812

Lee93whut committed on

feat(round4): upgrade obs 3->4 channels (visited_map) + EVAL-based checkpoint
062d629

Lee93whut committed on

feat(round3): buffer=80k + target_freq=1500 + shaping=0.5 → 74% holdout, SPL=0.735
c1b9ba8

Lee93whut committed on

feat(round2): extended training, Double DQN 64% holdout, SPL=0.633
ff1b1b8

Lee93whut committed on

feat(round1): baseline DQN variants - Vanilla/Double/Dueling/Double+Dueling
bf17b0c

Lee93whut committed on

ci: GitHub Actions - pytest + 90%+ branch coverage gate
c08fcb7

Lee93whut committed on

feat(env): Gymnasium maze env, 3-channel obs, BFS reachability
fe0625d

Lee93whut committed on

chore: initial project scaffold
141a818

Lee93whut committed on