Spaces: lil58/interview

Commit History

docs: drop R4-continued retrospective paragraph; it reintroduced the wrong shaping=0.5 value

8eeeb67

Lee93whut committed 3 days ago

docs: polish pass — README wording, experiment_log structure + honest retrospective, LaTeX micro-style

e1ecae1

Lee93whut committed 3 days ago

chore: codebase hygiene pass — untrack weights, migrate to logging, tidy comments

17bc537

Lee93whut committed 3 days ago

docs: clean up R3/R4 record and consolidate technical narrative

92423f0

Lee93whut committed 3 days ago

docs: finalize R4 documentation — Dueling 84% Holdout, full ablation record

acbd4c5

Lee93whut committed 3 days ago

refactor(model): update architecture docs and set dueling as default algorithm

34ad2cc

Lee93whut committed 3 days ago

fix: eliminate infinite-loop risk in maze start/goal sampling

10926f0

Lee93whut committed 3 days ago

feat(round4): four-algorithm ablation — Dueling best at 84% Holdout

44cfe4c

Lee93whut committed 3 days ago
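
Note on the Dueling variant named in the commit above: a dueling head is usually described as combining a scalar state value with per-action advantages, subtracting the mean advantage so the two streams stay identifiable. The sketch below shows only that aggregation step in plain Python. The function name and list-based inputs are illustrative assumptions, not code from this repository.

```python
def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) as Q(s, a) = V(s) + A(s, a) - mean(A(s, .)).

    Hypothetical helper: a real network would apply the same formula to
    batched tensors produced by its value and advantage streams.
    """
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


q = dueling_q_values(1.0, [1.0, 2.0, 3.0])
```

Because the mean advantage is subtracted, the average of the resulting Q-values equals the state value.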

docs(round4): finalize R4 Double DQN results — 78% Holdout, Grid-SPL clarification

b14b412

Lee93whut committed 3 days ago

chore: add MIT LICENSE, ignore .claude/ directory

18e19aa

Lee93whut committed 3 days ago

chore(weights): upgrade vanilla/dueling/double_dueling to 4-channel R4 weights

3379ed4

Lee93whut committed 3 days ago

fix(demo): strengthen anti-loop by penalizing moves toward high-frequency cells

a888a00

Lee93whut committed 4 days ago
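
Note on the anti-loop fix above: one plausible reading of "penalizing moves toward high-frequency cells" is to subtract a visit-count penalty from each action's Q-value at inference time before taking the argmax. The sketch below assumes that reading. The function name, the `penalty` weight, and the data layout are hypothetical and not taken from the demo code.

```python
def pick_action(q_values, next_cells, visit_counts, penalty=0.5):
    """Choose the action with the highest visit-penalized Q-value.

    q_values:     one Q-value per action.
    next_cells:   the cell each action would move the agent into.
    visit_counts: dict mapping a cell to how often it was visited so far.
    penalty:      assumed weight per prior visit (illustrative value).
    """
    adjusted = [
        q - penalty * visit_counts.get(cell, 0)
        for q, cell in zip(q_values, next_cells)
    ]
    # Index of the largest adjusted value; ties resolve to the lowest index.
    return max(range(len(adjusted)), key=adjusted.__getitem__)


# Action 0 has the higher raw Q-value but leads into a cell visited 3 times.
action = pick_action([1.0, 0.9], [(0, 1), (1, 0)], {(0, 1): 3})
```

The penalty only changes action selection during the demo; it leaves the trained weights untouched.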

chore(weights): update double DQN weight to R4-A3 4-channel (78% holdout)

f3ed6b3

Lee93whut committed 4 days ago

fix(demo): auto-infer input_channels from checkpoint weight shape

4f4fb4a

Lee93whut committed 4 days ago
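
Note on the checkpoint fix above: a PyTorch `Conv2d` weight has shape `(out_channels, in_channels / groups, kH, kW)`, so the number of input channels can be read from the first convolution's weight in a state dict. The sketch below assumes an ungrouped first convolution and uses a stand-in object with a `.shape` attribute so that it runs without PyTorch. The key name `conv1.weight` is hypothetical.

```python
from types import SimpleNamespace


def infer_input_channels(state_dict, first_conv_key=None):
    """Read in_channels from the first 4-D (convolution) weight.

    If first_conv_key is None, the first entry with a 4-D shape in the
    mapping's iteration order is used (an assumption about layer order).
    """
    if first_conv_key is not None:
        candidates = [state_dict[first_conv_key]]
    else:
        candidates = list(state_dict.values())
    for weight in candidates:
        shape = tuple(weight.shape)
        if len(shape) == 4:
            return shape[1]  # dim 1 is in_channels for groups == 1
    raise ValueError("no 4-D convolution weight found in state dict")


# Stand-in for a loaded checkpoint; real tensors expose .shape the same way.
fake_checkpoint = {
    "conv1.weight": SimpleNamespace(shape=(32, 4, 3, 3)),
    "conv1.bias": SimpleNamespace(shape=(32,)),
}
channels = infer_input_channels(fake_checkpoint)
```

Inferring the value this way lets one loader handle both 3-channel and 4-channel weights.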

fix(demo): stop animation on any user interaction

3b43b04

Lee93whut committed 4 days ago

fix(demo): re-enable inference-side anti-loop Q-penalty

c8377dc

Lee93whut committed 4 days ago

add model weights via Git LFS, fix HF Space build

006f45e

Lee93whut committed 4 days ago

docs: README — results table, architecture, quickstart, references

385cc9f

Lee93whut committed 4 days ago

feat(demo): Streamlit web demo — Plotly heatmap, anti-loop inference

a264030

Lee93whut committed 4 days ago

docs(round4): complete experiment record — A1/A2/A3 full EVAL data and conclusions

a91b194

Lee93whut committed 4 days ago

style(train): remove forward-reference quotes from type hints (Python 3.10+)

274376b

Lee93whut committed 4 days ago

fix(train): use terminated-only mask for TD bootstrap (Gymnasium v0.26)

670449d

Lee93whut committed 4 days ago
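
Note on the TD bootstrap fix above: since the v0.26 step API, `step()` returns separate `terminated` and `truncated` flags. A time-limit truncation does not mean the next state has zero value, so only `terminated` should zero out the bootstrap term. The sketch below shows that masking on plain Python lists. The function name and argument layout are illustrative assumptions.

```python
def td_targets(rewards, next_q_max, terminated, gamma=0.99):
    """Compute r + gamma * max_a Q(s', a), masked by `terminated` only.

    `truncated` is deliberately not an argument: a truncated transition
    still bootstraps from the next state's value estimate.
    """
    return [
        r + gamma * q_next * (0.0 if done else 1.0)
        for r, q_next, done in zip(rewards, next_q_max, terminated)
    ]


# First transition reached a true terminal state; the second did not.
targets = td_targets([1.0, 0.0], [2.0, 2.0], [True, False], gamma=0.5)
```

Masking with `terminated or truncated` instead would bias value estimates downward for states near the step limit.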

fix(train): guarantee BFS-connected start/goal, bounded retry with fallback

92a3812

Lee93whut committed 4 days ago
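
Note on the start/goal sampling fix above: a minimal sketch of "bounded retry with fallback", assuming a grid where `0` is a free cell and `1` is a wall. Random pairs are tried a fixed number of times, and if none is connected, a deterministic scan picks the first connected pair. All names, the retry limit, and the fallback rule are hypothetical.

```python
import random
from collections import deque


def reachable(grid, start):
    """Return the set of free cells reachable from start via 4-neighbour BFS."""
    rows, cols = len(grid), len(grid[0])
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return seen


def sample_start_goal(grid, rng, max_tries=100):
    """Sample a connected (start, goal) pair without looping forever."""
    free = [(r, c) for r, row in enumerate(grid)
            for c, v in enumerate(row) if v == 0]
    if len(free) < 2:
        raise ValueError("need at least two free cells")
    for _ in range(max_tries):  # bounded retry
        start, goal = rng.sample(free, 2)
        if goal in reachable(grid, start):
            return start, goal
    for start in free:  # deterministic fallback scan
        others = sorted(reachable(grid, start) - {start})
        if others:
            return start, others[0]
    raise ValueError("no two connected free cells")


grid = [[0, 1, 0],
        [0, 1, 0]]
start, goal = sample_start_goal(grid, random.Random(0))
```

Both loops are finite, so the function either returns a connected pair or raises.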

feat(round4): upgrade obs 3->4 channels (visited_map) + EVAL-based checkpoint

062d629

Lee93whut committed 4 days ago
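
Note on the 3->4 channel upgrade above: the commit only names the new channel (`visited_map`). The sketch below assumes the other three channels are walls, agent position, and goal position, which is a common layout for grid mazes but is not confirmed by this log. The function name and channel order are hypothetical.

```python
def build_observation(walls, agent, goal, visited):
    """Stack four rows x cols planes: walls, agent, goal, visited_map.

    walls:   2-D list with 1 for wall and 0 for free (assumed encoding).
    agent:   (row, col) of the agent.
    goal:    (row, col) of the goal.
    visited: set of (row, col) cells the agent has already entered.
    """
    rows, cols = len(walls), len(walls[0])

    def one_hot(cell):
        return [[1.0 if (r, c) == cell else 0.0 for c in range(cols)]
                for r in range(rows)]

    wall_plane = [[float(v) for v in row] for row in walls]
    visited_plane = [[1.0 if (r, c) in visited else 0.0 for c in range(cols)]
                     for r in range(rows)]
    return [wall_plane, one_hot(agent), one_hot(goal), visited_plane]


obs = build_observation([[0, 1], [0, 0]], (0, 0), (1, 1), {(0, 0), (1, 0)})
```

The visited plane gives a feed-forward network a short memory of where it has been, which is one way to discourage revisiting cells.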

feat(round3): buffer=80k + target_freq=1500 + shaping=0.5 → 74% holdout, SPL=0.735

c1b9ba8

Lee93whut committed 4 days ago
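
Note on the SPL figures quoted in these commits: the sketch below computes SPL under the commonly used definition (Success weighted by Path Length), where each successful episode contributes `shortest / max(taken, shortest)` and failures contribute zero. Whether the repository's "Grid-SPL" matches this definition exactly is an assumption. The function name and tuple layout are illustrative.

```python
def spl(episodes):
    """Mean of success * shortest / max(taken, shortest) over episodes.

    episodes: list of (success, shortest_len, taken_len) tuples, with
    shortest_len > 0 (start and goal are assumed to differ).
    """
    total = 0.0
    for success, shortest, taken in episodes:
        if success:
            total += shortest / max(taken, shortest)
    return total / len(episodes)


# One optimal success, one success at twice the optimal length, one failure.
score = spl([(True, 4, 4), (True, 4, 8), (False, 4, 10)])
```

SPL can never exceed the plain success rate, since each successful episode contributes at most 1.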

feat(round2): extended training, Double DQN 64% holdout, SPL=0.633

ff1b1b8

Lee93whut committed 4 days ago
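
Note on the Double DQN variant named above: Double DQN selects the next action with the online network and evaluates it with the target network, which reduces the overestimation of a plain max over target Q-values. The sketch below shows that target for a single transition on plain lists. Names and argument layout are illustrative assumptions, not the repository's training code.

```python
def double_dqn_target(reward, next_q_online, next_q_target,
                      terminated, gamma=0.99):
    """Compute r + gamma * Q_target(s', argmax_a Q_online(s', a))."""
    # Action selection uses the online network's estimates.
    best = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    # Action evaluation uses the target network's estimate for that action.
    bootstrap = 0.0 if terminated else next_q_target[best]
    return reward + gamma * bootstrap


# Online net prefers action 1; the target net's value for action 1 is 2.0.
y = double_dqn_target(1.0, [0.1, 0.9], [5.0, 2.0], False, gamma=0.5)
```

A vanilla DQN target on the same inputs would use `max(next_q_target)`, giving 3.5 instead of 2.0.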

feat(round1): baseline DQN variants — Vanilla/Double/Dueling/Double+Dueling

bf17b0c

Lee93whut committed 4 days ago

ci: GitHub Actions — pytest + 90%+ branch coverage gate

c08fcb7

Lee93whut committed 4 days ago

feat(env): Gymnasium maze env, 3-channel obs, BFS reachability

fe0625d

Lee93whut committed 4 days ago

chore: initial project scaffold

141a818

Lee93whut committed 4 days ago