Commit History

Update oversight reward curve PNG
f74b27a
verified

Astro-Dude committed on

Update commander reward curve PNG
ed937c6
verified

Astro-Dude committed on

Use relative paths for reward curve images
1aa5832

Astro-Dude committed on

Sync gitignore and latest README/BLOG from main
114158c

Astro-Dude committed on

Use GitHub raw URLs for reward curve images
166669c

Astro-Dude committed on

Add blog post, update README with Colab link and training results
e9a2950

Astro-Dude committed on

docs: update plan.md with training results, inference benchmarks, new files
033c53a

Astro-Dude committed on

demo: exercise citation pathway + deterministic artifact + CI + walkthrough
b9182e8

Astro-Dude committed on

Fix inference metadata field, add benchmark results, update README file layout links
614c4d0

Astro-Dude committed on

Add HF Space link, results table, and reward curve to README
6c0424f

Astro-Dude committed on

Add training results: reward curves, results doc, updated dashboard
bf77795

Astro-Dude Shaurya Verma committed on

Fix oversight reward: expose oversight_reward on IncidentObservation
113917d

Astro-Dude Shaurya Verma committed on

Save full training logs to JSON + fix oversight obs.metadata crash
1e29b6e

Astro-Dude Shaurya Verma committed on

Reduce to 4 gens/200 tokens for ~60s/step on T4 (120 steps ≈ 2h)
97ce225

Astro-Dude committed on

Add reward and done fields to IncidentObservation
72c691a

Astro-Dude Shaurya Verma committed on

Fix env crash: remove obs.metadata assignment missing from IncidentObservation
3f061f7

Astro-Dude Shaurya Verma committed on

Print env crash traceback in reward fn to diagnose -0.3 path
41c5282

Astro-Dude committed on

Add reward diagnostic logging to identify why rewards are flat
48dd389

Astro-Dude Shaurya Verma committed on

Tune GRPO for T4 speed: 4 gens, 256 tokens, temp 1.2
af9d280

Astro-Dude Shaurya Verma committed on

Add training dashboard with per-model reward curves and metrics
d784bbf

Astro-Dude Shaurya Verma committed on

Fix GRPO rollout collapse: temperature 0.7→1.0, generations 4→8, completion length 300→512
2e1ab85

Astro-Dude committed on

Add dashboard.html and fix gitignore for runs/
098567b

Astro-Dude committed on

Citadel v2.0.0 - Multi-Agent AI Defense Council
f0609d2

Astro-Dude committed on