feat: reward verifier alignment, notebook hardening, model name fix cdc237b CreativeEngineer Claude Opus 4.6 committed on Mar 8
refactor: replace unsloth with plain transformers+peft for GRPO training 3313e24 CreativeEngineer Claude Opus 4.6 committed on Mar 8
feat: upgrade notebook to Qwen3.5-4B with H100 hyperparams 2cb6617 CreativeEngineer Claude Opus 4.6 committed on Mar 8
feat: add real-time stellarator optimization demo animation 3b185f9 CreativeEngineer Claude Opus 4.6 committed on Mar 8
fix: robust JSON array extraction and notebook GRPO fixes e826e11 CreativeEngineer Claude Opus 4.6 committed on Mar 8
feat: polish notebook and README for hackathon submission c647aa0 CreativeEngineer Claude Opus 4.6 committed on Mar 8
feat: add HF Space deployment + GRPO training notebook 3bfd80a CreativeEngineer Claude Opus 4.6 committed on Mar 8
feat: add replay playtest and tighten fail-fast validation 8bf0155 CreativeEngineer committed on Mar 8