feat: reward verifier alignment, notebook hardening, model name fix cdc237b CreativeEngineer Claude Opus 4.6 committed 6 days ago
refactor: replace unsloth with plain transformers+peft for GRPO training 3313e24 CreativeEngineer Claude Opus 4.6 committed 6 days ago
feat: upgrade notebook to Qwen3.5-4B with H100 hyperparams 2cb6617 CreativeEngineer Claude Opus 4.6 committed 6 days ago
feat: add real-time stellarator optimization demo animation 3b185f9 CreativeEngineer Claude Opus 4.6 committed 6 days ago
refactor: align colab notebook with shared llm helpers ddcb837 CreativeEngineer committed 6 days ago
fix: robust JSON array extraction and notebook GRPO fixes e826e11 CreativeEngineer Claude Opus 4.6 committed 6 days ago
feat: polish notebook and README for hackathon submission c647aa0 CreativeEngineer Claude Opus 4.6 committed 6 days ago
feat: add llm rollout contract and simplify ppo smoke ebd0ff3 CreativeEngineer committed 6 days ago
feat: add HF Space deployment + GRPO training notebook 3bfd80a CreativeEngineer Claude Opus 4.6 committed 6 days ago
feat: add replay playtest and tighten fail-fast validation 8bf0155 CreativeEngineer committed 6 days ago