Commit History

fix: HF Space port + TRL install order in training Dockerfile
4201933

natnael kahssay Claude Sonnet 4.6 committed on

fix: correct model name to unsloth/gpt-oss-20b (no -instruct suffix)
7ba71bb

natnael kahssay Claude Sonnet 4.6 committed on

fix: move unsloth import first, make wandb optional via WANDB_API_KEY
bd0f19a

natnael kahssay Claude Sonnet 4.6 committed on

fix: install trl>=0.16 last with --upgrade to beat unsloth dep pins
71a483e

natnael kahssay Claude Sonnet 4.6 committed on

fix: use CUDA devel image and pin vLLM to 0.12.0
ada7c70

natnael kahssay Claude Sonnet 4.6 committed on

fix: strip ANSI codes in _run_tests() so ✓/✗ count correctly
6b28995

natnael kahssay committed on

feat: add W&B reward logging to both training scripts
fe33a21

natnael kahssay Claude Sonnet 4.6 committed on

feat: RFC 005 interactive rollout wrapper + multi-turn GRPO training
ded7690

natnael kahssay Claude Sonnet 4.6 committed on

feat: replace handcrafted user_messages with real MOA session traces
bb5a5ec

natnael kahssay Claude Sonnet 4.6 committed on

feat: multi-turn tool-using GRPO training
5e044f0

natnael kahssay Claude Sonnet 4.6 committed on

feat: multi-turn tool-using RL environment (RFC 005 pattern)
5d3d3ff

natnael kahssay Claude Sonnet 4.6 committed on

fix: add make+g++ for node-pty native build
002fe30

natnael kahssay committed on

feat: use real moav2 source as RL env, symlinked sandbox, demo.py
0590e15

natnael kahssay committed on

feat: use real moav2 source as RL task suite - symlinked sandbox, 3 real service tasks
ce25387

natnael kahssay committed on

fix: embed task content directly, self-contained vitest sandbox
38cd72d

natnael kahssay committed on

upgrade: gpt-oss-20b BF16, vLLM GRPO, Northflank env URL, max_steps=300
c844c8c

natnael kahssay committed on

add training/ as real directory (Dockerfile + train.py)
aae5554

natnael kahssay committed on

add GRPO training job (Llama 3.1 8B + Unsloth + TRL)
6dd8379

natnael kahssay committed on

fix: add README with HF Space metadata
a40b9c5

natnael kahssay committed on

add .gitignore, remove build artifacts
367d69f

natnael kahssay committed on

initial moa rl environment
50a0d81

natnael kahssay committed on