Spaces: nks321/moa-rl-env
Branch: main
Path: moa-rl-env/training (22.9 kB)
2 contributors | History: 10 commits
natnael kahssay | fix: correct model name to unsloth/gpt-oss-20b (no -instruct suffix) | 7ba71bb | 11 days ago
File | Scan | Size | Last commit | Updated
.gitignore | Safe | 28 Bytes | add training/ as real directory (Dockerfile + train.py) | 11 days ago
Dockerfile | Safe | 672 Bytes | fix: install trl>=0.16 last with --upgrade to beat unsloth dep pins | 11 days ago
rollout_wrapper.py | Safe | 6.82 kB | feat: RFC 005 interactive rollout wrapper + multi-turn GRPO training | 11 days ago
train.py | Safe | 8.48 kB | fix: correct model name to unsloth/gpt-oss-20b (no -instruct suffix) | 11 days ago
train_rfc005.py | Safe | 6.87 kB | feat: add W&B reward logging to both training scripts | 11 days ago