0x960 / train

Commit History

feat: finalize swarm tooling and submission artifacts

eac9d9f

qtzx06 committed on Mar 8

feat: add rollout observability — JSONL logs + per-step action/reward printing

b0b9657

qtzx06 committed on Mar 8

fix: load 4-bit model manually before passing to GRPOTrainer

7cf0d25

qtzx06 committed on Mar 8

fix: use QLoRA (4-bit + LoRA) for Qwen3.5-9B training

93a63c7

qtzx06 committed on Mar 8

fix: add LoRA + gradient checkpointing to fit 9B on single H100

e83a908

qtzx06 committed on Mar 8

fix: drop vLLM colocate (version conflict), use native GRPO generation

eafdbce

qtzx06 committed on Mar 8

feat: rewrite training to use TRL rollout_func + OpenEnv multi-turn pattern

93f58fd

qtzx06 committed on Mar 8

feat: add --mode infer for Qwen inference test, default to Qwen3.5-9B

0b5e8b0

qtzx06 committed on Mar 8

feat: fix openenv 0.2.1 API, add deployment files and GRPO training

ea3bbb3

qtzx06 committed on Mar 8

feat: add openenv wrapper and training stub

eb29dc8

qtzx06 committed on Mar 8

chore: add project scaffolding

165e54c

qtzx06 committed on Mar 8