Commit History

fix: reduce grpo training runtime
842caac

Siddh12334 committed on

fix: parse chat completions in rewards
6f3d9d6

Siddh12334 committed on

fix: emit prompt column for grpo trainer
6ceec85

Siddh12334 committed on

fix: add grpo warnings compatibility attr
633d604

Siddh12334 committed on

fix: choose writable cache root at space startup
3125dc1

Siddh12334 committed on

fix: use writable runtime dirs in training space
eafb471

Siddh12334 committed on

fix: patch torchao dtype imports for unsloth
a262689

Siddh12334 committed on

fix: shim torch._pytree.register_constant for torchao compat
98317c2

Siddh12334 and Claude Sonnet 4.6 committed on

fix: verbose per-step logs and BaseException catch in _run_training
6fc6438

Siddh12334 and Claude Sonnet 4.6 committed on

fix: capture transformers/TRL/unsloth logs in Gradio UI
0b6be50

Siddh12334 and Claude Sonnet 4.6 committed on

fix: install unsloth in Docker image at build time
dee46e3

Siddh12334 and Claude Sonnet 4.6 committed on

fix: chunk-read pip output splitting on \r and \n
7ac7eb8

Siddh12334 and Claude Sonnet 4.6 committed on

feat: stream pip install output live to UI logs
d79940c

Siddh12334 and Claude Sonnet 4.6 committed on

fix: install unsloth to /app/pkgs to bypass HF Space permission issue
722cd66

Siddh12334 and Claude Sonnet 4.6 committed on

fix: robust unsloth install — PyPI first, git fallback, log errors
fa5785a

Siddh12334 and Claude Sonnet 4.6 committed on

feat: add A100 training Space with manual-trigger Gradio UI
77d8bcf

Siddh12334 and Claude Sonnet 4.6 committed on

feat: bulletproof GRPO training script + Colab notebook
67601e4

Siddh12334 committed on

feat: baseline eval and GRPO training script
5f54992

Siddh12334 committed on

feat: initial structure
bdb156f

Siddh12334 committed on