Sentinel / training

Commit History

Update Colab notebook: 1.5B model, scaled rewards, tuned hyperparameters
ee8c2d4

nihalaninihal committed on

Align with Advanced Llama 3.2 GRPO LoRA reference notebook pattern
c7d253a

nihalaninihal Claude Opus 4.6 committed on

Fix format_comparison_metrics_html to accept run_comparison() dict directly
d52b449

nihalaninihal Claude Opus 4.6 committed on

Align train.py and Colab notebook with official Unsloth+OpenEnv GRPO patterns
e09a415

nihalaninihal Claude Opus 4.6 committed on

Update metrics format with drift/oversight tracking, add colab training notebook
5e0f2b1

nihalaninihal committed on