Commit History

Remove unused 3B follow-up chart
e06e0ce
verified

rishavutk committed on

Reframe 3B follow-up as trial run
0553162
verified

rishavutk committed on

Add 3B GRPO follow-up evidence
c89a7a9
verified

rishavutk committed on

Align warehouse GRPO prompt with SFT
0ac1e98
verified

rishavutk committed on

Fix adapter eval for 3B notebooks
c66fcbe
verified

rishavutk committed on

Refresh 3B notebook eval path
83bf1ef
verified

rishavutk committed on

Refresh training notebook clone step
ea9a79c
verified

rishavutk committed on

Expose generation batch size in 3B notebook
9592f16
verified

rishavutk committed on

Expose GRPO generation batch size
87004db
verified

rishavutk committed on

Add 4bit SFT training knobs
eee84e5
verified

rishavutk committed on

Add 4bit GRPO training knobs
bbb6307
verified

rishavutk committed on

Tune 3B notebook for Colab GPU
491993a
verified

rishavutk committed on

Allow configurable SFT base model
6d2f03a
verified

rishavutk committed on

Allow configurable eval base model
bfc3f5b
verified

rishavutk committed on

Allow configurable GRPO base model
130e9e4
verified

rishavutk committed on

Add 3B GRPO Colab notebook
6f3e2d3
verified

rishavutk committed on

Make it more Colab friendly
c0a893e
verified

rishavutk committed on

Fix training notebook smoke test
cbf66d6
verified

rishavutk committed on

Remove Colab folder from Space
67adddd
verified

rishavutk committed on

Embed evidence plots in README
82d40c1
verified

rishavutk committed on

Improve blog narrative opening
2114f64
verified

rishavutk committed on

Embed evidence plots in blog
a8cf751
verified

rishavutk committed on

Upload blog evidence plots
56a3cf9
verified

rishavutk committed on

Add playable demo links
f39ab13

Rishav committed on

Refine SupplyMind blog evidence
4054708

Rishav committed on

Add SupplyMind blog draft
cc7a029

Rishav committed on

Add runnable training notebook
00ebacd

Rishav committed on

Remove submission plots from HF package
e069cc9

Rishav committed on

Clean SupplyMind submission for HF
c186723

Rishav committed on

Pin latest OpenEnv runtime
32b510c

Rishav committed on

Add final eval set
0af60f8

Rishav committed on

Add conservative warehouse SFT evidence
cb5e5bf

Rishav committed on

Bias center SFT toward action states
939eba0

Rishav committed on

Tighten role training scaffold
a2144da

Rishav committed on

Add eval action diagnostics
ff949e6

Rishav committed on

Add held-out improvement visuals
d211fba

Rishav committed on

Split dashboard curves by training phase
7579e54

Rishav committed on

Add SFT warm start training pipeline
be8d222

Rishav committed on

Add training dashboard UI
8258b94

Rishav committed on

Add HF adapter evaluation job
45cc878

Rishav committed on

Disable Trackio checkpoint sync
aa7d416

Rishav committed on

Tune GRPO completion length
cce2fba

Rishav committed on

Make GRPO config version tolerant
15084c8

Rishav committed on

Fix HF training config resolution
d77a60c

Rishav committed on

Harden HF training startup
1cd6456

Rishav committed on

Add role-specific training scores
37049ad

Rishav committed on

Add training progress logs
a7160ed

Rishav committed on

Fix training smoke task id
5ca7724

Rishav committed on

Add HF role GRPO training job
3f1eabc

Rishav committed on

Align center reward with service profit
f058a59

Rishav committed on