ChargeBackOps / training

Commit History

feat(training): outcome-based RLVR reward + clean Colab T4 notebook
1f49d52

mitudrudutta committed on

feat(training): SFT dataset + stall detection in eval rollout
02a6a9f

mitudrudutta committed on

fix(training): per-action reward scoring vs heuristic oracle
243aa68

mitudrudutta committed on

feat: enhance completion parsing to handle truncated JSON and `<think>` blocks
71f1fe0

mitudrudutta committed on

feat: add per-family evaluation and plotting for training curves
a79d430

mitudrudutta committed on

feat: Add training curve evaluation and plotting utilities with unit tests
8fe3b35

pauldebanshu19 committed on

Add training notebook and benchmark runner for ChargebackOps
bd00c06

pauldebanshu19 committed on