sakha / scripts

Commit History

final push
5a8662c
unverified

atharva-again committed on

feat(grpo): enhance action response parsing by removing reasoning blocks and refining regex handling
5202cdc

Bemohit committed on

feat(grpo): update max completion length and refine prompt handling for improved evaluation
ff1f7a0

Bemohit committed on

feat(grpo): adjust training parameters and disable thinking mode for consistent action calls
8f1e9fc

Bemohit committed on

feat(grpo): enhance training dynamics with new replay policies and update state steps
264ee3d

Bemohit committed on

feat(grpo): update max sequence length and refine prompt formatting in training scripts
79bced7

Bemohit committed on

refactor: remove SakhaEnvWrapper class and streamline reward function in GRPO training script
097c9e4

Bemohit committed on

refactor: simplify GRPO training script by removing CLI presets and syncing with notebook implementation
5f139c7

Bemohit committed on

feat(eval): consolidate eval_harness into eval_policies, add LLM policy support
fb58610
unverified

atharva-again committed on

refactor(eval): extract shared eval constants and policies into eval_common.py
51d5ddb
unverified

atharva-again committed on

feat(plots): add plot generation script and training evidence plots
fcdb8dd
unverified

atharva-again committed on

feat(train): add CLI args (learning-rate, batch-size) and full-shift defaults
f96fd88
unverified

atharva-again committed on

feat(eval): add reproducible eval harness with comprehensive metrics
fd5e667
unverified

atharva-again committed on

data(fixtures): capture pre-migration golden reward fixtures for parity testing
827cfe7
unverified

atharva-again committed on

chore: pre-merge cleanup for colab-training branch
509d302
unverified

atharva-again committed on

feat: add GRPO training script with mode presets, eval split, checkpointing
fc4da82
unverified

atharva-again committed on

scripts: add GRPO training demo for Sakha env
f221784
unverified

atharva-again committed on

fix(eval_policies): ensure output directory exists before writing JSON
d337c5c
unverified

atharva-again committed on

feat(inference): adapt to new action system
68162aa
unverified

atharva-again committed on