Spaces: atharva-again / sakha

sakha / scripts

Commit History

final push

5a8662c
unverified

atharva-again committed 29 days ago

feat(grpo): enhance action response parsing by removing reasoning blocks and refining regex handling

5202cdc

Bemohit committed 29 days ago

feat(grpo): update max completion length and refine prompt handling for improved evaluation

ff1f7a0

Bemohit committed 29 days ago

feat(grpo): adjust training parameters and disable thinking mode for consistent action calls

8f1e9fc

Bemohit committed 29 days ago

feat(grpo): enhance training dynamics with new replay policies and update state steps

264ee3d

Bemohit committed 29 days ago

feat(grpo): update max sequence length and refine prompt formatting in training scripts

79bced7

Bemohit committed 30 days ago

refactor: remove SakhaEnvWrapper class and streamline reward function in GRPO training script

097c9e4

Bemohit committed 30 days ago

refactor: simplify GRPO training script by removing CLI presets and syncing with notebook implementation

5f139c7

Bemohit committed 30 days ago

feat(eval): consolidate eval_harness into eval_policies, add LLM policy support

fb58610
unverified

atharva-again committed about 1 month ago

refactor(eval): extract shared eval constants and policies into eval_common.py

51d5ddb
unverified

atharva-again committed about 1 month ago

feat(plots): add plot generation script and training evidence plots

fcdb8dd
unverified

atharva-again committed about 1 month ago

feat(train): add CLI args (learning-rate, batch-size) and full-shift defaults

f96fd88
unverified

atharva-again committed about 1 month ago

feat(eval): add reproducible eval harness with comprehensive metrics

fd5e667
unverified

atharva-again committed about 1 month ago

data(fixtures): capture pre-migration golden reward fixtures for parity testing

827cfe7
unverified

atharva-again committed about 1 month ago

chore: pre-merge cleanup for colab-training branch

509d302
unverified

atharva-again committed on Apr 22

feat: add GRPO training script with mode presets, eval split, checkpointing

fc4da82
unverified

atharva-again committed on Apr 21

scripts: add GRPO training demo for Sakha env

f221784
unverified

atharva-again committed on Apr 20

fix(eval_policies): ensure output directory exists before writing JSON

d337c5c
unverified

atharva-again committed on Apr 7

feat(inference): adapt to new action system

68162aa
unverified

atharva-again committed on Apr 5

initial cook

c3fa67d
unverified

atharva-again committed on Mar 28