Commit History
adding unified inference script 763c746
added reward function normalization, hidden eval set 15673b8
added canonical evaluation harness, unified DIME index, deterministic replay guarantees 54da37b
Update codebase and ignore DIME.pdf and image_gen binaries c1ba9ce
restructuring e67cb1a
add 519c815
centralised things fda958c
added ui img 07a475c
readme updated 923d2f3
added favicon d4e6b57
added readme 3bc2a7b
frontend bug fixes 82151a2
final updates on ui 66aee67
slamm bug fix 1f7679c
fixing frontend f1982b6
added frontend and stimulation 32f20b5
fixing the reward triage 09fcbfb
Finetuning done GRPO b54ab02
enhancing the reward function and inference aed8337
updated the train py d146d4a
added train.py c7af6db
Finalized bulletproof inference.py with local/endpoint toggle 619a9cb
Train complete 37b4a1d
Train going good 015f7f7
Environment logs updated 0a818f5
Inference done aba3dc2
Updated unsloth train based on Naseer's improvements 839d170
The train has to be started on another machine 97c4b95
Naseer said so cc7dd29
Rewards learning I guess c973d65
enhancing the reward function and inference 8824d64
updated the train py d674ac5
added train.py 3b828fc
Finalized bulletproof inference.py with local/endpoint toggle ccdf967
Inferencing improved, but not ready for GRPO 017c68a
Added new tasks 6f6185f
Nithish Sri Ram committed on