Commit History

refining dict-typing in inference
0baed2c

Naseer-010 committed on

adding unified inference script
763c746

Naseer-010 committed on

added reward function normalization, hidden eval set
15673b8

Naseer-010 committed on

added canonical evaluation harness, unified DIME index, deterministic replay guarantees
54da37b

Naseer-010 committed on

Update codebase and ignore DIME.pdf and image_gen binaries
c1ba9ce

Naseer-010 committed on

restructuring
e67cb1a

Naseer-010 committed on

readme updated
923d2f3

Naseer-010 committed on

frontend bug fixes
82151a2

Naseer-010 committed on

slamm bug fix
1f7679c

Naseer-010 committed on

fixing frontend
f1982b6

Naseer-010 committed on

added frontend and simulation
32f20b5

shivi3743892y8 committed on

fixing the reward triage
09fcbfb

Naseer-010 committed on

Finetuning done GRPO
b54ab02

iimnithish committed on

enhancing the reward function and inference
aed8337

Naseer-010 committed on

Finalized bulletproof inference.py with local/endpoint toggle
619a9cb

Naseer-010 committed on

Train complete
37b4a1d

iimnithish committed on

Train going good
015f7f7

iimnithish committed on

Environment logs updated
0a818f5

iimnithish committed on

Inference done
aba3dc2

iimnithish committed on

Updated unsloth train based on Naseer's improvements
839d170

iimnithish committed on

The train has to be started on another machine
97c4b95

iimnithish committed on

Naseer said so
cc7dd29

iimnithish committed on

Rewards learning I guess
c973d65

iimnithish committed on

enhancing the reward function and inference
8824d64

Naseer-010 committed on

Finalized bulletproof inference.py with local/endpoint toggle
ccdf967

Naseer-010 committed on

Inferencing improved, but not ready for GRPO
017c68a

iimnithish committed on

Added new tasks
6f6185f

Nithish Sri Ram committed on

Deleted previous html
5a3731b

Naseer-010 committed on

added links
05ef187

Naseer-010 committed on

hf configured
c5e0cb4

Naseer-010 committed on

Update codebase and ignore DIME.pdf and image_gen binaries
facabc7

Naseer-010 committed on

chore: update codebase and fix gitignore
6a2eb8a

Naseer-010 committed on

updating inference
d9ea08b

Naseer-010 committed on

restructuring logs
6ae496c

Naseer-010 committed on

structured logs storing
1f56aa6

Naseer-010 committed on

storing the logs
07f12a4

Naseer-010 committed on

refined the reward algos
402a30e

Naseer-010 committed on

updating reward function from step to gradient
8ff9fe4

Naseer-010 committed on

added alibaba trace data
2ba6413

Naseer-010 committed on

adding rubrics, updating tasks, validators
f51115b

Naseer-010 committed on