Naseer-010/DIME

Commit History

refining dict-typing in inference

0baed2c

Naseer-010 committed 2 days ago

adding unified inference script

763c746

Naseer-010 committed 2 days ago

added reward function normalization, hidden eval set

15673b8

Naseer-010 committed 11 days ago

added canonical evaluation harness, unified DIME index, deterministic replay guarantees

54da37b

Naseer-010 committed 11 days ago

Update codebase and ignore DIME.pdf and image_gen binaries

c1ba9ce

Naseer-010 committed 23 days ago

restructuring

e67cb1a

Naseer-010 committed 23 days ago

add

519c815

shivi3743892y8 committed 23 days ago

centralised things

fda958c

shivi3743892y8 committed 23 days ago

added ui img

07a475c

shivi3743892y8 committed 23 days ago

readme updated

923d2f3

Naseer-010 committed 23 days ago

added favicon

d4e6b57

shivi3743892y8 committed 23 days ago

added readme

3bc2a7b

shivi3743892y8 committed 23 days ago

frontend bug fixes

82151a2

Naseer-010 committed 23 days ago

final updates on ui

66aee67

shivi3743892y8 committed 23 days ago

slamm bug fix

1f7679c

Naseer-010 committed 23 days ago

fixing frontend

f1982b6

Naseer-010 committed 23 days ago

added frontend and simulation

32f20b5

shivi3743892y8 committed 23 days ago

fixing the reward triage

09fcbfb

Naseer-010 committed 23 days ago

Finetuning done GRPO

b54ab02

iimnithish committed 23 days ago

enhancing the reward function and inference

aed8337

Naseer-010 committed 23 days ago

updated the train py

d146d4a

shivi3743892y8 committed 23 days ago

added train.py

c7af6db

shivi3743892y8 committed 23 days ago

Finalized bulletproof inference.py with local/endpoint toggle

619a9cb

Naseer-010 committed 23 days ago

Train complete

37b4a1d

iimnithish committed 23 days ago

Train going good

015f7f7

iimnithish committed 23 days ago

Environment logs updated

0a818f5

iimnithish committed 23 days ago

Inference done

aba3dc2

iimnithish committed 23 days ago

Updated unsloth train based on naseers improvements

839d170

iimnithish committed 23 days ago

The train has to be started on another machine

97c4b95

iimnithish committed 23 days ago

Naseer said so

cc7dd29

iimnithish committed 23 days ago

Rewards learning I guess

c973d65

iimnithish committed 23 days ago

enhancing the reward function and inference

8824d64

Naseer-010 committed 23 days ago

updated the train py

d674ac5

shivi3743892y8 committed 23 days ago

added train.py

3b828fc

shivi3743892y8 committed 23 days ago

Finalized bulletproof inference.py with local/endpoint toggle

ccdf967

Naseer-010 committed 23 days ago

Inferencing improved, but not ready for GRPO

017c68a

iimnithish committed 23 days ago

Added new tasks

6f6185f

Nithish Sri Ram committed 24 days ago

Deleted previous html

5a3731b

Naseer-010 committed 23 days ago

added links

05ef187

Naseer-010 committed 23 days ago

hf configured

c5e0cb4

Naseer-010 committed 23 days ago

Update codebase and ignore DIME.pdf and image_gen binaries

facabc7

Naseer-010 committed 23 days ago

chore: update codebase and fix gitignore

6a2eb8a

Naseer-010 committed 24 days ago

updating inference

d9ea08b

Naseer-010 committed 24 days ago

restructuring logs

6ae496c

Naseer-010 committed 24 days ago

structured logs storing

1f56aa6

Naseer-010 committed 24 days ago

storing the logs

07f12a4

Naseer-010 committed 24 days ago

refined the reward algos

402a30e

Naseer-010 committed 24 days ago

updating reward function from step to gradient

8ff9fe4

Naseer-010 committed 24 days ago

added alibaba trace data

2ba6413

Naseer-010 committed 24 days ago

adding rubrics, updating tasks, validators

f51115b

Naseer-010 committed 24 days ago
