Multi-Agentic / kaggle

Commit History

feat: add support for lowercase Hugging Face Space secrets

63726b6

Uddiii committed 27 days ago

Submission-ready: README, blog, training pipeline, baseline evidence, OpenEnv compliance

7a90355

Uddiii committed 27 days ago

fix(grpo): Unsloth inference-mode swap + smaller LoRA + KV cleanup (T4 OOM #2)

a3804d9

Uddiii committed 27 days ago

fix(grpo): per-step backward to bound VRAM during update (T4 OOM fix)

8f20926

Uddiii committed 27 days ago

feat(kaggle): add clean_launch.py + shrink budget to 20/25/30 = 75 eps

cd923aa

Uddiii committed 27 days ago

feat(kaggle): default to fixed-budget curriculum 20/30/50 episodes

69f89ec

Uddiii committed 27 days ago

fix(grpo): skip reference model when kl_beta=0 to save 5GB VRAM on T4

0566783

Uddiii committed 27 days ago

fix(kaggle): align pip-managed numpy with kernel's loaded numpy

27cf9cd

Uddiii committed 27 days ago

fix(kaggle): pin torch via constraints file in REPAIR cell

112679c

Uddiii committed 27 days ago

fix(kaggle): escape backslash-n in REPAIR cell separator print

04688c1

Uddiii committed 27 days ago

chore(kaggle): rebuild notebook v3 + clean dev-scratch files

2df5c63

Uddiii committed 27 days ago

fix(kaggle): unpin torch and loosen trl floor to prevent bnb/unsloth break

71a0a91

Uddiii committed 28 days ago

kaggle: unpin unsloth, install matched zoo pair

2e98419

Uddiii committed 28 days ago

kaggle: refresh cell-8 promotion timing for per-phase early-stop

c64ec55

Uddiii committed 28 days ago

train: per-phase reward thresholds for early-stop

d490143

Uddiii committed 28 days ago

kaggle: lower convergence bar to +1.2 reward (3.1x baseline P3)

13ae8dd

Uddiii committed 28 days ago

kaggle: route Patient + Nurse to 8B-instant pool

d8c3b18

Uddiii committed 28 days ago

kaggle: 8B Doctor + train-until-optimal early-stop

9c68ba6

Uddiii committed 28 days ago

