Spaces: jester1177/mutant-hunter-env

mutant-hunter-env / training

89 kB

3 contributors

History: 10 commits

Latest commit (Krishna1107): Drop GRPO temp to 0.3, bump max_new_tokens to 2048, add inference smoke test
576dfc3, about 1 month ago

data
    Add in-context demonstration learning support (about 1 month ago)
__init__.py (41 Bytes)
    Initial commit: MutantHunter — RL env for mutation-score-rewarded test generation (about 1 month ago)
baseline_eval.py (6.53 kB)
    Add HF Job training pipeline: persistence-aware run script, judge-facing demo notebook, baseline JSON output (about 1 month ago)
mine_demonstrations.py (19.6 kB)
    Add in-context demonstration learning support (about 1 month ago)
prompts.py (13.6 kB)
    Prompt fix: include full module source + grounding rule + corpus example; skip baseline recompute when cached (about 1 month ago)
smoke_grpo_inference.py (8.48 kB)
    Drop GRPO temp to 0.3, bump max_new_tokens to 2048, add inference smoke test (about 1 month ago)
smoke_reward_fn.py (3.55 kB)
    Fix GRPO reward routing: correct seed lookup + markdown fence stripping (about 1 month ago)
train_grpo.ipynb (7.94 kB)
    Fix verification gaps: add README links, rename blog, fix Colab badge, set author (about 1 month ago)
train_grpo.py (16.4 kB)
    Drop GRPO temp to 0.3, bump max_new_tokens to 2048, add inference smoke test (about 1 month ago)