Duplicated from ayaan-ai/trainer

Hollow-Abyss/trainer

trainer / app.py

Commit History

Fix CUDA device-side assert by adjusting max_prompt_length and disabling use_cache

fe123ff

mayank1365 committed on Apr 26

Fix IndentationError and duplicate reward logic

0d40379

mayank1365 committed on Apr 26

Fix dtype mismatch in training and update blog

463c260

mayank1365 committed on Apr 26

fix: resolve tensor size mismatch in reward_function by indexing correctly

134ff83

mayank1365 committed on Apr 26

fix: resolve Gradio JSON error by passing dicts directly instead of serialized strings

731ed55

mayank1365 committed on Apr 26

fix: resolve 422 errors by sanitizing JSON and fix python syntax error

117c380

mayank1365 committed on Apr 26

fix: improve GRPO learning signal and handle 422 environment errors

f5df9dc

mayank1365 committed on Apr 26

feat: randomize number of facts (2-4) per episode for better training diversity

b676c5d

mayank1365 committed on Apr 26

fix: address reward plateau by adding format rewards and improving GRPO logic

b588360

mayank1365 committed on Apr 26

fix: resolve GPU detection issue on HF Spaces

ef2c4bf

mayank1365 committed on Apr 26

Update app.py

69d4d30
verified

ayaan-ai committed on Apr 25

Rename ap.py to app.py

01057a2
verified

ayaan-ai committed on Apr 25