AntiAtropos / training / openenv_loop.py

Commit History

prompt fixes
8c4ef5c

div18 committed on

prompt changes
8425a53

div18 committed on

prompts
c157063

div18 committed on

fix CPU tensors
e408dca

div18 committed on

fix bug
69f37e9

div18 committed on

reward etc tuning
67810ba

div18 committed on

env changes
70cdeae

div18 committed on

training changes
7dbb622

div18 committed on

entropy spread
d23c9c4

div18 committed on

OOM
0f6141d

div18 committed on

fixes
d222529

div18 committed on

fixes
46cd5c4

div18 committed on

fix backprop
871c1ae

div18 committed on

changes
d41d25d

div18 committed on

fix structure
fa8da3f

div18 committed on

edits
619e74d

div18 committed on

fix: disable Qwen thinking at Jinja template level with enable_thinking=False
5edb1ce

div18 committed on

fix: disable Qwen thinking at Jinja template level with enable_thinking=False
65788cc

div18 committed on

fix: add /no_think to system prompts, strip TRACE blocks from model output
9fd06fa

div18 committed on

fixes
75c8df1

div18 committed on

code
e890160

div18 committed on