Commit History

chore: remove unused configuration file
60eb0d6

ayhm23 committed on

Phase 3: Stabilized GRPO training and fixed model collapse. Reduced LR to 5e-7, added KL penalty (beta=0.04), and implemented English coherence guard. Final evaluation shows 90%+ success in social engineering refusals.
bf3dcd6

ayhm23 committed on

changes i dont know
5778e7e

sanyamvermaa committed on

feat(env): scaffold OpenEnv environment - Person A Phase 1
a1288df

sanyamvermaa committed on

directory structure setup
0f6b81e

Puskara committed on