v2.0: multi-step episodes, procedural bugs, semantic grading, sessions, 71 tests 703aa57 Siteshcodes committed on Apr 12
fix: add name/difficulty to tasks, per-task [START]/[END] logs for validator 7eb0325 Siteshcodes committed on Apr 10
fix: reward_range changed to (0.05, 0.95) to satisfy strict bounds 19cec11 Siteshcodes committed on Apr 9