fix: clamp reward to [0.01,0.99] so .2f never rounds to 0.00 or 1.00 59fd9d3 havinashpatil committed on 13 days ago
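The clamp described in this commit can be sketched as follows. This is a hypothetical reconstruction, not the repo's actual code: the function name `clamp_reward` is invented, but the bounds [0.01, 0.99] and the `.2f` rationale come straight from the commit message — any raw reward formatted with `"%.2f"` after this clamp lands in [0.01, 0.99], so it can never display as 0.00 or 1.00.

```python
def clamp_reward(r: float) -> float:
    """Keep the reward strictly inside [0.01, 0.99] so that
    formatting with '.2f' can never round to 0.00 or 1.00.
    (Name and signature are illustrative, not from the repo.)"""
    return max(0.01, min(0.99, r))

# Extremes are pulled inside the displayable range:
print(f"{clamp_reward(0.0001):.2f}")  # 0.01, not 0.00
print(f"{clamp_reward(0.9999):.2f}")  # 0.99, not 1.00
```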
Complete all tasks: Adaptive curriculum, GRPO, React frontend, LLM-as-a-judge a448db8 havinashpatil committed on 13 days ago
fix: reset task_id parsing, grader tuple crash fallback, and inference score output 646409d adityanaikhpt committed on about 1 month ago
Rewrite inference.py for strict OpenEnv parsing + add httpx eb60bd2 adityanaikhpt committed on about 1 month ago
Minimal patch: standalone proxy ping + reward clamped to (0,1) 74bfde0 adityanaikhpt committed on about 1 month ago
fix: use API_BASE_URL/API_KEY for LiteLLM proxy — always make API call (Phase 2) 51fdbe8 adityanaikhpt committed on about 1 month ago
fix: make inference.py crash-proof when OPENAI_API_KEY is missing (Phase 2) 1fe26af adityanaikhpt committed on about 1 month ago
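The "crash-proof" pattern this last commit describes — not letting a missing `OPENAI_API_KEY` environment variable raise at startup — could look something like the sketch below. The helper name `get_api_key` and the warning text are assumptions; only the environment variable name and the failure mode come from the commit message.

```python
import os
from typing import Optional


def get_api_key() -> Optional[str]:
    """Read OPENAI_API_KEY defensively. Returns None instead of
    raising KeyError when the variable is unset, so callers can
    degrade gracefully. (Helper name is illustrative, not from
    the repo's inference.py.)"""
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        print("warning: OPENAI_API_KEY is not set; skipping API calls")
        return None
    return key
```

The key point is `os.environ.get(...)` with a `None` fallback rather than `os.environ[...]`, which raises `KeyError` and would crash the script before it can report anything useful.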