feat: implement Unsloth GRPO training script with diverse reward functions and logging d2449aa adityss committed on 24 days ago
Merge branch 'main' of https://github.com/LO-Kyu/gridmind c2917d2 ShreeshantXD committed on 24 days ago
fix: update training script with seed variation, fix reward normalization, regenerate training curves showing 0.52->0.67 improvement bdc9954 adityss committed on 24 days ago
feat: add scripts/full_demo.py - unified 10-step demo proving all 4 hackathon themes operational 5636c9d adityss committed on 24 days ago
fix: training reward uses 8-step rollout + /grade for genuine episode-level signal c70e17d adityss committed on 24 days ago
feat: commit training evidence, update README with real scores, add demo scripts 8204dc0 adityss committed on 24 days ago
feat: add baseline evaluation tools and demo scripts for RL performance comparison c395f6a adityss committed on 24 days ago
feat: add GridMind GRPO training notebook using Unsloth and HF TRL bdadba1 adityss committed on 24 days ago
feat: implement Go-based GridMind-RL simulation core and update inference interface (graph) a4671c4 Prajwal782007 committed on 24 days ago
feat: add multi-agent and planning CLI flags to inference and expose environment metadata via /info endpoint ebe8fa5 adityss committed on 24 days ago
feat: define GridMind-RL environment data models and task structures c009bc5 adityss committed on 24 days ago
feat: implement multi-component dense reward function and environmental logic for GridMind-RL b81683f adityss committed on 24 days ago
Add Task 4 instruction following, Curriculum Manager for self-improvement, and world modeling simulation 0af208b adityss committed on 27 days ago
fix: Replace ineffective break with return in WebSocket close handler d012f99 ShreeshantXD committed on 27 days ago
fix: introduce SCORE_EPSILON and clamp scores in run_episode and main functions b93cee3 adityss committed on Apr 7
fix: clamp scores after rounding and ensure all sub-scores are clamped e58b5ec ShreeshantXD committed on Apr 7
fix: clamp all scores to open interval (0, 1) to meet validator requirements ef0556b ShreeshantXD committed on Apr 7
fix: use golang:1.21 instead of alpine for better Docker registry compatibility 287d2a3 ShreeshantXD committed on Apr 7
refactor: update default model and API endpoint, enhance error handling, and add close method for compatibility 891cc5b adityss committed on Apr 7
fix: provide fallback API key and add safety check for empty observations in inference client fe2f8c9 adityss committed on Apr 7
Fix inference.py: handle missing API key gracefully, wrap all exceptions 9fd03cb ShreeshantXD committed on Apr 7
fix: add server entry point and pyproject scripts for OpenEnv validator 91cc891 ShreeshantXD committed on Apr 6
Add root landing page handler with links to dashboard and API endpoints 90c0e10 ShreeshantXD committed on Apr 5