feat: add GridMind-RL GRPO training notebook for industrial energy management 87ce30f adityss committed on 23 days ago
feat: implement Unsloth GRPO training pipeline with environment-backed reward functions and balanced dataset generation 27d3504 adityss committed on 23 days ago
fix: disable AMP for quantized models to avoid gradient scaler issues in GRPO training 19ba2eb adityss committed on 23 days ago
feat: update GRPO training configuration with additional parameters for logging and precision f3ecc94 adityss committed on 23 days ago
Adjust logging configuration for training: log every step, enable completion metrics, and limit completions printed per step. a6b45e9 adityss committed on 23 days ago
feat: add GridMind GRPO training notebook for multi-theme reinforcement learning 999605c Prajwal782007 committed on 23 days ago
feat: add GRPO training notebook for GridMind-RL environment 505323f Prajwal782007 committed on 23 days ago
feat: add GridMind GRPO training notebook for Meta PyTorch OpenEnv hackathon 29b9cd0 Prajwal782007 committed on 23 days ago
feat: add GridMind GRPO training notebook for industrial energy management environment 9d42d14 Prajwal782007 committed on 23 days ago
feat: implement GridMind-RL training pipeline with GRPO Colab notebook and Unsloth configuration script b0701ef Prajwal782007 committed on 23 days ago
feat: implement Unsloth GRPO training script with environment-based reward tracking and balanced dataset generation 32d5b8f Prajwal782007 committed on 23 days ago
feat: add GRPO training pipeline for GridMind-RL environment via Unsloth and TRL 26e9b86 Prajwal782007 committed on 23 days ago
feat: add submission validator script and GRPO training notebook, and update Python version requirement to >=3.10 7d89faf Prajwal782007 committed on 23 days ago
feat: add GridMind GRPO training environment and Unsloth training script 3d49e8a Prajwal782007 committed on 23 days ago
feat: add script to migrate max_new_tokens from GRPOConfig to GRPOTrainer in notebook 08731ee Prajwal782007 committed on 23 days ago
fix: change tokenizer to processing_class in GRPOTrainer acabf6c Prajwal782007 committed on 23 days ago
fix: order imports in Step 1 and add missing torch import in Step 7 c220c03 Prajwal782007 committed on 23 days ago
fix: move max_new_tokens from GRPOConfig to GRPOTrainer generation_kwargs dc14955 Prajwal782007 committed on 23 days ago
fix: enforce GPU usage and assertions in colab notebook 4738130 Prajwal782007 committed on 23 days ago
fix: update dependencies in colab notebook for GRPOTrainer 7597057 Prajwal782007 committed on 23 days ago
feat: update HF space URL, add judge demo scripts and project documentation a4bc605 Prajwal782007 committed on 23 days ago
fix: update health check endpoint in GridMind notebook and provide utility script to apply fix 18750f8 Prajwal782007 committed on 23 days ago
Add coordinator endpoint tests and project readiness verification script 88da572 adityss committed on 23 days ago
fix: update training script with seed variation, fix reward normalization, regenerate training curves showing 0.52->0.67 improvement bdc9954 adityss committed on 23 days ago
feat: add scripts/full_demo.py — unified 10-step demo proving all 4 hackathon themes operational 5636c9d adityss committed on 24 days ago
fix: training reward uses 8-step rollout + /grade for genuine episode-level signal c70e17d adityss committed on 24 days ago
feat: commit training evidence, update README with real scores, add demo scripts 8204dc0 adityss committed on 24 days ago
feat: add baseline evaluation tools and demo scripts for RL performance comparison c395f6a adityss committed on 24 days ago
feat: add GridMind GRPO training notebook using Unsloth and HF TRL bdadba1 adityss committed on 24 days ago
Add Task 4 instruction following, Curriculum Manager for self-improvement, and world modeling simulation 0af208b adityss committed on 26 days ago
feat: add OpenEnv submission validator script to check HF Space status and Docker build e82aa27 adityss committed on Apr 5