Validation Checklist

Mandatory Hackathon Checks

OpenEnv Environment

openenv.yaml is valid
Environment starts via Docker
Required endpoints work: /reset, /step, /state, /tasks, /health
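The endpoint check can be scripted as a smoke test. The sketch below is illustrative only: the checklist names the paths but not the HTTP methods or payloads, so the method table (`POST` for `/reset` and `/step`, `GET` for the rest) and the empty JSON body are assumptions, and `check_endpoints` and `failing` are hypothetical helper names. A route counts as present if the server answers with anything other than 404 or a 5xx, so a validation error on an empty `/step` body still passes.

```python
import json
import urllib.error
import urllib.request

# Assumed HTTP methods; the checklist lists only the paths.
ENDPOINTS = {
    "/reset": "POST",
    "/step": "POST",
    "/state": "GET",
    "/tasks": "GET",
    "/health": "GET",
}


def http_status(url, method):
    """Return the HTTP status code for url, or None if the server is unreachable."""
    data = json.dumps({}).encode() if method == "POST" else None
    req = urllib.request.Request(
        url,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with a 4xx/5xx status
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timeout, DNS failure, ...


def check_endpoints(base_url, status_fn=http_status):
    """Map each required path to the status code it returned."""
    base = base_url.rstrip("/")
    return {path: status_fn(base + path, method) for path, method in ENDPOINTS.items()}


def failing(results):
    """Paths that were unreachable, missing (404), or failed server-side (5xx)."""
    return sorted(p for p, s in results.items() if s is None or s == 404 or s >= 500)
```

With the container running, `failing(check_endpoints("http://127.0.0.1:7860"))` should return an empty list if every route is served.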

Inference Reproducibility

python inference.py runs end-to-end
Output uses the [START], [STEP], and [END] markers
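The marker check can be automated by capturing the output of `python inference.py` and validating the marker order. This is a minimal sketch under stated assumptions: each marker begins its own line, an episode is `[START]`, then one or more `[STEP]` lines, then `[END]`, and several episodes may follow one another. The exact fields after each marker are not specified here, so the sketch ignores them. `marker_sequence` and `is_well_formed` are hypothetical names.

```python
MARKERS = ("[START]", "[STEP]", "[END]")


def marker_sequence(output):
    """Extract the markers from captured output, in order of appearance."""
    found = []
    for line in output.splitlines():
        stripped = line.lstrip()
        for marker in MARKERS:
            if stripped.startswith(marker):
                found.append(marker)
                break
    return found


def is_well_formed(output):
    """True if the output holds one or more complete [START], [STEP]+, [END] episodes."""
    state = "idle"  # idle -> started -> stepping -> idle
    episodes = 0
    for marker in marker_sequence(output):
        if marker == "[START]" and state == "idle":
            state = "started"
        elif marker == "[STEP]" and state in ("started", "stepping"):
            state = "stepping"
        elif marker == "[END]" and state == "stepping":
            state = "idle"
            episodes += 1
        else:
            return False  # marker out of order
    return state == "idle" and episodes > 0
```

Lines that do not start with a marker (warnings, progress bars) are skipped, so the check tolerates extra logging around the required lines.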

RL Training Pipeline (TRL/Unsloth)

Colab notebook runs: colab/PR_Review_GRPO_Training.ipynb
python train_grpo.py ... runs without API errors
Reward logs are produced
Reward curve image is produced
Before/after score table is produced

Training Artifacts

artifacts/<run>/logs/reward_history.csv
artifacts/<run>/logs/training_summary.json
artifacts/<run>/logs/before_after.md
artifacts/<run>/plots/reward_curve.png
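A short script can confirm that a run directory contains these four files. The sketch below assumes the run directory is the one passed to `--output-dir` (for example `artifacts/grpo_run`) and treats an empty file as missing; `missing_artifacts` is a hypothetical helper name.

```python
from pathlib import Path

# Relative paths taken from the checklist above.
EXPECTED_ARTIFACTS = (
    "logs/reward_history.csv",
    "logs/training_summary.json",
    "logs/before_after.md",
    "plots/reward_curve.png",
)


def missing_artifacts(run_dir):
    """Return the expected files that are absent or empty under run_dir."""
    root = Path(run_dir)
    missing = []
    for rel in EXPECTED_ARTIFACTS:
        path = root / rel
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(rel)
    return missing
```

An empty list from `missing_artifacts("artifacts/grpo_run")` means every artifact exists and is non-empty; it does not verify the contents.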

Storytelling Requirements

README explains problem, environment, rewards, and results
README links to the Hugging Face Space
README links to a mini-blog or a video under 2 minutes long

Quick Command Flow

docker build -t pr-review-env .
docker run --rm -p 7860:7860 pr-review-env
python inference.py
python train_grpo.py --env-base-url http://127.0.0.1:7860 --num-train-epochs 1 --output-dir artifacts/grpo_run
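When these commands are chained in a script, `inference.py` and `train_grpo.py` should only start once the container is accepting requests. The following sketch polls `/health` until it answers; it assumes a ready server returns HTTP 200 on that route, which the checklist does not state explicitly. `is_healthy` and `wait_until_ready` are hypothetical names.

```python
import time
import urllib.error
import urllib.request


def is_healthy(base_url):
    """True if GET <base_url>/health returns HTTP 200 (assumed readiness signal)."""
    try:
        with urllib.request.urlopen(base_url.rstrip("/") + "/health", timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False  # not up yet, or answered with an error status


def wait_until_ready(base_url, probe=is_healthy, attempts=30, delay=1.0, sleep=time.sleep):
    """Poll until probe succeeds; return the attempt number, or None on giving up."""
    for attempt in range(1, attempts + 1):
        if probe(base_url):
            return attempt
        if attempt < attempts:
            sleep(delay)  # back off before the next probe
    return None
```

A wrapper script could call `wait_until_ready("http://127.0.0.1:7860")` after starting the container and abort if it returns `None`.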