fix: clamp all task scores to strict (0,1) range - never 0.0 or 1.0

- grader.py: add _clamp() helper, apply to all sub-scores and total
- models.py: update Pydantic fields to gt/lt strict bounds
- inference.py: clamp avg_reward and final_score

Fixes Phase 2 'task scores out of range' validation error