metadata
title: AI Evaluation Toolkit
emoji: 🎯
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: true
short_description: RLHF rating, content policy scoring, obs/inference
AI Evaluation Toolkit
Interactive demos of the AI training data quality control workflows from github.com/LaelaZorana.
Three tools:
- RLHF Pairwise Rater: rate pairs of AI responses on four axes, with a self-consistency check on your ratings
- Content Policy Rater: score text against a policy rubric, with reasoning given for each criterion
- Observation vs Inference: practice keeping observations free of conclusions
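
As a rough illustration of what a self-consistency check on pairwise ratings can look like, here is a minimal sketch. It is not taken from the app's code. It assumes the check works by showing the same pair a second time with the positions swapped and flagging axes where the two ratings disagree. The axis names and the function `inconsistent_axes` are hypothetical; the toolkit only states that there are four axes.

```python
# Hypothetical axis names: the README says "4 axes" but does not name them.
AXES = ("helpfulness", "accuracy", "safety", "clarity")


def flip(label: str) -> str:
    """Map a positional preference to the opposite position; ties stay ties."""
    return {"A": "B", "B": "A"}.get(label, label)


def inconsistent_axes(first: dict, swapped: dict) -> list:
    """Return the axes where two ratings of the same pair contradict each other.

    `first` holds per-axis preferences ("A", "B", or "tie") for the original
    presentation. `swapped` holds preferences for the same pair shown with
    positions exchanged, so a consistent rater picks the opposite position.
    """
    return [axis for axis in AXES if flip(swapped[axis]) != first[axis]]


first = {"helpfulness": "A", "accuracy": "A", "safety": "tie", "clarity": "B"}
swapped = {"helpfulness": "B", "accuracy": "A", "safety": "tie", "clarity": "A"}
flagged = inconsistent_axes(first, swapped)
print(flagged)
```

Here the rater preferred position A on accuracy in both presentations, which means they chose a different response each time, so only that axis is flagged. Comparing per axis rather than on an overall verdict keeps the feedback specific to the judgment that drifted.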
Built by Laela Zorana | HuggingFace | Kaggle