---
title: AI Evaluation Toolkit
emoji: 🎯
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: true
short_description: RLHF rating, content policy scoring, obs/inference
---

AI Evaluation Toolkit

Interactive demos of the AI training data quality control workflows from github.com/LaelaZorana.

Three tools:

RLHF Pairwise Rater: rate AI responses on four axes, with a self-consistency check
Content Policy Rater: score text against a policy rubric, with per-criterion reasoning
Observation vs Inference: practice keeping observations free of conclusions
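
As a rough illustration of what a self-consistency check in pairwise rating can look like, here is a minimal sketch. It is not taken from this Space's `app.py`: the axis names, the 1-5 scale, the `PairwiseRating` class, and the `is_self_consistent` helper are all assumptions made for the example. The idea is that the same response pair is rated twice, and the two passes count as consistent when they prefer the same response.

```python
from dataclasses import dataclass

# Hypothetical axis names; the README only says there are four axes.
AXES = ("helpfulness", "accuracy", "safety", "clarity")


@dataclass(frozen=True)
class PairwiseRating:
    """One rating pass over a response pair: per-axis scores (assumed 1-5) for A and B."""
    scores_a: dict
    scores_b: dict

    def preferred(self) -> str:
        """Return 'A', 'B', or 'tie' by comparing total scores across all axes."""
        total_a = sum(self.scores_a[axis] for axis in AXES)
        total_b = sum(self.scores_b[axis] for axis in AXES)
        if total_a == total_b:
            return "tie"
        return "A" if total_a > total_b else "B"


def is_self_consistent(first: PairwiseRating, second: PairwiseRating) -> bool:
    """Two passes over the same pair are consistent if they prefer the same response."""
    return first.preferred() == second.preferred()


# Illustrative scores only, not real rating data.
first_pass = PairwiseRating(
    scores_a={"helpfulness": 4, "accuracy": 5, "safety": 5, "clarity": 3},
    scores_b={"helpfulness": 3, "accuracy": 4, "safety": 5, "clarity": 4},
)
second_pass = PairwiseRating(
    scores_a={"helpfulness": 4, "accuracy": 4, "safety": 5, "clarity": 4},
    scores_b={"helpfulness": 3, "accuracy": 4, "safety": 5, "clarity": 3},
)

print(first_pass.preferred(), second_pass.preferred(),
      is_self_consistent(first_pass, second_pass))
```

Comparing summed scores is only one possible rule. A real rater might instead compare per-axis preferences or require agreement within a tolerance.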

Built by Laela Zorana | HuggingFace | Kaggle