---
title: AI Evaluation Toolkit
emoji: 🎯
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: true
short_description: RLHF rating, content policy scoring, obs/inference
---

# AI Evaluation Toolkit

Interactive demos of the AI training-data quality-control workflows from [github.com/LaelaZorana](https://github.com/LaelaZorana).

Three tools:
1. RLHF Pairwise Rater: rate AI responses on four axes, with a self-consistency check
2. Content Policy Rater: score text against a policy rubric, with per-criterion reasoning
3. Observation vs Inference: practice keeping observations free of conclusions
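
As a rough illustration of what a self-consistency check on pairwise ratings can look like, here is a minimal sketch. It is not taken from `app.py`: the axis names, the `PairwiseRating` class, and the swapped-order approach are all assumptions made for the example. The idea is to rate the same pair twice, the second time with the responses shown in swapped order, and count how often the two passes agree.

```python
from dataclasses import dataclass

# Hypothetical axis names: this README does not list the toolkit's four axes.
AXES = ("helpfulness", "accuracy", "safety", "clarity")


@dataclass(frozen=True)
class PairwiseRating:
    """One pass over a response pair: each axis maps to 'A', 'B', or 'tie'."""
    preferences: dict


def flip(choice: str) -> str:
    """Translate a choice made on the swapped presentation back to the original labels."""
    return {"A": "B", "B": "A", "tie": "tie"}[choice]


def consistency_report(first: PairwiseRating, swapped: PairwiseRating) -> dict:
    """Compare the first pass with a second pass shown in swapped order.

    Returns per-axis agreement flags and the overall agreement rate.
    """
    per_axis = {
        axis: first.preferences[axis] == flip(swapped.preferences[axis])
        for axis in AXES
    }
    return {
        "per_axis": per_axis,
        "agreement_rate": sum(per_axis.values()) / len(AXES),
    }


# Example: the rater is consistent on every axis except accuracy.
first = PairwiseRating(
    {"helpfulness": "A", "accuracy": "A", "safety": "tie", "clarity": "B"}
)
swapped = PairwiseRating(
    {"helpfulness": "B", "accuracy": "A", "safety": "tie", "clarity": "A"}
)
report = consistency_report(first, swapped)
print(report["agreement_rate"])  # 0.75
```

A low agreement rate under this scheme would suggest position bias or unstable judgments on the flagged axes; the actual tool may define and measure consistency differently.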

Built by [Laela Zorana](https://github.com/LaelaZorana) \| [HuggingFace](https://huggingface.co/LaelaZ) \| [Kaggle](https://kaggle.com/laelazorana)