# FairEval: Human-Aligned Evaluation for Generative Models
**Author:** Kriti Behl
**GitHub:** https://github.com/kritibehl/FairEval
**Paper (preprint):** _“FairEval: Human-Aligned Evaluation for Generative Models”_
FairEval is a lightweight research framework for evaluating LLM outputs beyond accuracy — focusing on:
- **LLM-as-Judge alignment scoring**
- **Toxicity / safety analysis**
- **Human agreement metrics (κ, ρ)**
- **Group-wise fairness dashboards**
It is designed as a **research tool**, not a deployment model.
---
## What this repo contains
This Hugging Face repo currently serves as a **model card + metadata hub** for:
- The **FairEval evaluation pipeline** (code on GitHub)
- A planned **Hugging Face Space demo** (UI built in Streamlit)
- Links to my **preprint** and **Medium explainer**.
> **Code**: https://github.com/kritibehl/FairEval
> **Medium**: https://medium.com/@kriti0608/faireval-a-human-aligned-evaluation-framework-for-generative-models-d822bfd5c99d
---
## Capabilities
FairEval supports:
1. **Rubric-based LLM-as-Judge scoring**
- Uses a structured rubric (`config/prompts/judge_rubric.md`) to score:
- coherence
- helpfulness
- factuality
- Returns **scalar scores** that correlate with human preference.
2. **Toxicity and safety metrics**
- Wraps a toxicity model (e.g., Detoxify) to compute:
- composite toxicity
- per-category scores (insult, threat, identity attack, etc.)
- Provides **Altair charts** for:
- toxicity breakdown by category
- toxicity distribution by demographic group
3. **Human evaluation agreement**
- Ingests a `human_eval.csv` file with human ratings.
- Computes:
- **Fleiss’ κ** (inter-rater reliability)
- **Spearman ρ** between judge and human scores.
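The rubric-based scoring above can be illustrated with a minimal sketch. The model call itself is stubbed out here; the rubric text, the reply format, and the `parse_judge_scores` helper are hypothetical stand-ins (the real rubric lives in `config/prompts/judge_rubric.md`):

```python
# Sketch of rubric-based LLM-as-Judge scoring. The judge reply below is a
# hypothetical example; a real pipeline would obtain it from an LLM call.
import re

RUBRIC = """Rate the response on coherence, helpfulness, and factuality,
each from 1 to 5. Answer as: coherence=<n> helpfulness=<n> factuality=<n>"""

def parse_judge_scores(reply: str) -> dict:
    """Extract scalar scores from a judge reply in the rubric's format."""
    pairs = re.findall(r"(coherence|helpfulness|factuality)=(\d)", reply)
    return {name: int(val) for name, val in pairs}

# Hypothetical judge reply:
reply = "coherence=4 helpfulness=5 factuality=3"
scores = parse_judge_scores(reply)
overall = sum(scores.values()) / len(scores)  # simple scalar aggregate
```

The scalar `overall` value is what would then be correlated against human preference scores.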
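The group-wise toxicity breakdown can be sketched as follows. The per-category scores here are hypothetical stand-ins for a toxicity model's output (Detoxify returns a similar dict of per-category probabilities per text), and the max-over-categories composite is one simple aggregation choice, not necessarily the one FairEval uses:

```python
# Sketch of group-wise toxicity aggregation over hypothetical category scores.
from statistics import mean

def composite_toxicity(scores: dict) -> float:
    """Composite toxicity taken as the max across per-category scores."""
    return max(scores.values())

# Hypothetical per-text category scores, keyed by demographic group.
records = [
    {"group": "A", "scores": {"insult": 0.10, "threat": 0.02, "identity_attack": 0.01}},
    {"group": "A", "scores": {"insult": 0.40, "threat": 0.05, "identity_attack": 0.03}},
    {"group": "B", "scores": {"insult": 0.05, "threat": 0.60, "identity_attack": 0.02}},
]

by_group = {}
for r in records:
    by_group.setdefault(r["group"], []).append(composite_toxicity(r["scores"]))

# Mean composite toxicity per group -- the quantity a fairness dashboard plots.
group_means = {g: mean(v) for g, v in by_group.items()}
```

A table like `group_means` is what the Altair charts would visualize per category and per group.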
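The two agreement metrics can be sketched directly. This is a minimal illustration, assuming a small hand-made count matrix rather than a parsed `human_eval.csv`; the Fleiss' κ implementation follows the standard formula, and Spearman ρ comes from `scipy.stats.spearmanr`:

```python
# Sketch of the human-agreement metrics: Fleiss' kappa over a rating-count
# matrix, and Spearman rho between judge scores and mean human scores.
import numpy as np
from scipy.stats import spearmanr

def fleiss_kappa(ratings: np.ndarray) -> float:
    """Fleiss' kappa for an (items x categories) count matrix,
    where each row sums to the number of raters."""
    n = ratings.sum(axis=1)[0]                   # raters per item
    p_j = ratings.sum(axis=0) / ratings.sum()    # overall category proportions
    p_i = (np.square(ratings).sum(axis=1) - n) / (n * (n - 1))
    p_bar, p_e = p_i.mean(), np.square(p_j).sum()
    return (p_bar - p_e) / (1 - p_e)

# Hypothetical data: 4 items, 3 raters, 3 rating categories (counts per category).
counts = np.array([[3, 0, 0], [0, 3, 0], [1, 2, 0], [0, 1, 2]])
kappa = fleiss_kappa(counts)

# Hypothetical judge vs. mean-human scores for the same 4 items.
judge = [4.0, 3.5, 2.0, 1.0]
human = [4.2, 3.0, 2.5, 1.1]
rho, _ = spearmanr(judge, human)
```

Here the rankings agree perfectly, so ρ is 1.0, while κ reflects the moderate rater agreement in the count matrix.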
---
## Example Usage
Check out the GitHub repo and run the Streamlit demo:
```bash
git clone https://github.com/kritibehl/FairEval.git
cd FairEval
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
streamlit run demo/app.py
```