# FairEval: Human-Aligned Evaluation for Generative Models

**Author:** Kriti Behl
**GitHub:** https://github.com/kritibehl/FairEval
**Paper (preprint):** _“FairEval: Human-Aligned Evaluation for Generative Models”_

FairEval is a lightweight research framework for evaluating LLM outputs beyond accuracy, focusing on:

- **LLM-as-Judge alignment scoring**
- **Toxicity / safety analysis**
- **Human agreement metrics (κ, ρ)**
- **Group-wise fairness dashboards**

It is designed as a **research tool**, not a deployment model.

---

## What this repo contains

This Hugging Face repo currently serves as a **model card + metadata hub** for:

- The **FairEval evaluation pipeline** (code on GitHub)
- A planned **Hugging Face Space demo** (UI built in Streamlit)
- Links to my **preprint** and **Medium explainer**

> **Code**: https://github.com/kritibehl/FairEval
> **Medium**: https://medium.com/@kriti0608/faireval-a-human-aligned-evaluation-framework-for-generative-models-d822bfd5c99d

---

## Capabilities

FairEval supports the following (each capability is sketched in code after the list):

1. **Rubric-based LLM-as-Judge scoring**
   - Uses a structured rubric (`config/prompts/judge_rubric.md`) to score:
     - coherence
     - helpfulness
     - factuality
   - Returns **scalar scores** that correlate with human preference.

2. **Toxicity and safety metrics**
   - Wraps a toxicity model (e.g., Detoxify) to compute:
     - composite toxicity
     - per-category scores (insult, threat, identity attack, etc.)
   - Provides **Altair charts** for:
     - toxicity breakdown by category
     - toxicity distribution by demographic group

3. **Human evaluation agreement**
   - Ingests a `human_eval.csv` file with human ratings.
   - Computes:
     - **Fleiss’ κ** (inter-rater reliability)
     - **Spearman ρ** between judge and human scores.
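
The rubric scoring flow might look roughly like the sketch below. This is a minimal illustration, not FairEval's actual code: the `judge_score` helper, the 1-5 scale, the JSON reply format, and the OpenAI client used as the judge backend are all assumptions; only the rubric path comes from the repo layout above.

```python
import json
from pathlib import Path

from openai import OpenAI  # illustrative judge backend; FairEval may use a different client

DIMENSIONS = ["coherence", "helpfulness", "factuality"]

def judge_score(prompt: str, response: str, model: str = "gpt-4o-mini") -> dict:
    """Hypothetical helper: ask a judge model for 1-5 rubric scores as JSON."""
    rubric = Path("config/prompts/judge_rubric.md").read_text()  # rubric shipped in the repo
    request = (
        f"{rubric}\n\nPrompt:\n{prompt}\n\nResponse:\n{response}\n\n"
        f"Reply with only a JSON object mapping each of {DIMENSIONS} to a 1-5 score."
    )
    reply = OpenAI().chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": request}],
    )
    scores = json.loads(reply.choices[0].message.content)
    return {dim: float(scores[dim]) for dim in DIMENSIONS}
```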
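
The toxicity metrics can be reproduced with Detoxify, which this README names as an example backend. The snippet below is a minimal sketch; the Altair bar chart stands in for the dashboard's real charts.

```python
import altair as alt
import pandas as pd
from detoxify import Detoxify

# Detoxify's pretrained "original" checkpoint downloads on first use.
scores = Detoxify("original").predict([
    "What a thoughtful answer!",
    "Only an idiot would say that.",
])  # dict: category -> list of per-text scores (toxicity, insult, threat, ...)

# Per-category breakdown for the second (toxic) text, as a simple Altair bar chart.
df = pd.DataFrame({"category": list(scores), "score": [v[1] for v in scores.values()]})
chart = alt.Chart(df).mark_bar().encode(x="category:N", y="score:Q")
chart.save("toxicity_breakdown.html")
```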
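
The agreement metrics are standard statistics available off the shelf. The sketch below assumes a hypothetical `human_eval.csv` layout (columns `rater_1`..`rater_3` for 1-5 human ratings plus `judge_score`); the actual schema is defined in the GitHub repo.

```python
import pandas as pd
from scipy.stats import spearmanr
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Hypothetical layout: one row per item, one column per human rater.
df = pd.read_csv("human_eval.csv")
ratings = df[["rater_1", "rater_2", "rater_3"]].to_numpy()  # shape: (n_items, n_raters)

# Fleiss' kappa needs an items x categories count table; aggregate_raters builds it.
table, _ = aggregate_raters(ratings)
print("Fleiss' kappa:", fleiss_kappa(table))

# Spearman's rho between the LLM judge and the mean human rating per item.
rho, p = spearmanr(df["judge_score"], ratings.mean(axis=1))
print(f"Spearman's rho: {rho:.3f} (p = {p:.3f})")
```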

---

## Example Usage

Check out the GitHub repo and run the Streamlit demo:

```bash
git clone https://github.com/kritibehl/FairEval.git
cd FairEval
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
streamlit run demo/app.py
```