Commit 982012d (verified), committed by kriti0608 · 1 Parent(s): ef350e1

Update README.md

Files changed (1): README.md (+61 -7)
@@ -1,13 +1,67 @@
- FairEval is a human-aligned evaluation framework designed to measure fairness, toxicity, and alignment of generative models using:

- LLM-as-Judge scoring

- Toxicity + fairness analysis (Detoxify, per-category charts)

- Human agreement metrics (κ, ρ)

- Group-based fairness dashboard

- Model-level SQuAD EM/F1 & uncertainty

- This framework is built for rigorous Responsible AI analysis inspired by real-world industry evaluation pipelines.
+ # FairEval: Human-Aligned Evaluation for Generative Models

+ **Author:** Kriti Behl
+ **GitHub:** https://github.com/kritibehl/FairEval
+ **Paper (preprint):** _“FairEval: Human-Aligned Evaluation for Generative Models”_

+ FairEval is a lightweight research framework for evaluating LLM outputs beyond accuracy, focusing on:

+ - **LLM-as-Judge alignment scoring**
+ - **Toxicity / safety analysis**
+ - **Human agreement metrics (κ, ρ)**
+ - **Group-wise fairness dashboards**

+ It is designed as a **research tool**, not a deployment model.

+ ---

+ ## What this repo contains
+
+ This Hugging Face repo currently serves as a **model card + metadata hub** for:
+
+ - The **FairEval evaluation pipeline** (code on GitHub)
+ - A planned **Hugging Face Space demo** (UI built in Streamlit)
+ - Links to my **preprint** and **Medium explainer**
+
+ > **Code**: https://github.com/kritibehl/FairEval
+ > **Medium**: https://medium.com/@kriti0608/faireval-a-human-aligned-evaluation-framework-for-generative-models-d822bfd5c99d
+
+ ---
+
+ ## Capabilities
+
+ FairEval supports:
+
+ 1. **Rubric-based LLM-as-Judge scoring**
+    - Uses a structured rubric (`config/prompts/judge_rubric.md`) to score:
+      - coherence
+      - helpfulness
+      - factuality
+    - Returns **scalar scores** meant to track human preference (checked via Spearman ρ, see item 3).
+
+ 2. **Toxicity and safety metrics**
+    - Wraps a toxicity model (e.g., Detoxify) to compute:
+      - composite toxicity
+      - per-category scores (insult, threat, identity attack, etc.)
+    - Provides **Altair charts** for:
+      - toxicity breakdown by category
+      - toxicity distribution by demographic group
+
+ 3. **Human evaluation agreement**
+    - Ingests a `human_eval.csv` file with human ratings.
+    - Computes:
+      - **Fleiss’ κ** (inter-rater reliability)
+      - **Spearman ρ** between judge and human scores.
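
The two agreement metrics named in item 3 can be illustrated with a small, dependency-free sketch. This is not FairEval's own code: the function names `fleiss_kappa` and `spearman_rho` are hypothetical, the ratings are assumed to be already loaded into plain lists (the column layout of `human_eval.csv` is not specified here), and every item is assumed to have the same number of raters.

```python
from collections import Counter
from statistics import mean


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for items that were each rated by the same number of raters.

    ratings: list of per-item lists of category labels, e.g. [[1, 1, 2], [2, 2, 2]].
    Undefined (division by zero) when all ratings fall in a single category.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number of ratings")
    counts = [Counter(item) for item in ratings]
    # Share of all assignments that went to each category.
    p_cat = [sum(c[k] for c in counts) / (n_items * n_raters) for k in categories]
    # Observed pairwise agreement per item, averaged over items.
    p_obs = mean(
        (sum(c[k] ** 2 for k in categories) - n_raters) / (n_raters * (n_raters - 1))
        for c in counts
    )
    # Agreement expected by chance.
    p_exp = sum(p * p for p in p_cat)
    return (p_obs - p_exp) / (1 - p_exp)


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Made-up example data: three raters per item, and judge vs. mean human scores.
human_ratings = [[1, 1, 2], [1, 2, 2], [2, 2, 2], [1, 1, 1]]
judge_scores = [3.5, 2.0, 4.5, 1.0]
human_scores = [3.0, 2.5, 5.0, 1.5]

print("Fleiss kappa:", round(fleiss_kappa(human_ratings, categories=[1, 2]), 3))
print("Spearman rho:", round(spearman_rho(judge_scores, human_scores), 3))
```

In practice, library implementations such as `scipy.stats.spearmanr` would usually be preferred; the hand-written versions are shown only to make the definitions explicit.
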
+
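The group-wise toxicity view from item 2 boils down to a simple aggregation, sketched below under stated assumptions. The per-category scores are assumed to come from a toxicity model such as Detoxify; the record layout, the function names, and the choice of "maximum category score" as the composite are all hypothetical, since the composite rule FairEval actually uses is not stated here.

```python
from collections import defaultdict
from statistics import mean


def composite_toxicity(scores):
    """Assumed composite: the highest per-category score for one output."""
    return max(scores.values())


def toxicity_by_group(records):
    """Mean per-category and composite toxicity for each demographic group.

    records: iterable of dicts with a "group" label and a "scores" dict
    mapping category name -> score in [0, 1] (hypothetical layout).
    """
    per_group = defaultdict(lambda: defaultdict(list))
    for rec in records:
        bucket = per_group[rec["group"]]
        for category, score in rec["scores"].items():
            bucket[category].append(score)
        bucket["composite"].append(composite_toxicity(rec["scores"]))
    return {
        group: {category: mean(vals) for category, vals in cats.items()}
        for group, cats in per_group.items()
    }


def max_group_gap(summary, metric="composite"):
    """Largest difference in a metric between any two groups."""
    values = [stats[metric] for stats in summary.values()]
    return max(values) - min(values)


# Made-up scores for two groups; real values would come from the toxicity model.
records = [
    {"group": "A", "scores": {"insult": 0.25, "threat": 0.125, "identity_attack": 0.0}},
    {"group": "A", "scores": {"insult": 0.5, "threat": 0.125, "identity_attack": 0.0}},
    {"group": "B", "scores": {"insult": 0.125, "threat": 0.5, "identity_attack": 0.25}},
]
summary = toxicity_by_group(records)
for group, stats in sorted(summary.items()):
    print(group, stats)
print("composite gap:", max_group_gap(summary))
```

A per-group summary like this is the kind of table a toxicity-by-group chart would plot, and the gap between groups gives one rough number to watch alongside the chart.
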
+ ---
+
+ ## Example Usage
+
+ Clone the GitHub repo and run the Streamlit demo:
+
+ ```bash
+ git clone https://github.com/kritibehl/FairEval.git
+ cd FairEval
+ python3 -m venv .venv && source .venv/bin/activate
+ pip install -r requirements.txt
+ streamlit run demo/app.py