---
title: LLM PII Detection Leaderboard
emoji: 🥇
colorFrom: green
colorTo: indigo
sdk: gradio
app_file: app.py
pinned: true
license: apache-2.0
short_description: Duplicate this leaderboard to initialize your own!
sdk_version: 5.19.0
---

# 🔒 LLM PII Detection Leaderboard

A benchmark for evaluating how well language models detect and handle personally identifiable information (PII) across a range of document types and scenarios.

## ✨ Features

- **Beautiful Modern UI**: Elegant dark theme with gradient styling and smooth animations
- **Comprehensive Metrics**: Precision, Recall, F1 Score, Over-detection Rate, Processing Time, and Cost
- **Domain-Specific Analysis**: Specialized evaluation across Healthcare, Financial, Government, Legal, and Personal documents
- **Performance Cards**: Professional model performance cards perfect for presentations and reports
- **Interactive Filtering**: Filter by model type, document type, and sort by any metric
- **Real-time Updates**: Dynamic table updates and score visualizations

## 🚀 Quick Start

### Installation

```bash
git clone https://github.com/your-username/LLM-PII-Detection-Leaderboard.git
cd LLM-PII-Detection-Leaderboard
pip install -r requirements.txt
```

### Run the Application

```bash
python app.py
```

The leaderboard will be available at `http://localhost:7860`.

## 📊 Key Metrics

- **Overall Accuracy**: Percentage of correctly identified and classified PII entities
- **Precision**: Of all flagged items, how many were actually PII (avoiding false positives)
- **Recall**: Of all PII present, how many were successfully detected (avoiding false negatives)
- **F1 Score**: Harmonic mean balancing precision and recall
- **Over-detection Rate**: Percentage of non-PII incorrectly flagged (lower is better)
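
The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the leaderboard's actual scoring code: the `DetectionCounts` container and `compute_metrics` function are hypothetical names, and it assumes that over-detection rate means the share of non-PII items that were flagged. Overall Accuracy is left out because its exact formula is not specified here. Values are returned as fractions in `[0, 1]`.

```python
from dataclasses import dataclass


@dataclass
class DetectionCounts:
    """Entity-level counts for one model on one document set (hypothetical container)."""
    true_positives: int   # PII entities correctly flagged
    false_positives: int  # non-PII items incorrectly flagged
    false_negatives: int  # PII entities the model missed
    true_negatives: int   # non-PII items correctly left alone


def safe_div(numerator: float, denominator: float) -> float:
    """Return 0.0 instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def compute_metrics(c: DetectionCounts) -> dict:
    """Compute precision, recall, F1, and over-detection rate as fractions."""
    precision = safe_div(c.true_positives, c.true_positives + c.false_positives)
    recall = safe_div(c.true_positives, c.true_positives + c.false_negatives)
    # F1 is the harmonic mean of precision and recall.
    f1 = safe_div(2 * precision * recall, precision + recall)
    # Assumption: over-detection rate = flagged non-PII / all non-PII.
    over_detection = safe_div(c.false_positives, c.false_positives + c.true_negatives)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "over_detection_rate": over_detection,
    }


if __name__ == "__main__":
    # Illustrative counts only, not real benchmark results.
    print(compute_metrics(DetectionCounts(8, 2, 2, 88)))
```

Guarding every division with `safe_div` keeps a model that flags nothing (or a document with no PII) from raising `ZeroDivisionError`; such cases score 0.0 instead.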

## 🏗️ Project Structure

```
LLM-PII-Detection-Leaderboard/
├── app.py                 # Main application entry point
├── pii_leaderboard.py     # Core leaderboard functionality
├── data_loader.py         # Data loading and styling configuration
├── requirements.txt       # Python dependencies
└── README.md              # This file
```

## 🎨 Design Philosophy

This leaderboard combines the slim architecture of agent-leaderboard with the beautiful design elements from DocumentProcessing Leaderboard Nutrient, featuring:

- **Minimal Dependencies**: Only essential packages (Gradio, Pandas, NumPy)
- **Clean Architecture**: Simple, maintainable code structure
- **Professional Styling**: Modern dark theme with custom color palette
- **Interactive Elements**: Score bars, rank badges, and performance cards
- **Responsive Design**: Works beautifully on all screen sizes

## 🔧 Customization

### Adding New Models

Update the `sample_data` dictionary in `data_loader.py` with your model's performance metrics.
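
As a rough sketch of what adding an entry might look like, the snippet below assumes `sample_data` maps a model name to a dictionary of metrics. That layout, the field names, and the `validate_entry` helper are all assumptions for illustration; check the actual structure in `data_loader.py` before copying anything. The numbers are placeholders, not measured results.

```python
# Hypothetical layout: the real `sample_data` in data_loader.py may differ.
REQUIRED_FIELDS = ("precision", "recall", "f1", "over_detection_rate")


def validate_entry(entry: dict) -> list:
    """Return a list of problems found in a model entry (empty if it looks valid)."""
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in entry:
            problems.append(f"missing field: {field}")
        elif not 0.0 <= entry[field] <= 1.0:
            problems.append(f"{field} must be a fraction in [0, 1]")
    return problems


sample_data = {}  # stand-in for the dictionary defined in data_loader.py

# Placeholder values for a made-up model, for illustration only.
new_entry = {
    "model_type": "open-source",
    "precision": 0.90,
    "recall": 0.85,
    "f1": 0.874,
    "over_detection_rate": 0.03,
}

issues = validate_entry(new_entry)
if not issues:
    sample_data["my-new-model"] = new_entry
else:
    print("Entry rejected:", issues)
```

Validating before inserting catches a missing or out-of-range metric early, before it reaches the table rendering code.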

### Changing Colors

Modify the `COLORS` dictionary in `data_loader.py` to customize the color scheme.

### Adding New Metrics

1. Add the metric to your data structure
2. Update the table generation in `pii_leaderboard.py`
3. Add appropriate styling and score bars
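
For step 3, a score bar can be rendered as a small HTML fragment. The helper below is a hypothetical sketch, not the function used in `pii_leaderboard.py`: the name `score_bar_html`, the inline styles, and the default color are assumptions. In practice the color would likely come from the `COLORS` dictionary.

```python
def score_bar_html(score: float, color: str = "#22c55e") -> str:
    """Render a fraction in [0, 1] as a horizontal bar with a percentage label."""
    # Clamp so out-of-range inputs cannot produce a bar wider than its track.
    clamped = max(0.0, min(1.0, score))
    width = round(clamped * 100, 1)
    return (
        '<div style="background:#333;border-radius:4px;width:100%">'
        f'<div style="background:{color};width:{width}%;height:8px;border-radius:4px"></div>'
        f"</div> {clamped:.1%}"
    )


if __name__ == "__main__":
    print(score_bar_html(0.5))
```

For a metric where lower is better, such as over-detection rate, consider passing `1 - value` or a different color so the bar length still reads as "longer is better".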

## 📈 Performance

The leaderboard currently evaluates 8 leading language models across:
- **5 Document Types**: Healthcare, Financial, Government, Legal, Personal
- **6 Key Metrics**: Accuracy, Precision, Recall, F1, Over-detection Rate, Cost & Time
- **Real-world Scenarios**: Synthetic industry documents with embedded PII

## 🀝 Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Test thoroughly
5. Submit a pull request

## 📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

## 🙏 Acknowledgments

- Inspired by the elegant design of DocumentProcessing Leaderboard Nutrient
- Built with the slim architecture approach of agent-leaderboard
- Powered by Gradio for the beautiful web interface