---
title: LLM PII Detection Leaderboard
emoji: 🥇
colorFrom: green
colorTo: indigo
sdk: gradio
app_file: app.py
pinned: true
license: apache-2.0
short_description: Duplicate this leaderboard to initialize your own!
sdk_version: 5.19.0
---
# LLM PII Detection Leaderboard
A benchmark for evaluating how well language models detect and handle personally identifiable information (PII) across a range of document types and scenarios.
## Features
- **Beautiful Modern UI**: Elegant dark theme with gradient styling and smooth animations
- **Comprehensive Metrics**: Precision, Recall, F1 Score, Over-detection Rate, Processing Time, and Cost
- **Domain-Specific Analysis**: Specialized evaluation across Healthcare, Financial, Government, Legal, and Personal documents
- **Performance Cards**: Professional model performance cards perfect for presentations and reports
- **Interactive Filtering**: Filter by model type, document type, and sort by any metric
- **Real-time Updates**: Dynamic table updates and score visualizations
## Quick Start
### Installation
```bash
git clone https://github.com/your-username/LLM-PII-Detection-Leaderboard.git
cd LLM-PII-Detection-Leaderboard
pip install -r requirements.txt
```
### Run the Application
```bash
python app.py
```
By default, Gradio serves the leaderboard at `http://localhost:7860`.
## Key Metrics
- **Overall Accuracy**: Percentage of correctly identified and classified PII entities
- **Precision**: Of all flagged items, how many were actually PII (avoiding false positives)
- **Recall**: Of all PII present, how many were successfully detected (avoiding false negatives)
- **F1 Score**: Harmonic mean balancing precision and recall
- **Over-detection Rate**: Percentage of non-PII incorrectly flagged (lower is better)
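
As a rough illustration of how these definitions relate, the sketch below computes them from confusion-matrix counts. It is not taken from the leaderboard's code: the function name `detection_metrics` is hypothetical, the counts are made-up placeholders, and the over-detection rate is assumed here to mean false positives divided by all non-PII items (the false positive rate). Overall accuracy is left out because its exact definition depends on how entity classification is scored.

```python
def detection_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Compute detection metrics from confusion-matrix counts.

    tp: PII items correctly flagged
    fp: non-PII items incorrectly flagged
    fn: PII items that were missed
    tn: non-PII items correctly left alone
    """
    def ratio(numerator: float, denominator: float) -> float:
        # Guard against empty denominators (e.g. a document with no PII).
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    # Harmonic mean of precision and recall.
    f1 = ratio(2 * precision * recall, precision + recall)
    # Assumption: over-detection rate = share of non-PII items that were flagged.
    over_detection = ratio(fp, fp + tn)

    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "over_detection_rate": over_detection,
    }


# Placeholder counts, for illustration only.
metrics = detection_metrics(tp=80, fp=20, fn=20, tn=880)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Lower is better only for `over_detection_rate`; the other three values improve as they approach 1.0.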
## Project Structure
```
LLM-PII-Detection-Leaderboard/
├── app.py              # Main application entry point
├── pii_leaderboard.py  # Core leaderboard functionality
├── data_loader.py      # Data loading and styling configuration
├── requirements.txt    # Python dependencies
└── README.md           # This file
```
## Design Philosophy
This leaderboard combines the slim architecture of agent-leaderboard with the beautiful design elements from DocumentProcessing Leaderboard Nutrient, featuring:
- **Minimal Dependencies**: Only essential packages (Gradio, Pandas, NumPy)
- **Clean Architecture**: Simple, maintainable code structure
- **Professional Styling**: Modern dark theme with custom color palette
- **Interactive Elements**: Score bars, rank badges, and performance cards
- **Responsive Design**: Works beautifully on all screen sizes
## Customization
### Adding New Models
Update the `sample_data` dictionary in `data_loader.py` with your model's performance metrics.
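
The exact layout of `sample_data` is not documented here, so the following is only a sketch under an assumption: that it is a dictionary mapping column names to equal-length lists. The column names, the `add_model` helper, and all numbers below are hypothetical placeholders; check `data_loader.py` for the real keys before adapting it.

```python
# Hypothetical stand-in for the `sample_data` dictionary in data_loader.py.
# The real keys and layout may differ; the values are placeholders.
sample_data = {
    "Model": ["model-a", "model-b"],
    "Precision": [0.91, 0.88],
    "Recall": [0.87, 0.90],
}


def add_model(data: dict[str, list], row: dict[str, object]) -> None:
    """Append one model's metrics, keeping every column the same length."""
    missing = set(data) - set(row)
    extra = set(row) - set(data)
    if missing or extra:
        raise ValueError(f"missing columns: {sorted(missing)}, unknown columns: {sorted(extra)}")
    for column in data:
        data[column].append(row[column])


add_model(sample_data, {"Model": "my-model", "Precision": 0.93, "Recall": 0.85})
print(sample_data["Model"])
```

Validating the column set before appending avoids a table where one column is longer than the others, which would fail when the dictionary is later turned into a DataFrame.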
### Changing Colors
Modify the `COLORS` dictionary in `data_loader.py` to customize the color scheme.
### Adding New Metrics
1. Add the metric to your data structure
2. Update the table generation in `pii_leaderboard.py`
3. Add appropriate styling and score bars
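
The steps above could be sketched roughly as follows. This is an illustration, not the code in `pii_leaderboard.py`: the helpers `score_bar` and `render_row`, the metric name `specificity`, the colors, and the values are all hypothetical. It assumes metrics are scores between 0 and 1 and that table rows are built as HTML strings.

```python
from html import escape


def score_bar(value: float, color: str = "#4f46e5") -> str:
    """Render a 0-1 score as a small inline HTML bar."""
    # Clamp so out-of-range values cannot overflow the bar.
    pct = max(0.0, min(1.0, value)) * 100
    return (
        '<div style="background:#1f2937;border-radius:4px;width:100%">'
        f'<div style="width:{pct:.1f}%;background:{escape(color)};'
        'border-radius:4px;height:8px"></div></div>'
    )


def render_row(model: str, metrics: dict[str, float]) -> str:
    """Build one table row: the model name plus a value and bar per metric."""
    cells = "".join(
        f"<td>{value:.3f}{score_bar(value)}</td>" for value in metrics.values()
    )
    return f"<tr><td>{escape(model)}</td>{cells}</tr>"


# Step 1: the new (hypothetical) metric is added to the data.
row_metrics = {"f1": 0.80, "specificity": 0.95}
# Steps 2 and 3: the row renderer picks it up and styles it with a score bar.
row_html = render_row("my-model", row_metrics)
print(row_html)
```

Because `render_row` iterates over whatever metrics it receives, a new column in this sketch only needs a matching header cell; a string like `row_html` could then be shown through Gradio's `gr.HTML` component.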
## Performance
The leaderboard currently evaluates 8 leading language models across:
- **5 Document Types**: Healthcare, Financial, Government, Legal, Personal
- **6 Key Metrics**: Accuracy, Precision, Recall, F1, Over-detection Rate, Cost & Time
- **Real-world Scenarios**: Synthetic industry documents with embedded PII
## Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Test thoroughly
5. Submit a pull request
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments
- Inspired by the elegant design of DocumentProcessing Leaderboard Nutrient
- Built with the slim architecture approach of agent-leaderboard
- Powered by Gradio for the beautiful web interface