---
title: LLM PII Detection Leaderboard
emoji: 🥇
colorFrom: green
colorTo: indigo
sdk: gradio
app_file: app.py
pinned: true
license: apache-2.0
short_description: Duplicate this leaderboard to initialize your own!
sdk_version: 5.19.0
---

# LLM PII Detection Leaderboard

A comprehensive benchmark for evaluating language models' performance in detecting and handling personally identifiable information (PII) across various document types and scenarios.

## Features

- **Beautiful Modern UI**: Elegant dark theme with gradient styling and smooth animations
- **Comprehensive Metrics**: Precision, Recall, F1 Score, Over-detection Rate, Processing Time, and Cost
- **Domain-Specific Analysis**: Specialized evaluation across Healthcare, Financial, Government, Legal, and Personal documents
- **Performance Cards**: Professional model performance cards perfect for presentations and reports
- **Interactive Filtering**: Filter by model type, document type, and sort by any metric
- **Real-time Updates**: Dynamic table updates and score visualizations

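
As a rough illustration of the filtering and sorting behaviour, the sketch below filters result rows by model type and sorts them by a chosen metric. The row fields (`model`, `model_type`, `f1`, `precision`), the `filter_and_sort` helper, and all values are hypothetical; none of them are taken from `pii_leaderboard.py`.

```python
from typing import Optional

# Hypothetical rows with made-up scores; the real data may use different fields.
ROWS = [
    {"model": "model-a", "model_type": "Open source", "f1": 0.91, "precision": 0.93},
    {"model": "model-b", "model_type": "Proprietary", "f1": 0.95, "precision": 0.94},
    {"model": "model-c", "model_type": "Open source", "f1": 0.88, "precision": 0.97},
]


def filter_and_sort(rows, model_type: Optional[str] = None,
                    sort_by: str = "f1", descending: bool = True):
    """Keep rows matching model_type (None or "All" keeps every row),
    then sort the result by a single metric."""
    if model_type not in (None, "All"):
        rows = [r for r in rows if r["model_type"] == model_type]
    return sorted(rows, key=lambda r: r[sort_by], reverse=descending)


if __name__ == "__main__":
    for row in filter_and_sort(ROWS, model_type="Open source", sort_by="precision"):
        print(row["model"], row["precision"])
```

A document-type filter could be added the same way, as one more equality check before sorting.
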
## Quick Start

### Installation

```bash
git clone https://github.com/your-username/LLM-PII-Detection-Leaderboard.git
cd LLM-PII-Detection-Leaderboard
pip install -r requirements.txt
```

### Run the Application

```bash
python app.py
```

The leaderboard will be available at `http://localhost:7860`.

## Key Metrics

- **Overall Accuracy**: Percentage of PII entities that were correctly identified and classified
- **Precision**: Of all flagged items, the fraction that were actually PII (higher means fewer false positives)
- **Recall**: Of all PII present, the fraction that was successfully detected (higher means fewer false negatives)
- **F1 Score**: Harmonic mean of precision and recall
- **Over-detection Rate**: Percentage of non-PII items incorrectly flagged as PII (lower is better)

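
The definitions above can be sketched in a few lines of Python. This is a minimal sketch, not the leaderboard's actual scoring code: the `Counts` class and `compute_metrics` function are hypothetical names, and it assumes the over-detection rate is the standard false positive rate, FP / (FP + TN). Overall Accuracy is left out because it also depends on how entity classification is scored.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Entity-level confusion counts for one model on one document set."""
    tp: int  # PII entities correctly flagged
    fp: int  # non-PII items incorrectly flagged
    fn: int  # PII entities that were missed
    tn: int  # non-PII items correctly left alone


def safe_div(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def compute_metrics(c: Counts) -> dict:
    precision = safe_div(c.tp, c.tp + c.fp)
    recall = safe_div(c.tp, c.tp + c.fn)
    # Harmonic mean of precision and recall.
    f1 = safe_div(2 * precision * recall, precision + recall)
    # Assumption: over-detection rate = false positive rate.
    over_detection = safe_div(c.fp, c.fp + c.tn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "over_detection_rate": over_detection,
    }


if __name__ == "__main__":
    print(compute_metrics(Counts(tp=80, fp=20, fn=20, tn=180)))
```

Returning 0.0 for an empty denominator is a design choice for the sketch; a real evaluation might prefer to report such cases as undefined.
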
## Project Structure

```
LLM-PII-Detection-Leaderboard/
├── app.py                 # Main application entry point
├── pii_leaderboard.py     # Core leaderboard functionality
├── data_loader.py         # Data loading and styling configuration
├── requirements.txt       # Python dependencies
└── README.md              # This file
```

## Design Philosophy

This leaderboard combines the slim architecture of agent-leaderboard with the design elements of DocumentProcessing Leaderboard Nutrient. It features:

- **Minimal Dependencies**: Only essential packages (Gradio, Pandas, NumPy)
- **Clean Architecture**: Simple, maintainable code structure
- **Professional Styling**: Modern dark theme with custom color palette
- **Interactive Elements**: Score bars, rank badges, and performance cards
- **Responsive Design**: Layout adapts to different screen sizes

## Customization

### Adding New Models

Update the `sample_data` dictionary in `data_loader.py` with your model's performance metrics.

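
The exact layout of `sample_data` is not shown in this README, so the following is only a sketch under an assumed layout: one list per column, all of the same length. The column names, the scores, and the `add_model` helper are hypothetical; check `data_loader.py` for the real structure before adapting it.

```python
# Hypothetical layout: one list per column, all lists the same length.
sample_data = {
    "Model": ["example-model-a"],
    "Precision": [0.92],
    "Recall": [0.88],
    "F1 Score": [0.90],
}


def add_model(data: dict, entry: dict) -> None:
    """Append one model's metrics, refusing entries with missing or extra columns."""
    if set(entry) != set(data):
        raise ValueError(
            f"entry columns {sorted(entry)} do not match {sorted(data)}"
        )
    for column, value in entry.items():
        data[column].append(value)


# Made-up scores for an illustrative new model.
add_model(
    sample_data,
    {"Model": "my-new-model", "Precision": 0.95, "Recall": 0.91, "F1 Score": 0.93},
)
```

Checking the column set up front keeps the lists aligned, which matters if the dictionary is later turned into a table.
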
### Changing Colors

Modify the `COLORS` dictionary in `data_loader.py` to customize the color scheme.

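
A small sketch of what a palette override might look like. The keys (`background`, `accent`, `text`), the hex values, and the `with_overrides` helper are all hypothetical; the actual `COLORS` dictionary in `data_loader.py` may use different names.

```python
import re

# Hypothetical palette; the actual keys in data_loader.py may differ.
COLORS = {
    "background": "#0f172a",
    "accent": "#22c55e",
    "text": "#e2e8f0",
}

HEX_COLOR = re.compile(r"^#[0-9a-fA-F]{6}$")


def with_overrides(palette: dict, **overrides: str) -> dict:
    """Return a copy of palette with overrides applied.

    Rejects unknown keys and values that are not 6-digit hex colors.
    """
    for key, value in overrides.items():
        if key not in palette:
            raise KeyError(f"unknown color key: {key}")
        if not HEX_COLOR.match(value):
            raise ValueError(f"not a 6-digit hex color: {value}")
    return {**palette, **overrides}


# Swap the accent color while leaving the original palette untouched.
custom = with_overrides(COLORS, accent="#6366f1")
```
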
### Adding New Metrics

1. Add the metric to your data structure
2. Update the table generation in `pii_leaderboard.py`
3. Add appropriate styling and score bars

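
The three steps above might look roughly like this. It is a sketch only: `score_bar`, `render_row`, and the `redaction_rate` metric are hypothetical and not part of `pii_leaderboard.py`, and the markup is plain HTML of the kind Gradio's `gr.HTML` component can display.

```python
from html import escape


def score_bar(value: float, color: str = "#22c55e") -> str:
    """Render a 0-1 score as a simple HTML bar with a percentage label."""
    pct = max(0.0, min(1.0, value)) * 100  # clamp to the range [0, 100]
    return (
        f'<div style="background:#333;width:100px;display:inline-block">'
        f'<div style="background:{color};width:{pct:.0f}%;height:8px"></div>'
        f"</div> {pct:.1f}%"
    )


def render_row(row: dict, metrics: list) -> str:
    """Build one table row: the model name followed by one score bar per metric."""
    cells = [f"<td>{escape(str(row['model']))}</td>"]
    cells += [f"<td>{score_bar(row[m])}</td>" for m in metrics]
    return "<tr>" + "".join(cells) + "</tr>"


# Step 1: add the new metric (a hypothetical "redaction_rate") to the data.
row = {"model": "example-model", "f1": 0.90, "redaction_rate": 0.75}
# Steps 2 and 3: include it in the rendered columns, with its own score bar.
html_row = render_row(row, metrics=["f1", "redaction_rate"])
```

Clamping the value keeps a malformed score from breaking the bar layout.
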
## Performance

The leaderboard currently evaluates 8 leading language models across:

- **5 Document Types**: Healthcare, Financial, Government, Legal, Personal
- **7 Key Metrics**: Accuracy, Precision, Recall, F1 Score, Over-detection Rate, Processing Time, and Cost
- **Real-world Scenarios**: Synthetic industry documents with embedded PII

## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Test thoroughly
5. Submit a pull request

## License

This project is licensed under the MIT License; see the LICENSE file for details.

## Acknowledgments

- Inspired by the elegant design of DocumentProcessing Leaderboard Nutrient
- Built with the slim architecture approach of agent-leaderboard
- Powered by Gradio for the beautiful web interface