---
title: LLM PII Detection Leaderboard
emoji: 🔒
colorFrom: green
colorTo: indigo
sdk: gradio
sdk_version: "5.19.0"
app_file: app.py
pinned: false
license: apache-2.0
short_description: Comprehensive benchmark for PII detection performance
---

# LLM PII Detection Leaderboard

A benchmark for evaluating how well language models detect and handle personally identifiable information (PII) across a range of document types and scenarios.

## Features

- Interactive leaderboard with performance metrics
- Domain-specific analysis across Healthcare, Financial, Government, Legal, and Personal documents
- Model performance comparison and filtering
- Professional performance cards for presentations

## Usage

The leaderboard reports the following metrics:

- Overall Accuracy
- Precision and Recall
- F1 Score
- Over-detection Rate
- Processing Time and Cost

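
As a rough illustration of how the quality metrics listed above relate to one another, the following sketch derives them from confusion-matrix counts. It is not the leaderboard's scoring code: the `Counts` class and the `pii_metrics` function are hypothetical names, and the definition of over-detection rate used here (the share of non-PII items that were flagged as PII) is an assumption, since this README does not define it.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Confusion-matrix counts for a binary PII / non-PII labelling task."""
    tp: int  # PII items correctly flagged
    fp: int  # non-PII items wrongly flagged as PII
    fn: int  # PII items that were missed
    tn: int  # non-PII items correctly left alone


def safe_div(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def pii_metrics(c: Counts) -> dict[str, float]:
    """Compute the quality metrics from raw counts."""
    precision = safe_div(c.tp, c.tp + c.fp)
    recall = safe_div(c.tp, c.tp + c.fn)
    return {
        "accuracy": safe_div(c.tp + c.tn, c.tp + c.fp + c.fn + c.tn),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
        # Assumption: over-detection rate = FP / (FP + TN), i.e. the
        # fraction of non-PII items that were flagged as PII.
        "over_detection_rate": safe_div(c.fp, c.fp + c.tn),
    }
```

Calling `pii_metrics(Counts(tp=8, fp=2, fn=2, tn=88))` returns a dictionary keyed by metric name. Processing time and cost are measured at run time and are not derived from these counts.
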
Filter by document type and model access to explore performance across different scenarios.

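
The filtering described above can be pictured with a minimal sketch like the one below. It does not reflect how `app.py` is actually written: the `Result` record, the `filter_results` helper, and the access values `"open"` and `"closed"` are all assumptions, and the rows are placeholders rather than real leaderboard results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    model: str
    access: str  # assumed values: "open" or "closed"
    domain: str  # e.g. "Healthcare", "Financial", "Legal"
    f1: float


def filter_results(rows, domain=None, access=None):
    """Keep rows matching domain and access (None means no filter),
    sorted by F1 score from highest to lowest."""
    kept = [
        r for r in rows
        if (domain is None or r.domain == domain)
        and (access is None or r.access == access)
    ]
    return sorted(kept, key=lambda r: r.f1, reverse=True)


# Placeholder rows for illustration only; not real leaderboard results.
ROWS = [
    Result("model-a", "open", "Healthcare", 0.50),
    Result("model-b", "closed", "Healthcare", 0.75),
    Result("model-a", "open", "Legal", 0.25),
    Result("model-c", "open", "Healthcare", 0.60),
]

for row in filter_results(ROWS, domain="Healthcare", access="open"):
    print(row.model, row.f1)
```

Leaving either argument as `None` skips that filter, which mirrors a dropdown left on "all".
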
## Contributing

Contributions are welcome! Submit your results in the Community section, along with a Google Colab notebook that demonstrates your approach.