Fix label count (76 not 109) and entity count consistency
README.md
CHANGED
@@ -63,7 +63,7 @@ widget:
 
 - **Dutch-Optimized**: Specifically trained on Dutch text for optimal performance
 - **High Accuracy**: Achieves strong F1 scores across diverse PII categories
-- **Comprehensive Coverage**: Detects
+- **Comprehensive Coverage**: Detects 54 entity types spanning personal, financial, medical, and contact information
 - **Privacy-Focused**: Designed for de-identification and compliance with GDPR and other privacy regulations
 - **Production-Ready**: Optimized for real-world text processing pipelines
 
@@ -263,7 +263,7 @@ with torch.no_grad():
 
 - **Source**: [AI4Privacy PII Masking 400k](https://huggingface.co/datasets/ai4privacy/pii-masking-400k) (Dutch subset)
 - **Format**: BIO-tagged token classification
-- **Labels**: 
+- **Labels**: 76 total (54 B-tags + 21 I-tags + O)
 
 ### Training Configuration
 
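
The corrected count can be sanity-checked with a short script. The following is a minimal sketch, not part of the commit: `summarize_bio_labels` and the toy label list are hypothetical names introduced for illustration. It assumes labels follow the usual `B-<TYPE>` / `I-<TYPE>` / `O` naming; for the real model, the label list would presumably come from the values of the model config's `id2label` mapping.

```python
from collections import Counter


def summarize_bio_labels(labels):
    """Count B-, I-, and O tags and the distinct entity types in a BIO label list.

    Hypothetical helper: assumes every label is "O" or "<prefix>-<TYPE>".
    """
    # The prefix is everything before the first hyphen ("B", "I", or "O").
    prefixes = Counter(label.split("-", 1)[0] for label in labels)
    # The entity type is everything after the first hyphen.
    entity_types = {label.split("-", 1)[1] for label in labels if "-" in label}
    return {
        "total": len(labels),
        "B": prefixes["B"],
        "I": prefixes["I"],
        "O": prefixes["O"],
        "entity_types": len(entity_types),
    }


# Toy label set for illustration only (not the model's real labels).
toy_labels = ["O", "B-EMAIL", "B-GIVENNAME", "I-GIVENNAME", "B-CITY", "I-CITY"]
summary = summarize_bio_labels(toy_labels)
print(summary)

# The README's stated breakdown: 54 B-tags + 21 I-tags + 1 O tag.
assert 54 + 21 + 1 == 76
```

If every one of the 54 entity types had both a B-tag and an I-tag, the total would be 54 * 2 + 1 = 109, which matches the figure the commit title replaces. The stated total of 76 is consistent with only 21 types having an I-tag.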