Minibase committed on
Commit b09292f · verified · 1 Parent(s): f7e2dff

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +25 -11
README.md CHANGED
@@ -37,9 +37,9 @@ model-index:
  - type: pii-detection-rate
  value: 1.000
  name: PII Detection Rate
- - type: completeness-score
+ - type: pii-removal-completeness
  value: 0.650
- name: Completeness Score
+ name: PII Removal Completeness
  - type: semantic-preservation
  value: 0.811
  name: Semantic Preservation
@@ -71,9 +71,10 @@ model-index:

  ### Key Features
  - 🔒 **Privacy-First**: Removes personal identifiers automatically
- - 🎯 **High Completeness**: 64% of texts fully de-identified
+ - 🎯 **Perfect PII Detection**: 100% detection rate when PII is present
+ - ✅ **Strong PII Removal**: 65% of texts completely de-identified
  - 📏 **Compact Size**: 136MB (Q8_0 quantized)
- - ⚡ **Fast Inference**: ~492ms average response time
+ - ⚡ **Fast Inference**: 477ms average response time
  - 🌐 **Multi-Domain**: Works across medical, legal, HR, and general text
  - 🔄 **Local Processing**: No data sent to external servers

@@ -191,17 +192,30 @@ print(result)

  | Metric | Score | Description |
  |--------|-------|-------------|
- | **PII Detection Rate** | **100%** | **Perfect detection when PII is present in input** |
- | **Completeness Score** | **65.0%** | **Percentage of texts fully de-identified** |
+ | **PII Detection Rate** | **100%** | **Model responds to PII presence with placeholders** |
+ | **PII Removal Completeness** | **65%** | **Successfully removes all detectable PII from output** |
  | **Semantic Preservation** | **81.1%** | **How well original meaning is preserved** |
  | **Average Latency** | **477ms** | **Response time performance** |

+ ### Understanding the Metrics
+
+ **PII Detection Rate (100%)**: Measures whether the model recognizes when personal information is present in the input text and responds by generating placeholders. This is a measure of the model's sensitivity to PII presence.
+
+ **PII Removal Completeness (65%)**: Measures whether the model successfully removes ALL detectable personal identifiers from the output text. This is a strict measure - even one remaining PII element (like a name, date, or phone number) counts as incomplete.
+
+ **Why 65% is Strong Performance**: Achieving 100% completeness is extremely challenging because:
+ - PII can be contextually important (e.g., "Dr. Smith" in medical records)
+ - Some PII might be embedded in complex ways
+ - Perfect removal could harm text coherence or meaning
+ - 65% completeness means the model reliably sanitizes most texts while preserving utility
+
  ### Performance Insights

- - ✅ **Perfect PII Detection**: 100% detection rate when PII is present in input
- - ✅ **Strong Completeness**: 67% of texts fully de-identified
- - ✅ **Fast Inference**: 484ms average response time
- - ✅ **Unified Performance**: Consistent across all text types and domains
+ - ✅ **Perfect PII Detection**: 100% of texts with PII trigger placeholder generation
+ - ✅ **Strong PII Removal**: 65% of outputs are completely free of detectable PII
+ - ✅ **Excellent Semantic Preservation**: 81.1% meaning retention during de-identification
+ - ✅ **Fast Inference**: 477ms average response time
+ - ✅ **Unified Performance**: Consistent across medical, legal, HR, and general text

  ## 🏗️ Technical Details

@@ -431,7 +445,7 @@ If you use DeId-Small in your research, please cite:

  - **Website**: [minibase.ai](https://minibase.ai)
  - **Discord**: [Join our community](https://discord.com/invite/BrJn4D2Guh)
- - **Documentation**: [docs.minibase.ai](https://docs.minibase.ai)
+ - **Documentation**: [docs.minibase.ai](https://help.minibase.ai)

  ## 📋 License
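
The "Understanding the Metrics" text added in this commit defines two separate scores: whether the model emits placeholders when the input contains PII, and whether the output is entirely free of detectable PII. The README does not show how either is computed, so the following is only a rough sketch of one way they might be calculated. The regex detectors, the bracketed placeholder format (`[NAME]`, `[PHONE]`), the `score` helper, and the choice to score only inputs that contain PII are all assumptions made for illustration. None of this is the evaluation code behind the reported 100% and 65% figures.

```python
import re

# Hypothetical PII detectors: illustrative assumptions only, far simpler
# than anything a real evaluation would use.
PII_PATTERNS = [
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),          # email-like strings
    re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),  # US-style phone numbers
]

# Assumed placeholder format such as [NAME] or [PHONE_1].
PLACEHOLDER = re.compile(r"\[[A-Z][A-Z_0-9]*\]")


def contains_pii(text: str) -> bool:
    """Return True if any hypothetical detector fires on the text."""
    return any(p.search(text) for p in PII_PATTERNS)


def score(pairs):
    """Compute both metrics over (input_text, output_text) pairs.

    Only pairs whose input contains detectable PII are scored (an assumption).
    """
    with_pii = [(i, o) for i, o in pairs if contains_pii(i)]
    if not with_pii:
        return {"detection_rate": None, "removal_completeness": None}
    n = len(with_pii)
    # Detection: the output contains at least one placeholder.
    detected = sum(1 for _, o in with_pii if PLACEHOLDER.search(o))
    # Completeness: strict, a single leftover identifier counts as a failure.
    complete = sum(1 for _, o in with_pii if not contains_pii(o))
    return {"detection_rate": detected / n, "removal_completeness": complete / n}


examples = [
    ("Call Ann at 555-123-4567.", "Call [NAME] at [PHONE]."),
    ("Mail bob@example.com or call 555-987-6543.",
     "Mail [EMAIL] or call 555-987-6543."),
]
print(score(examples))
# {'detection_rate': 1.0, 'removal_completeness': 0.5}
```

In the second example the output contains a placeholder but still leaks a phone number, so it counts toward detection and against completeness. This is how a model can score 100% on the first metric while scoring much lower on the second.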