---
license: apache-2.0
language:
- en
pipeline_tag: text-classification
tags:
- privacy
- content-moderation
- classifier
- electra
datasets:
- custom
metrics:
- accuracy
model-index:
- name: privacy-classifier-electra
  results:
  - task:
      type: text-classification
      name: Privacy Classification
    metrics:
    - type: accuracy
      value: 0.9968
      name: Validation Accuracy
widget:
- text: "My social security number is 123-45-6789"
  example_title: "Sensitive (SSN)"
- text: "The weather is nice today"
  example_title: "Safe"
- text: "My password is hunter2"
  example_title: "Sensitive (Password)"
- text: "I like pizza"
  example_title: "Safe"
---

# Privacy Classifier (ELECTRA)

A fine-tuned ELECTRA model for detecting sensitive or private information in text.

## Model Description

This model classifies text as either **safe** or **sensitive**, helping identify content that may contain private information such as:

- Social security numbers
- Passwords and credentials
- Financial account numbers
- Personal health information
- Home addresses
- Phone numbers

### Base Model

- **Architecture**: [google/electra-base-discriminator](https://huggingface.co/google/electra-base-discriminator)
- **Parameters**: ~110M
- **Task**: Binary text classification

## Training Details

| Parameter | Value |
|-----------|-------|
| Epochs | 5 |
| Validation Accuracy | **99.68%** |
| Training Hardware | NVIDIA RTX 5090 (32GB) |
| Framework | PyTorch + Transformers |

### Labels

- `safe` (0): Content does not contain sensitive information
- `sensitive` (1): Content may contain private or sensitive information

## Usage

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="jonmabe/privacy-classifier-electra")

# Examples
result = classifier("My SSN is 123-45-6789")
# [{'label': 'sensitive', 'score': 0.99...}]

result = classifier("The meeting is at 3pm")
# [{'label': 'safe', 'score': 0.99...}]
```
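The pipeline's score can also drive a confidence threshold. The helper below is an illustrative sketch, not part of this model's API: it fails closed by treating a low-confidence "safe" verdict as sensitive, which is usually the right default for a privacy filter. The function name and threshold value are assumptions for the example.

```python
def is_sensitive(result, threshold=0.8):
    """Fail-closed check over pipeline-style output.

    `result` is the list of dicts returned by the transformers pipeline,
    e.g. classifier(text). A "safe" verdict below `threshold` is still
    treated as sensitive.
    """
    top = result[0]
    if top["label"] == "sensitive":
        return True
    return top["score"] < threshold

print(is_sensitive([{"label": "safe", "score": 0.99}]))       # False
print(is_sensitive([{"label": "safe", "score": 0.55}]))       # True
print(is_sensitive([{"label": "sensitive", "score": 0.97}]))  # True
```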

### Direct Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("jonmabe/privacy-classifier-electra")
model = AutoModelForSequenceClassification.from_pretrained("jonmabe/privacy-classifier-electra")

text = "My credit card number is 4111-1111-1111-1111"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    outputs = model(**inputs)

prediction = torch.argmax(outputs.logits, dim=-1)
label = "sensitive" if prediction.item() == 1 else "safe"
print(f"Classification: {label}")
```

## Intended Use

- **Primary Use**: Pre-screening text before logging, storage, or transmission
- **Use Cases**:
  - Filtering sensitive content from logs
  - Flagging potential PII in user-generated content
  - Privacy-aware content moderation
  - Data loss prevention (DLP) systems
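As a sketch of the log-filtering use case, the wrapper below redacts lines a classifier flags as sensitive. The function names and the toy stand-in classifier are invented for illustration; in practice you would inject the `transformers` pipeline shown under Usage.

```python
def filter_log_lines(lines, classify, placeholder="[REDACTED]"):
    """Replace lines the classifier flags as sensitive before storage.

    `classify` is any callable returning pipeline-style output:
    a list of {"label": ..., "score": ...} dicts.
    """
    filtered = []
    for line in lines:
        verdict = classify(line)[0]
        filtered.append(placeholder if verdict["label"] == "sensitive" else line)
    return filtered

# Toy stand-in classifier for demonstration only; in practice use
# pipeline("text-classification", model="jonmabe/privacy-classifier-electra").
def toy_classify(text):
    label = "sensitive" if "password" in text.lower() else "safe"
    return [{"label": label, "score": 1.0}]

print(filter_log_lines(["user logged in", "password is hunter2"], toy_classify))
# ['user logged in', '[REDACTED]']
```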

## Limitations

- Trained primarily on English text
- May not catch all forms of sensitive information
- Should be used as one layer in a defense-in-depth approach
- Not a substitute for proper data handling policies
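To illustrate the defense-in-depth point: a cheap regex pass for well-known PII shapes can run alongside the model, and the text is flagged if either layer fires. The patterns and names below are illustrative only, not an exhaustive or production-ready PII detector.

```python
import re

# Illustrative patterns for a regex pre-filter layer (not exhaustive).
PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),   # US SSN shape
    re.compile(r"\b(?:\d[ -]?){13,16}\b"),  # card-number-like digit runs
]

def flag_sensitive(text, model_says_sensitive):
    """Flag text if either the regex layer or the model layer fires."""
    regex_hit = any(p.search(text) for p in PII_PATTERNS)
    return regex_hit or model_says_sensitive

print(flag_sensitive("SSN: 123-45-6789", model_says_sensitive=False))  # True
```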

## Training Data

Custom dataset combining:

- Synthetic examples of sensitive patterns (SSN, passwords, etc.)
- Safe text samples from various domains
- Balanced classes for robust classification
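As a hypothetical sketch of how synthetic sensitive examples like those above might be generated (this is not the actual training-data pipeline; the templates and helpers are invented for illustration):

```python
import random

def synth_ssn(rng):
    # Generate an SSN-shaped string; values are random, not real SSNs.
    return f"{rng.randint(100, 899):03d}-{rng.randint(10, 99):02d}-{rng.randint(1000, 9999):04d}"

def make_examples(n, rng=None):
    """Produce (text, label) pairs from a few illustrative templates."""
    rng = rng or random.Random(0)  # fixed seed for reproducibility
    templates = [
        "My social security number is {ssn}",
        "SSN on file: {ssn}",
    ]
    return [(rng.choice(templates).format(ssn=synth_ssn(rng)), "sensitive")
            for _ in range(n)]

for text, label in make_examples(2):
    print(label, "->", text)
```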

## Citation

```bibtex
@misc{privacy-classifier-electra,
  author    = {jonmabe},
  title     = {Privacy Classifier based on ELECTRA},
  year      = {2026},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/jonmabe/privacy-classifier-electra}
}
```