---
license: apache-2.0
language:
- en
pipeline_tag: text-classification
tags:
- privacy
- content-moderation
- classifier
- electra
datasets:
- custom
metrics:
- accuracy
model-index:
- name: privacy-classifier-electra
  results:
  - task:
      type: text-classification
      name: Privacy Classification
    metrics:
    - type: accuracy
      value: 0.9968
      name: Validation Accuracy
widget:
- text: "My social security number is 123-45-6789"
  example_title: "Sensitive (SSN)"
- text: "The weather is nice today"
  example_title: "Safe"
- text: "My password is hunter2"
  example_title: "Sensitive (Password)"
- text: "I like pizza"
  example_title: "Safe"
---
# Privacy Classifier (ELECTRA)
A fine-tuned ELECTRA model for detecting sensitive/private information in text.
## Model Description
This model classifies text as either **safe** or **sensitive**, helping identify content that may contain private information like:
- Social security numbers
- Passwords and credentials
- Financial account numbers
- Personal health information
- Home addresses
- Phone numbers
### Base Model
- **Architecture**: [google/electra-base-discriminator](https://huggingface.co/google/electra-base-discriminator)
- **Parameters**: ~110M
- **Task**: Binary text classification
## Training Details
| Parameter | Value |
|-----------|-------|
| Epochs | 5 |
| Validation Accuracy | **99.68%** |
| Training Hardware | NVIDIA RTX 5090 (32GB) |
| Framework | PyTorch + Transformers |
### Labels
- `safe` (0): Content does not contain sensitive information
- `sensitive` (1): Content may contain private/sensitive information
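As a minimal sketch of how these labels relate to the model's output (assuming the standard two-logit classification head, with index 0 = `safe` and index 1 = `sensitive` as above), the raw logits become probabilities via a softmax; the logit values here are hypothetical:

```python
import math

# Hypothetical logits for one input: index 0 = "safe", index 1 = "sensitive"
logits = [-2.1, 3.4]

# Softmax turns logits into probabilities that sum to 1
exps = [math.exp(x) for x in logits]
total = sum(exps)
probs = [e / total for e in exps]

id2label = {0: "safe", 1: "sensitive"}
label = id2label[probs.index(max(probs))]
print(label, round(max(probs), 4))
```

This is the same mapping the `pipeline` call below performs internally when it reports a label and a score.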
## Usage
```python
from transformers import pipeline
classifier = pipeline("text-classification", model="jonmabe/privacy-classifier-electra")
# Examples
result = classifier("My SSN is 123-45-6789")
# [{'label': 'sensitive', 'score': 0.99...}]
result = classifier("The meeting is at 3pm")
# [{'label': 'safe', 'score': 0.99...}]
```
### Direct Usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("jonmabe/privacy-classifier-electra")
model = AutoModelForSequenceClassification.from_pretrained("jonmabe/privacy-classifier-electra")

text = "My credit card number is 4111-1111-1111-1111"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    outputs = model(**inputs)

prediction = torch.argmax(outputs.logits, dim=-1)
label = "sensitive" if prediction.item() == 1 else "safe"
print(f"Classification: {label}")
```
## Intended Use
- **Primary Use**: Pre-screening text before logging, storage, or transmission
- **Use Cases**:
- Filtering sensitive content from logs
- Flagging potential PII in user-generated content
- Privacy-aware content moderation
- Data loss prevention (DLP) systems
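For the log-filtering use case, one possible pattern is a small wrapper that redacts any line the classifier flags. The wrapper below is a sketch: it accepts any callable with the pipeline's output shape, so the `transformers` pipeline from the Usage section can be passed in directly; the keyword-based `fake_classify` here is only a stand-in for illustration, not the real model:

```python
def scrub_log_line(line, classify, placeholder="[REDACTED]"):
    """Replace a log line with a placeholder when the classifier flags it.

    `classify` is any callable returning [{'label': ..., 'score': ...}],
    e.g. the transformers pipeline shown in the Usage section.
    """
    result = classify(line)[0]
    return placeholder if result["label"] == "sensitive" else line

# Stand-in classifier for illustration only (keyword match, not the model)
def fake_classify(text):
    flagged = "password" in text.lower() or "ssn" in text.lower()
    return [{"label": "sensitive" if flagged else "safe", "score": 1.0}]

print(scrub_log_line("User logged in at 3pm", fake_classify))
print(scrub_log_line("password=hunter2 submitted", fake_classify))
```

Swapping `fake_classify` for the real pipeline gives a drop-in pre-screen before lines reach storage.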
## Limitations
- Trained primarily on English text
- May not catch all forms of sensitive information
- Should be used as one layer in a defense-in-depth approach
- Not a substitute for proper data handling policies
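In a defense-in-depth setup, the classifier can be paired with a cheap regex first pass; the patterns below are illustrative and deliberately incomplete (real DLP rule sets are much larger), with the model catching phrasing that fixed patterns miss:

```python
import re

# Illustrative, non-exhaustive patterns for a fast first-pass filter
PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),      # SSN-style number
    re.compile(r"\b(?:\d[ -]?){13,16}\b"),     # card-number-style digit run
    re.compile(r"(?i)\bpassword\s*[:=]"),      # password assignment
]

def regex_flag(text):
    """Return True if any hard-coded sensitive pattern matches."""
    return any(p.search(text) for p in PATTERNS)

print(regex_flag("My social security number is 123-45-6789"))  # True
print(regex_flag("The weather is nice today"))                 # False
```

Text that passes the regex stage can then be handed to the classifier, keeping the expensive model call off the hot path for obviously safe input.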
## Training Data
Custom dataset combining:
- Synthetic examples of sensitive patterns (SSN, passwords, etc.)
- Safe text samples from various domains
- Balanced classes for robust classification
## Citation
```bibtex
@misc{privacy-classifier-electra,
  author    = {jonmabe},
  title     = {Privacy Classifier based on ELECTRA},
  year      = {2026},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/jonmabe/privacy-classifier-electra}
}
```