---
license: apache-2.0
language:
- en
pipeline_tag: text-classification
tags:
- privacy
- content-moderation
- classifier
- electra
datasets:
- custom
metrics:
- accuracy
model-index:
- name: privacy-classifier-electra
  results:
  - task:
      type: text-classification
      name: Privacy Classification
    metrics:
    - type: accuracy
      value: 0.9968
      name: Validation Accuracy
widget:
- text: "My social security number is 123-45-6789"
  example_title: "Sensitive (SSN)"
- text: "The weather is nice today"
  example_title: "Safe"
- text: "My password is hunter2"
  example_title: "Sensitive (Password)"
- text: "I like pizza"
  example_title: "Safe"
---

# Privacy Classifier (ELECTRA)

A fine-tuned ELECTRA model for detecting sensitive/private information in text.

## Model Description

This model classifies text as either **safe** or **sensitive**, helping identify content that may contain private information such as:

- Social security numbers
- Passwords and credentials
- Financial account numbers
- Personal health information
- Home addresses
- Phone numbers

### Base Model

- **Architecture**: [google/electra-base-discriminator](https://huggingface.co/google/electra-base-discriminator)
- **Parameters**: ~110M
- **Task**: Binary text classification

## Training Details

| Parameter | Value |
|-----------|-------|
| Epochs | 5 |
| Validation Accuracy | **99.68%** |
| Training Hardware | NVIDIA RTX 5090 (32GB) |
| Framework | PyTorch + Transformers |

### Labels

- `safe` (0): Content does not contain sensitive information
- `sensitive` (1): Content may contain private/sensitive information

## Usage

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="jonmabe/privacy-classifier-electra")

# Examples
result = classifier("My SSN is 123-45-6789")
# [{'label': 'sensitive', 'score': 0.99...}]

result = classifier("The meeting is at 3pm")
# [{'label': 'safe', 'score': 0.99...}]
```

### Direct Usage

```python
from transformers import (AutoTokenizer,
                          AutoModelForSequenceClassification)
import torch

tokenizer = AutoTokenizer.from_pretrained("jonmabe/privacy-classifier-electra")
model = AutoModelForSequenceClassification.from_pretrained("jonmabe/privacy-classifier-electra")

text = "My credit card number is 4111-1111-1111-1111"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    outputs = model(**inputs)

prediction = torch.argmax(outputs.logits, dim=-1)
label = "sensitive" if prediction.item() == 1 else "safe"
print(f"Classification: {label}")
```

## Intended Use

- **Primary Use**: Pre-screening text before logging, storage, or transmission
- **Use Cases**:
  - Filtering sensitive content from logs
  - Flagging potential PII in user-generated content
  - Privacy-aware content moderation
  - Data loss prevention (DLP) systems

## Limitations

- Trained primarily on English text
- May not catch all forms of sensitive information
- Should be used as one layer in a defense-in-depth approach
- Not a substitute for proper data handling policies

## Training Data

Custom dataset combining:

- Synthetic examples of sensitive patterns (SSN, passwords, etc.)
- Safe text samples from various domains
- Balanced classes for robust classification

## Citation

```bibtex
@misc{privacy-classifier-electra,
  author = {jonmabe},
  title = {Privacy Classifier based on ELECTRA},
  year = {2026},
  publisher = {Hugging Face},
  url = {https://huggingface.co/jonmabe/privacy-classifier-electra}
}
```
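## Example: Log Filtering

The log-filtering use case under Intended Use amounts to a score-threshold gate in front of the logger. The sketch below is a minimal illustration, not part of this model's API: `filter_sensitive`, `stub_classify`, and the 0.5 threshold are all hypothetical; in a real deployment, `classify` would wrap the `pipeline(...)` call shown in Usage, and the threshold should be tuned for the desired precision/recall trade-off.

```python
from typing import Callable, Dict, List

def filter_sensitive(lines: List[str],
                     classify: Callable[[str], Dict[str, float]],
                     threshold: float = 0.5) -> List[str]:
    """Keep only lines whose 'sensitive' score is below the threshold."""
    kept = []
    for line in lines:
        scores = classify(line)
        if scores.get("sensitive", 0.0) < threshold:
            kept.append(line)
    return kept

# Hypothetical stand-in for the HF pipeline, so the sketch runs offline.
# In practice: clf = pipeline("text-classification", model="jonmabe/privacy-classifier-electra")
def stub_classify(text: str) -> Dict[str, float]:
    return {"sensitive": 0.99} if "SSN" in text else {"sensitive": 0.01}

logs = ["User logged in at 3pm", "My SSN is 123-45-6789"]
print(filter_sensitive(logs, stub_classify))  # ['User logged in at 3pm']
```

Dropping flagged lines is the simplest policy; depending on the DLP requirements, the same gate could instead redact or quarantine them.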