jonmabe/privacy-classifier-electra

+---
+license: apache-2.0
+language:
+- en
+pipeline_tag: text-classification
+tags:
+- privacy
+- content-moderation
+- classifier
+- electra
+datasets:
+- custom
+metrics:
+- accuracy
+model-index:
+- name: privacy-classifier-electra
+  results:
+  - task:
+      type: text-classification
+      name: Privacy Classification
+    metrics:
+    - type: accuracy
+      value: 0.9968
+      name: Validation Accuracy
+widget:
+- text: "My social security number is 123-45-6789"
+  example_title: "Sensitive (SSN)"
+- text: "The weather is nice today"
+  example_title: "Safe"
+- text: "My password is hunter2"
+  example_title: "Sensitive (Password)"
+- text: "I like pizza"
+  example_title: "Safe"
+---
+# Privacy Classifier (ELECTRA)
+A fine-tuned ELECTRA model for detecting sensitive/private information in text.
+## Model Description
+This model classifies text as either **safe** or **sensitive**, flagging content that may contain private information such as:
+- Social security numbers
+- Passwords and credentials
+- Financial account numbers
+- Personal health information
+- Home addresses
+- Phone numbers
+### Base Model
+- **Architecture**: [google/electra-base-discriminator](https://huggingface.co/google/electra-base-discriminator)
+- **Parameters**: ~110M
+- **Task**: Binary text classification
+## Training Details
+| Parameter | Value |
+|-----------|-------|
+| Epochs | 5 |
+| Validation Accuracy | **99.68%** |
+| Training Hardware | NVIDIA RTX 5090 (32GB) |
+| Framework | PyTorch + Transformers |
+### Labels
+- `safe` (0): Content does not contain sensitive information
+- `sensitive` (1): Content may contain private/sensitive information
+## Usage
+```python
+from transformers import pipeline
+# Downloads the model from the Hugging Face Hub on first use
+classifier = pipeline("text-classification", model="jonmabe/privacy-classifier-electra")
+# Each call returns a list with one {'label', 'score'} dict per input
+result = classifier("My SSN is 123-45-6789")
+# [{'label': 'sensitive', 'score': 0.99...}]
+result = classifier("The meeting is at 3pm")
+# [{'label': 'safe', 'score': 0.99...}]
+```
+### Direct Usage
+```python
+import torch
+from transformers import AutoModelForSequenceClassification, AutoTokenizer
+model_id = "jonmabe/privacy-classifier-electra"
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForSequenceClassification.from_pretrained(model_id)
+text = "My credit card number is 4111-1111-1111-1111"
+# Truncate inputs longer than 512 tokens
+inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
+# Gradients are not needed for inference
+with torch.no_grad():
+    outputs = model(**inputs)
+# Index 0 is `safe`, index 1 is `sensitive` (see Labels)
+prediction = torch.argmax(outputs.logits, dim=-1)
+label = "sensitive" if prediction.item() == 1 else "safe"
+print(f"Classification: {label}")
+```
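`argmax` returns only the winning class. Where the trade-off between missed detections and false alarms needs tuning, the two logits can first be converted to a probability. The sketch below does this in plain Python so it runs without the model. The function names `sensitive_probability` and `classify_with_threshold`, the default threshold of 0.5, and the logits in the usage lines are illustrative assumptions, not part of this model card.

```python
import math

def sensitive_probability(logits):
    """Softmax over [safe_logit, sensitive_logit]; returns P(sensitive)."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    return exps[1] / sum(exps)

def classify_with_threshold(logits, threshold=0.5):
    """Label as `sensitive` when P(sensitive) meets the (assumed) threshold."""
    prob = sensitive_probability(logits)
    label = "sensitive" if prob >= threshold else "safe"
    return label, prob

# Made-up logits for illustration; real values would come from
# outputs.logits[0].tolist() in the Direct Usage snippet.
label, prob = classify_with_threshold([-2.0, 3.0], threshold=0.9)
print(label, round(prob, 3))
```

A lower threshold flags more text as `sensitive`, which may suit a pre-screening step where a missed detection costs more than a false alarm.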
+## Intended Use
+- **Primary Use**: Pre-screening text before logging, storage, or transmission
+- **Use Cases**:
+  - Filtering sensitive content from logs
+  - Flagging potential PII in user-generated content
+  - Privacy-aware content moderation
+  - Data loss prevention (DLP) systems
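As one possible shape for the log-filtering use case, the sketch below wraps a classifier in a standard `logging.Filter` that redacts flagged messages before they reach any handler. `SensitiveLogFilter`, the `[REDACTED]` placeholder, and `stub_classifier` are hypothetical names introduced for illustration; the stub only stands in for the model so the example runs offline.

```python
import logging

class SensitiveLogFilter(logging.Filter):
    """Redacts log messages that a classifier labels as `sensitive`."""

    def __init__(self, classify, placeholder="[REDACTED]"):
        super().__init__()
        # classify: callable mapping str -> [{'label': ..., 'score': ...}]
        self.classify = classify
        self.placeholder = placeholder

    def filter(self, record):
        message = record.getMessage()
        if self.classify(message)[0]["label"] == "sensitive":
            record.msg = self.placeholder
            record.args = None  # drop args so the placeholder is not re-formatted
        return True  # keep the record, possibly redacted

def stub_classifier(text):
    """Hypothetical stand-in so the sketch runs without downloading the model."""
    label = "sensitive" if "password" in text.lower() else "safe"
    return [{"label": label, "score": 1.0}]

logger = logging.getLogger("app")
logger.addFilter(SensitiveLogFilter(stub_classifier))
```

In a real setup, the `pipeline` object from the Usage section could be passed in place of `stub_classifier`, since it returns the same list-of-dicts shape.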
+## Limitations
+- Trained primarily on English text
+- May not catch all forms of sensitive information
+- Should be used as one layer in a defense-in-depth approach
+- Not a substitute for proper data handling policies
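To make the defense-in-depth point concrete, here is a minimal sketch that runs cheap pattern rules first and consults the classifier only when no rule fires. The two regular expressions, the `layered_check` name, and the 0.5 score threshold are assumptions chosen for illustration, not rules used by this model or its training data.

```python
import re

# Illustrative patterns only; real deployments need broader, validated rules.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card": re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b"),
}

def layered_check(text, classify, threshold=0.5):
    """Return (flagged, reason): pattern rules first, then the model."""
    for name, pattern in PATTERNS.items():
        if pattern.search(text):
            return True, f"rule:{name}"
    # classify: callable mapping str -> [{'label': ..., 'score': ...}]
    result = classify(text)[0]
    if result["label"] == "sensitive" and result["score"] >= threshold:
        return True, "model"
    return False, "none"

# Usage with a stand-in classifier that always answers `safe`:
always_safe = lambda text: [{"label": "safe", "score": 0.99}]
print(layered_check("My SSN is 123-45-6789", always_safe))
```

The rule layer still flags the SSN here even though the stand-in classifier says `safe`, which is the point of layering: each check covers cases the other may miss.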
+## Training Data
+Custom dataset combining:
+- Synthetic examples of sensitive patterns (SSN, passwords, etc.)
+- Safe text samples from various domains
+- Balanced classes for robust classification
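This card does not describe how the dataset was generated. Purely as an illustration of what template-based synthesis with balanced classes might look like, the sketch below pairs each synthetic sensitive sentence with a safe one. The templates, helper names, and value ranges are all hypothetical; the safe sentences are borrowed from the examples elsewhere in this card.

```python
import random

# Hypothetical templates; not the ones used to train this model.
SENSITIVE_TEMPLATES = ["My SSN is {ssn}", "My password is {password}"]
SAFE_SENTENCES = ["The weather is nice today", "I like pizza", "The meeting is at 3pm"]

def fake_ssn(rng):
    """Random digits in SSN layout (not a real or validated number)."""
    return f"{rng.randint(100, 899):03d}-{rng.randint(1, 99):02d}-{rng.randint(1, 9999):04d}"

def fake_password(rng, length=10):
    alphabet = "abcdefghijklmnopqrstuvwxyz0123456789"
    return "".join(rng.choice(alphabet) for _ in range(length))

def build_balanced_dataset(n_per_class, seed=0):
    """Return shuffled rows with equal counts of label 1 (sensitive) and 0 (safe)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_per_class):
        template = rng.choice(SENSITIVE_TEMPLATES)
        # Unused keyword arguments are ignored by str.format
        text = template.format(ssn=fake_ssn(rng), password=fake_password(rng))
        rows.append({"text": text, "label": 1})
        rows.append({"text": rng.choice(SAFE_SENTENCES), "label": 0})
    rng.shuffle(rows)
    return rows
```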
+## Citation
+```bibtex
+@misc{privacy-classifier-electra,
+  author = {jonmabe},
+  title = {Privacy Classifier based on ELECTRA},
+  year = {2026},
+  publisher = {Hugging Face},
+  url = {https://huggingface.co/jonmabe/privacy-classifier-electra}
+}
+```