---
license: apache-2.0
language:
- en
pipeline_tag: text-classification
tags:
- privacy
- content-moderation
- classifier
- electra
datasets:
- custom
metrics:
- accuracy
model-index:
- name: privacy-classifier-electra
  results:
  - task:
      type: text-classification
      name: Privacy Classification
    metrics:
    - type: accuracy
      value: 0.9968
      name: Validation Accuracy
widget:
- text: "My social security number is 123-45-6789"
  example_title: "Sensitive (SSN)"
- text: "The weather is nice today"
  example_title: "Safe"
- text: "My password is hunter2"
  example_title: "Sensitive (Password)"
- text: "I like pizza"
  example_title: "Safe"
---

# Privacy Classifier (ELECTRA)

A fine-tuned ELECTRA model for detecting sensitive/private information in text.

## Model Description

This model classifies text as either **safe** or **sensitive**, helping identify content that may contain private information such as:

- Social security numbers
- Passwords and credentials
- Financial account numbers
- Personal health information
- Home addresses
- Phone numbers

### Base Model

- **Architecture**: [google/electra-base-discriminator](https://huggingface.co/google/electra-base-discriminator)
- **Parameters**: ~110M
- **Task**: Binary text classification

## Training Details

| Parameter | Value |
|-----------|-------|
| Epochs | 5 |
| Validation Accuracy | **99.68%** |
| Training Hardware | NVIDIA RTX 5090 (32GB) |
| Framework | PyTorch + Transformers |

### Labels

- `safe` (0): Content does not contain sensitive information
- `sensitive` (1): Content may contain private/sensitive information

## Usage

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="jonmabe/privacy-classifier-electra")

# Examples
result = classifier("My SSN is 123-45-6789")
# [{'label': 'sensitive', 'score': 0.99...}]

result = classifier("The meeting is at 3pm")
# [{'label': 'safe', 'score': 0.99...}]
```

### Direct Usage

```python
from transformers import (AutoTokenizer,
                          AutoModelForSequenceClassification)
import torch

tokenizer = AutoTokenizer.from_pretrained("jonmabe/privacy-classifier-electra")
model = AutoModelForSequenceClassification.from_pretrained("jonmabe/privacy-classifier-electra")

text = "My credit card number is 4111-1111-1111-1111"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    outputs = model(**inputs)

prediction = torch.argmax(outputs.logits, dim=-1)
label = "sensitive" if prediction.item() == 1 else "safe"
print(f"Classification: {label}")
```

## Intended Use

- **Primary Use**: Pre-screening text before logging, storage, or transmission
- **Use Cases**:
  - Filtering sensitive content from logs
  - Flagging potential PII in user-generated content
  - Privacy-aware content moderation
  - Data loss prevention (DLP) systems

## Limitations

- Trained primarily on English text
- May not catch all forms of sensitive information
- Should be used as one layer in a defense-in-depth approach
- Not a substitute for proper data handling policies

## Training Data

Custom dataset combining:

- Synthetic examples of sensitive patterns (SSN, passwords, etc.)
- Safe text samples from various domains
- Balanced classes for robust classification

## Citation

```bibtex
@misc{privacy-classifier-electra,
  author = {jonmabe},
  title = {Privacy Classifier based on ELECTRA},
  year = {2026},
  publisher = {Hugging Face},
  url = {https://huggingface.co/jonmabe/privacy-classifier-electra}
}
```
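## Example: Log Filtering

The log-filtering use case under Intended Use amounts to a score-threshold gate in front of the logger. The sketch below is a minimal illustration, not part of this model's API: `filter_sensitive`, `stub_classify`, and the 0.5 threshold are all hypothetical; in a real deployment, `classify` would wrap the `pipeline(...)` call shown in Usage, and the threshold should be tuned for the desired precision/recall trade-off.

```python
from typing import Callable, Dict, List

def filter_sensitive(lines: List[str],
                     classify: Callable[[str], Dict[str, float]],
                     threshold: float = 0.5) -> List[str]:
    """Keep only lines whose 'sensitive' score is below the threshold."""
    kept = []
    for line in lines:
        scores = classify(line)
        if scores.get("sensitive", 0.0) < threshold:
            kept.append(line)
    return kept

# Hypothetical stand-in for the HF pipeline, so the sketch runs offline.
# In practice: clf = pipeline("text-classification", model="jonmabe/privacy-classifier-electra")
def stub_classify(text: str) -> Dict[str, float]:
    return {"sensitive": 0.99} if "SSN" in text else {"sensitive": 0.01}

logs = ["User logged in at 3pm", "My SSN is 123-45-6789"]
print(filter_sensitive(logs, stub_classify))  # ['User logged in at 3pm']
```

Dropping flagged lines is the simplest policy; depending on the DLP requirements, the same gate could instead redact or quarantine them.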