Commit 5d5556e by jonmabe (verified) · 1 parent: 86015af

Update model card with documentation and examples

  1. README.md +135 -0
README.md ADDED
@@ -0,0 +1,135 @@
+ ---
+ license: apache-2.0
+ language:
+   - en
+ pipeline_tag: text-classification
+ tags:
+   - privacy
+   - content-moderation
+   - classifier
+   - electra
+ datasets:
+   - custom
+ metrics:
+   - accuracy
+ model-index:
+   - name: privacy-classifier-electra
+     results:
+       - task:
+           type: text-classification
+           name: Privacy Classification
+         metrics:
+           - type: accuracy
+             value: 0.9968
+             name: Validation Accuracy
+ widget:
+   - text: "My social security number is 123-45-6789"
+     example_title: "Sensitive (SSN)"
+   - text: "The weather is nice today"
+     example_title: "Safe"
+   - text: "My password is hunter2"
+     example_title: "Sensitive (Password)"
+   - text: "I like pizza"
+     example_title: "Safe"
+ ---
+
+ # Privacy Classifier (ELECTRA)
+
+ A fine-tuned ELECTRA model for detecting sensitive or private information in text.
+
+ ## Model Description
+
+ This model classifies text as either **safe** or **sensitive**, helping identify content that may contain private information such as:
+ - Social security numbers
+ - Passwords and credentials
+ - Financial account numbers
+ - Personal health information
+ - Home addresses
+ - Phone numbers
+
+ ### Base Model
+ - **Architecture**: [google/electra-base-discriminator](https://huggingface.co/google/electra-base-discriminator)
+ - **Parameters**: ~110M
+ - **Task**: Binary text classification
+
+ ## Training Details
+
+ | Parameter | Value |
+ |-----------|-------|
+ | Epochs | 5 |
+ | Validation Accuracy | **99.68%** |
+ | Training Hardware | NVIDIA RTX 5090 (32GB) |
+ | Framework | PyTorch + Transformers |
+
+ ### Labels
+ - `safe` (0): Content does not contain sensitive information
+ - `sensitive` (1): Content may contain private/sensitive information
+
+ ## Usage
+
+ ```python
+ from transformers import pipeline
+
+ classifier = pipeline("text-classification", model="jonmabe/privacy-classifier-electra")
+
+ # Examples
+ result = classifier("My SSN is 123-45-6789")
+ # [{'label': 'sensitive', 'score': 0.99...}]
+
+ result = classifier("The meeting is at 3pm")
+ # [{'label': 'safe', 'score': 0.99...}]
+ ```
+
+ ### Direct Usage
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+ import torch
+
+ tokenizer = AutoTokenizer.from_pretrained("jonmabe/privacy-classifier-electra")
+ model = AutoModelForSequenceClassification.from_pretrained("jonmabe/privacy-classifier-electra")
+
+ text = "My credit card number is 4111-1111-1111-1111"
+ inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
+
+ with torch.no_grad():  # inference only, no gradient tracking
+     outputs = model(**inputs)
+     prediction = torch.argmax(outputs.logits, dim=-1)
+ label = "sensitive" if prediction.item() == 1 else "safe"  # 1 = sensitive, 0 = safe
+ print(f"Classification: {label}")
+ ```
+
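The argmax in the snippet above yields only a hard label. When a tunable cutoff is useful (for example, to favour recall on sensitive text), the two logits can be turned into a probability first. The following is a minimal sketch, not part of the model card: it assumes the label order listed under Labels (index 0 = `safe`, index 1 = `sensitive`), and the helper name `classify_with_threshold`, the example logits, and the threshold values are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_with_threshold(logits, threshold=0.5):
    """Return (label, p_sensitive).

    Assumes index 1 of `logits` corresponds to the `sensitive` class.
    """
    p_sensitive = softmax(logits)[1]
    label = "sensitive" if p_sensitive >= threshold else "safe"
    return label, p_sensitive


# Hypothetical logits; with the real model they could come from
# outputs.logits[0].tolist() in the snippet above.
label, p_sensitive = classify_with_threshold([1.2, 0.4], threshold=0.3)
print(label, round(p_sensitive, 3))
```

Lowering the threshold flags more text as sensitive at the cost of more false positives; a suitable value would need to be chosen on held-out data.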
+ ## Intended Use
+
+ - **Primary Use**: Pre-screening text before logging, storage, or transmission
+ - **Use Cases**:
+   - Filtering sensitive content from logs
+   - Flagging potential PII in user-generated content
+   - Privacy-aware content moderation
+   - Data loss prevention (DLP) systems
+
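As one illustration of the log-filtering use case, the sketch below wraps a classifier in a standard-library `logging.Filter` that redacts flagged messages before they reach any handler. It is an assumption-laden example rather than something shipped with the model: `SensitiveLogFilter`, the placeholder string, and `stub_classifier` are hypothetical names, and the stub only stands in for the real pipeline so the code runs offline. Any callable that returns pipeline-style output (`[{'label': ..., 'score': ...}]`) should fit.

```python
import logging


class SensitiveLogFilter(logging.Filter):
    """Redact log records that the classifier labels `sensitive`."""

    def __init__(self, classifier, placeholder="[REDACTED: possible sensitive content]"):
        super().__init__()
        self.classifier = classifier
        self.placeholder = placeholder

    def filter(self, record):
        # Classify the fully formatted message, not just the template.
        result = self.classifier(record.getMessage())[0]
        if result["label"] == "sensitive":
            record.msg = self.placeholder
            record.args = ()
        return True  # keep the record, redacted or not


def stub_classifier(text):
    """Offline stand-in (hypothetical); replace with the real pipeline."""
    flagged = "password" in text.lower()
    return [{"label": "sensitive" if flagged else "safe", "score": 1.0}]


logger = logging.getLogger("app")
logger.addFilter(SensitiveLogFilter(stub_classifier))
```

Redacting instead of dropping the record keeps the log timeline intact while withholding the flagged text. Running a transformer on every log call adds latency, so a real deployment might batch or sample.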
+ ## Limitations
+
+ - Trained primarily on English text
+ - May not catch all forms of sensitive information
+ - Should be used as one layer in a defense-in-depth approach
+ - Not a substitute for proper data handling policies
+
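One way to read the defense-in-depth point is to pair the classifier with cheap deterministic checks, so that well-structured identifiers are caught even if the model misses them. The sketch below is illustrative only: the regular expressions, the Luhn checksum step, and the names `rule_hits` and `is_sensitive` are assumptions, not part of this model or its training pipeline, and the patterns cover only US-style SSNs and card-like digit runs.

```python
import re

# Hypothetical rule layer: US-style SSNs and 13-19 digit card-like numbers.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(number):
    """Check the Luhn checksum, ignoring spaces and dashes."""
    digits = [int(c) for c in number if c.isdigit()]
    checksum = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return checksum % 10 == 0


def rule_hits(text):
    """Return the names of the rules that matched."""
    hits = []
    if SSN_PATTERN.search(text):
        hits.append("ssn")
    if any(luhn_valid(m.group()) for m in CARD_PATTERN.finditer(text)):
        hits.append("card")
    return hits


def is_sensitive(text, classifier=None):
    """Flag text if any rule matches or the classifier says `sensitive`."""
    if rule_hits(text):
        return True
    if classifier is not None:
        return classifier(text)[0]["label"] == "sensitive"
    return False
```

Here `classifier` is expected to behave like the pipeline from the Usage section. The rules run first because they are cheap, and the two layers are combined with a logical OR so that either one alone can flag the text.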
+ ## Training Data
+
+ Custom dataset combining:
+ - Synthetic examples of sensitive patterns (SSN, passwords, etc.)
+ - Safe text samples from various domains
+ - Balanced classes for robust classification
+
+ ## Citation
+
+ ```bibtex
+ @misc{privacy-classifier-electra,
+   author = {jonmabe},
+   title = {Privacy Classifier based on ELECTRA},
+   year = {2026},
+   publisher = {Hugging Face},
+   url = {https://huggingface.co/jonmabe/privacy-classifier-electra}
+ }
+ ```