PL-RnD committed on
Commit dc62406 · verified · 1 Parent(s): d3513bd

feat: Added buymeacoffeelink

Files changed (1): README.md +99 -97
README.md CHANGED
@@ -1,97 +1,99 @@
---
license: apache-2.0
language:
- en
base_model:
- google-bert/bert-large-uncased
metrics:
- accuracy
- f1
- precision
pipeline_tag: text-classification
tags:
- privacy
---

<a href="https://www.buymeacoffee.com/privacymoderation"><img src="https://img.buymeacoffee.com/button-api/?text=Support / Buy me a coffee&emoji=☕&slug=privacymoderation&button_colour=FFDD00&font_colour=000000&font_family=Cookie&outline_colour=000000&coffee_colour=ffffff" /></a>
# Privacy Moderation Large

This is a BERT Large model fine-tuned to detect privacy violations in text, such as sharing of personally identifiable information (PII) or sensitive data. It is trained on a dataset of labeled examples of privacy violations and non-violations.
## Performance

This large model achieves the following performance metrics on a held-out test set:

| Metric    | Value    |
|-----------|----------|
| Accuracy  | `0.9584` |
| F1 Score  | `0.9569` |
| Precision | `0.9621` |
| Recall    | `0.9517` |

These metrics indicate that the model is effective at identifying privacy violations while minimizing false positives.
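As a quick sanity check, the reported F1 Score is consistent (up to rounding) with the harmonic mean of the reported precision and recall:

```python
# F1 is the harmonic mean of precision and recall.
precision = 0.9621
recall = 0.9517

f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # → 0.9569, matching the reported F1 Score
```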
## Limitations

- The model was trained on a dataset of nearly 1 million examples in varying topics and styles, but may not generalize to all contexts
- It is limited to English text
- This current iteration used a dataset where each example is between 20 and 120 words in length, so performance on much longer texts is untested (e.g. full documents may require chunking)
- The model may not detect all types of privacy violations, especially if they are subtle or context-dependent
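Since texts much longer than ~120 words fall outside the training distribution, one possible approach (a sketch, not something this repo ships) is to split a long document into overlapping word windows, classify each window, and flag the document if any window is flagged. The `classify` callable below is a placeholder for this model's predicted label:

```python
def chunk_words(text, size=100, overlap=20):
    """Split text into overlapping word windows of at most `size` words."""
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, max(len(words) - overlap, 1), step):
        chunks.append(" ".join(words[start:start + size]))
    return chunks

def any_violation(chunks, classify):
    """Flag the whole document if any chunk is classified as a violation."""
    return any(classify(chunk) == "violation" for chunk in chunks)
```

Flagging on *any* positive chunk trades precision for recall; a stricter aggregation (e.g. a minimum fraction of flagged chunks) is equally easy to swap in.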
## How to Use

You can use this model for text classification tasks related to privacy moderation. Here's an example of how to use it with the Hugging Face Transformers library:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch
import pandas as pd

# Load the model and tokenizer
model_name = "PL-RnD/privacy-moderation-large"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example texts
texts = [
    "Here is my credit card number: 1234-5678-9012-3456",
    "This is a regular message without sensitive information.",
    "For homeowners insurance, select deductibles from $500 to $2,500. Higher deductibles lower premiums.",
    "Solidarity: My enrollment includes my kid's braces at $4,000 total—family strained. Push for orthodontic expansions. Email blast to reps starting now.",
]

# Tokenize the input
inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)

# Get model predictions
with torch.no_grad():
    outputs = model(**inputs)

logits = outputs.logits
predictions = torch.argmax(logits, dim=-1)

# Convert predicted class indices to labels
labels = ["non-violation", "violation"]
predicted_labels = [labels[pred] for pred in predictions.numpy()]

# Display results
df = pd.DataFrame({"text": texts, "label": predicted_labels})
print(df)
```
This will output a DataFrame with the original texts and their predicted labels (either "violation" or "non-violation"). Example output:

```
                                                text          label
0  Here is my credit card number: 1234-5678-9012-...      violation
1  This is a regular message without sensitive in...  non-violation
2  For homeowners insurance, select deductibles f...  non-violation
3  Solidarity: My enrollment includes my kid's br...      violation
```
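`argmax` discards the model's confidence. If you want to act only on high-confidence flags, you can soften the decision with a softmax over the two logits and a threshold. A minimal sketch with hypothetical logit values (the 0.9 threshold is an assumption, not something this model card prescribes; real logits come from `model(**inputs).logits`):

```python
import math

def violation_probability(logits):
    """Softmax over [non-violation, violation] logits; returns P(violation)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    return exps[1] / sum(exps)

# Hypothetical logits for one text
logits = [-1.2, 3.4]
p = violation_probability(logits)
flag = p > 0.9  # only flag when the model is confident
```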
## Intended Use

This model is intended to flag privacy concerns that a privacy-conscious person would expect to keep private, such as: addresses, phone numbers, e-mails, passwords, health details, relationship drama, financial numbers, political opinions, or sexual preferences.

The motivating use case for this model is to reside client-side (or in a trusted/internal environment) and review user-generated text before it is sent to a server or third-party service, in order to prevent accidental sharing of sensitive information. For example:

- Act as an A/B router between public and private LLMs (e.g. with Pipelines in Open-WebUI): if the text is flagged as a privacy violation, it can be routed to a local/private LLM instance instead of a public one
- Block or warn users when they attempt to share sensitive information in chat applications
- Load the model in a browser using libraries like ONNX.js or TensorFlow.js to perform client-side moderation
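The router idea above reduces to a small dispatch function. In this sketch, `classify` stands in for this model's predicted label, and both endpoint URLs are hypothetical placeholders:

```python
LOCAL_LLM = "http://localhost:11434"    # hypothetical private/local endpoint
PUBLIC_LLM = "https://api.example.com"  # hypothetical public endpoint

def route(text, classify):
    """Send flagged text to the private LLM, everything else to the public one."""
    if classify(text) == "violation":
        return LOCAL_LLM
    return PUBLIC_LLM
```

Because the classifier runs before anything leaves the client, a false negative is the costly failure mode here, which is one reason to pair this with the confidence thresholding shown earlier.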
---

> Ultimately, arguing that you don't care about the right to privacy because you have nothing to hide is no different than saying you don't care about free speech because you have nothing to say. - Edward Snowden