jainsatyam26 committed
Commit dfb99c0 · verified · 1 Parent(s): a2190be

Auto-deploy: Step 400 | F1: N/A

README.md ADDED
@@ -0,0 +1,145 @@
+ ---
+ language: en
+ license: mit
+ tags:
+ - content-safety
+ - text-classification
+ - safety
+ - moderation
+ - deberta
+ datasets:
+ - jainsatyam26/guardrail-215k-splits
+ metrics:
+ - f1
+ - accuracy
+ widget:
+ - text: "How to make a bomb?"
+   example_title: "Violent Content"
+ - text: "Hello, how are you?"
+   example_title: "Safe Content"
+ ---
+
+ # High-Accuracy Content Safety Classifier
+
+ This model is a fine-tuned DeBERTa classifier that assigns text to one of 10 content-safety categories. Evaluation metrics were not yet available when this checkpoint was deployed (see Performance).
+
+ ## Model Details
+
+ - **Base Model**: microsoft/deberta-v3-large
+ - **Training Dataset**: jainsatyam26/guardrail-215k-splits
+ - **Categories**: 10 safety categories
+ - **Training Time**: Auto-deployed during training
+ - **Last Updated**: 2026-04-28 12:47:44 UTC
+
+
+ ## Performance
+
+ | Metric | Value |
+ |--------|-------|
+ | F1 Score | N/A |
+ | Accuracy | N/A |
+ | Unsafe F1 | N/A |
+
+
+ ## Categories
+
+ - `benign`
+ - `jailbreak`
+ - `S1 Violent Crimes`
+ - `S2 Non-Violent Crimes`
+ - `S4 Child Sexual Exploitation`
+ - `S7 Privacy`
+ - `S10 Hate`
+ - `S11 Self-Harm`
+ - `S12 Sexual Content`
+ - `S14 Code Abuse`
+
+ ## Usage
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+ import torch
+
+ tokenizer = AutoTokenizer.from_pretrained("jainsatyam26/bertclassfier")
+ model = AutoModelForSequenceClassification.from_pretrained("jainsatyam26/bertclassfier")
+
+ def predict(text):
+     # Tokenize a single string, truncating to the training max_length of 512.
+     inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
+     with torch.no_grad():  # inference only, no gradients needed
+         outputs = model(**inputs)
+     probs = torch.softmax(outputs.logits, dim=-1)
+     predicted_id = torch.argmax(probs, dim=-1).item()
+
+     # Label order must match the class indices used during training.
+     labels = ['benign', 'jailbreak', 'S1 Violent Crimes', 'S2 Non-Violent Crimes', 'S4 Child Sexual Exploitation', 'S7 Privacy', 'S10 Hate', 'S11 Self-Harm', 'S12 Sexual Content', 'S14 Code Abuse']
+     return {
+         "prediction": labels[predicted_id],
+         "confidence": probs[0][predicted_id].item(),
+         "all_scores": {labels[i]: probs[0][i].item() for i in range(len(labels))}
+     }
+
+ # Example
+ result = predict("How to make a bomb?")
+ print(result)
+ ```
+
+ ## Training Configuration
+
+ This model was trained with the following configuration:
+
+ ```json
+ {
+   "model_name": "microsoft/deberta-v3-large",
+   "dataset_name": "jainsatyam26/guardrail-215k-splits",
+   "max_length": 512,
+   "epochs": 4,
+   "batch_size": 8,
+   "grad_accum": 4,
+   "learning_rate": 1e-05,
+   "weight_decay": 0.01,
+   "warmup_ratio": 0.1,
+   "use_llrd": true,
+   "llrd_alpha": 0.9,
+   "use_multisample_dropout": true,
+   "num_dropout_samples": 5,
+   "dropout_rate": 0.3,
+   "use_label_smoothing": true,
+   "label_smoothing": 0.1,
+   "use_focal_loss": true,
+   "focal_alpha": 0.7,
+   "focal_gamma": 2.0,
+   "use_hard_negative": true,
+   "hard_negative_ratio": 0.3,
+   "num_folds": 3,
+   "optimize_thresholds": true,
+   "output_dir": "./guardrail_model",
+   "checkpoint_steps": 500,
+   "logging_steps": 50,
+   "eval_steps": 500,
+   "hf_repo_id": "jainsatyam26/bertclassfier",
+   "hf_token": "***REDACTED***",
+   "deploy_every_minutes": 30,
+   "deploy_every_steps": 400,
+   "auto_deploy": true,
+   "private_repo": false,
+   "auto_resume": true,
+   "resume_from_hf": true,
+   "use_wandb": true,
+   "wandb_project": "safety-classifier",
+   "fp16": false,
+   "bf16": true,
+   "dataloader_num_workers": 4,
+   "seed": 42
+ }
+ ```
+
+ ## Automatic Deployment
+
+ This model is automatically deployed every 30 minutes during training with:
+ - ✅ Automatic checkpoint recovery
+ - ✅ Real-time performance monitoring
+ - ✅ Progressive model updates
+ - ✅ Training state persistence
+
+ ---
+
+ *Generated automatically during training - 2026-04-28 12:47:44*
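The README's configuration enables focal loss with `focal_alpha` 0.7 and `focal_gamma` 2.0. The training code is not part of this commit, so the following is only a minimal sketch of the standard focal loss for a single example. It assumes a scalar alpha applied to every class and ignores label smoothing; the function names are hypothetical.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def focal_loss(logits, target, alpha=0.7, gamma=2.0):
    # p_t is the probability the model assigns to the true class.
    p_t = softmax(logits)[target]
    # The (1 - p_t) ** gamma factor shrinks the loss of well-classified examples.
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)

# An easy example (confident and correct) versus a hard one (confident and wrong).
easy = focal_loss([6.0, 0.0, 0.0], target=0)
hard = focal_loss([0.0, 6.0, 0.0], target=0)
print(f"easy={easy:.6f} hard={hard:.6f}")
```

With `gamma=0.0` and `alpha=1.0` the function reduces to plain cross-entropy, which is a quick sanity check on any real implementation.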
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:facfd34b08de2425f8194dc6057f8cf9d11294a2b74802a97ae5485e46387534
+ size 870381672
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "add_prefix_space": true,
+   "backend": "tokenizers",
+   "bos_token": "[CLS]",
+   "cls_token": "[CLS]",
+   "do_lower_case": false,
+   "eos_token": "[SEP]",
+   "extra_special_tokens": [
+     "[PAD]",
+     "[CLS]",
+     "[SEP]"
+   ],
+   "is_local": false,
+   "local_files_only": false,
+   "mask_token": "[MASK]",
+   "model_max_length": 1000000000000000019884624838656,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "split_by_punct": false,
+   "tokenizer_class": "DebertaV2Tokenizer",
+   "unk_id": 3,
+   "unk_token": "[UNK]",
+   "vocab_type": "spm"
+ }
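The `model_max_length` value in this file equals `int(1e30)`, the placeholder that transformers writes when no maximum length is configured. In practice this means callers should pass `max_length` explicitly, as the README's usage snippet does. A small sketch of that check follows; the helper name and the 512 fallback (taken from `max_length` in the training configuration) are assumptions for illustration.

```python
import json

# Placeholder value stored when no maximum length was configured.
UNSET_SENTINEL = int(1e30)

def effective_max_length(tokenizer_config, fallback=512):
    """Return a usable truncation length for a parsed tokenizer_config.json."""
    configured = tokenizer_config.get("model_max_length")
    # Treat a missing value or the sentinel as "not set" and use the fallback.
    if configured is None or configured >= UNSET_SENTINEL:
        return fallback
    return configured

config = json.loads('{"model_max_length": 1000000000000000019884624838656}')
print(effective_max_length(config))
```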
training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6cc005448bf0c866f37e672ed9ad1c82fc815ae0f114088ffaa1fe7114813fd6
+ size 5265
training_config.json ADDED
@@ -0,0 +1,43 @@
+ {
+   "model_name": "microsoft/deberta-v3-large",
+   "dataset_name": "jainsatyam26/guardrail-215k-splits",
+   "max_length": 512,
+   "epochs": 4,
+   "batch_size": 8,
+   "grad_accum": 4,
+   "learning_rate": 1e-05,
+   "weight_decay": 0.01,
+   "warmup_ratio": 0.1,
+   "use_llrd": true,
+   "llrd_alpha": 0.9,
+   "use_multisample_dropout": true,
+   "num_dropout_samples": 5,
+   "dropout_rate": 0.3,
+   "use_label_smoothing": true,
+   "label_smoothing": 0.1,
+   "use_focal_loss": true,
+   "focal_alpha": 0.7,
+   "focal_gamma": 2.0,
+   "use_hard_negative": true,
+   "hard_negative_ratio": 0.3,
+   "num_folds": 3,
+   "optimize_thresholds": true,
+   "output_dir": "./guardrail_model",
+   "checkpoint_steps": 500,
+   "logging_steps": 50,
+   "eval_steps": 500,
+   "hf_repo_id": "jainsatyam26/bertclassfier",
+   "hf_token": "***REDACTED***",
+   "deploy_every_minutes": 30,
+   "deploy_every_steps": 400,
+   "auto_deploy": true,
+   "private_repo": false,
+   "auto_resume": true,
+   "resume_from_hf": true,
+   "use_wandb": true,
+   "wandb_project": "safety-classifier",
+   "fp16": false,
+   "bf16": true,
+   "dataloader_num_workers": 4,
+   "seed": 42
+ }
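The `use_llrd` flag with `llrd_alpha` 0.9 suggests layer-wise learning-rate decay. A common form, assumed here rather than taken from the training code, keeps the base rate for the top encoder layer and multiplies it by alpha for each layer below. The sketch also derives the per-device effective batch size from `batch_size` and `grad_accum`. The 24-layer count is that of deberta-v3-large, and the helper name is hypothetical.

```python
def llrd_learning_rates(base_lr=1e-5, alpha=0.9, num_layers=24):
    """Per-layer learning rates, index 0 = bottom layer, last index = top layer."""
    # The top layer keeps base_lr; each layer below is scaled by one more factor of alpha.
    return [base_lr * alpha ** (num_layers - 1 - i) for i in range(num_layers)]

rates = llrd_learning_rates()

# Gradient accumulation multiplies the number of examples per optimizer step.
batch_size, grad_accum = 8, 4
effective_batch = batch_size * grad_accum  # per device; device count is unknown

print(f"bottom layer lr: {rates[0]:.3e}")
print(f"top layer lr:    {rates[-1]:.3e}")
print(f"effective batch size per device: {effective_batch}")
```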
training_state.json ADDED
@@ -0,0 +1,6 @@
+ {
+   "step": 400,
+   "epoch": 0.0844550013196094,
+   "best_metric": null,
+   "total_flos": 0.0
+ }
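The `step` and `epoch` values above, combined with `epochs`, `warmup_ratio` and `learning_rate` from training_config.json, give a rough picture of where this checkpoint sits in the schedule. The arithmetic below assumes that the total step count is simply steps per epoch times the number of epochs and that warmup is linear from zero; neither is confirmed by the commit.

```python
# Values copied from training_state.json and training_config.json.
step = 400
epoch = 0.0844550013196094
epochs = 4
warmup_ratio = 0.1
peak_lr = 1e-5

# Optimizer steps per epoch implied by the recorded step/epoch pair.
steps_per_epoch = step / epoch
total_steps = steps_per_epoch * epochs
warmup_steps = warmup_ratio * total_steps

# Under linear warmup the rate grows proportionally to the step count.
in_warmup = step < warmup_steps
lr_now = peak_lr * step / warmup_steps if in_warmup else None

print(f"steps per epoch: {steps_per_epoch:.2f}")
print(f"total steps:     {total_steps:.0f}")
print(f"warmup steps:    {warmup_steps:.1f}")
print(f"still in warmup: {in_warmup}")
```

Under these assumptions the checkpoint was saved well inside the warmup phase and before the first evaluation at `eval_steps` 500, which is consistent with `best_metric` being null and the F1 fields reading N/A.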