HagalazAI/RedSecureBERT


HagalazAI committed on Apr 29, 2025

Commit f261973 · verified · 1 Parent(s): 2e30d56

Update README.md

Files changed (1)

README.md +79 -3


@@ -1,3 +1,79 @@
----
-license: apache-2.0
----

+---
+tags:
+- text-classification
+- security
+- red-team
+- roberta
+license: apache-2.0
+datasets:
+- trendmicro-ailab/Primus-FineWeb
+metrics:
+- precision
+- recall
+- f1
+pipeline_tag: text-classification
+library_name: transformers
+---
+# RedSecureBERT 🔴🛡️
+Detects **red-team / offensive security** text (English).
+| Split | Precision | Recall | F1 | Threshold |
+|-------|-----------|--------|----|-----------|
+| Validation | **0.963** | **0.991** | **0.977** | **0.515** |
+> **Recommended cut-off:** `prob >= 0.515` (chosen via F₂ on the validation split).
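The F₂ criterion named in the cut-off note weights recall more heavily than precision. Below is a minimal sketch of how such a threshold could be picked on a labelled validation set. The helper names `fbeta` and `best_threshold` and the toy scores are hypothetical, and this is not necessarily the procedure used to obtain 0.515. The last line only checks that the reported F1 is consistent with the reported precision and recall.

```python
def fbeta(precision, recall, beta=1.0):
    """F-beta from precision and recall; beta > 1 favours recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def best_threshold(probs, labels, beta=2.0):
    """Try every observed score as a cut-off and keep the one maximizing F-beta."""
    best_t, best_score = 0.5, -1.0
    for t in sorted(set(probs)):
        tp = sum(1 for p, y in zip(probs, labels) if p >= t and y == 1)
        fp = sum(1 for p, y in zip(probs, labels) if p >= t and y == 0)
        fn = sum(1 for p, y in zip(probs, labels) if p < t and y == 1)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        score = fbeta(precision, recall, beta)
        if score > best_score:
            best_t, best_score = t, score
    return best_t, best_score


# Toy, made-up validation scores (1 = offensive, 0 = not offensive).
probs = [0.10, 0.40, 0.45, 0.60, 0.80, 0.95]
labels = [0, 0, 1, 1, 0, 1]
threshold, score = best_threshold(probs, labels, beta=2.0)
print(f"best threshold = {threshold}, F2 = {score:.4f}")

# Consistency check on the table: F1 from the reported precision and recall.
reported_f1 = round(fbeta(0.963, 0.991, beta=1.0), 3)
print("F1 from reported P/R:", reported_f1)
```

On real data the candidate thresholds would come from the model's validation probabilities rather than a hand-written list.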
+---
+## Intended uses & limits
+* **Triaging** large corpora, chat logs, or bug-bounty reports.
+* **Input language:** English.
+* **No external test set** yet → treat scores as optimistic.
+---
+## Training data (quick view)
+| Label | Rows |
+|-------|------|
+| Offensive | 30 746 |
+| Defensive | 19 550 |
+| Other | 130 000 |
+| **Total** | **180 296** |
+Source: *Primus-FineWeb* (filtered & hand-labelled).
+---
+## Model details
+| Field | Value |
+|-------|-------|
+| Base encoder | `ehsanaghaei/SecureBERT` (RoBERTa-base, 125 M) |
+| Objective | One-vs-rest, focal-loss (γ = 2) |
+| Epochs | 3 &nbsp;·&nbsp; micro-batch 16 &nbsp;·&nbsp; LR 2e-5 |
+| Hardware | 1× RTX 4090 (≈ 41 min) |
+| Inference dtype | FP16-safe |
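For readers unfamiliar with the objective in the table, here is a minimal sketch of binary focal loss with γ = 2 for a single example. It assumes the standard form FL(p_t) = -(1 - p_t)^γ · log(p_t) with no class-weighting term; the function name `binary_focal_loss` is hypothetical and the actual training code may differ.

```python
import math


def binary_focal_loss(prob, target, gamma=2.0):
    """Focal loss for one example.

    prob   -- predicted probability of the positive class (after sigmoid)
    target -- 1 for the positive class, 0 for the rest
    gamma  -- focusing parameter; gamma = 0 reduces to plain cross-entropy
    """
    p_t = prob if target == 1 else 1.0 - prob
    p_t = min(max(p_t, 1e-12), 1.0)  # clamp to avoid log(0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# An easy, correctly classified example is strongly down-weighted ...
easy_ce = binary_focal_loss(0.9, 1, gamma=0.0)
easy_fl = binary_focal_loss(0.9, 1, gamma=2.0)
# ... while a hard, misclassified example keeps most of its loss.
hard_ce = binary_focal_loss(0.1, 1, gamma=0.0)
hard_fl = binary_focal_loss(0.1, 1, gamma=2.0)
print(f"easy: CE={easy_ce:.4f}  FL={easy_fl:.6f}")
print(f"hard: CE={hard_ce:.4f}  FL={hard_fl:.4f}")
```

With γ = 2 the easy example's loss is scaled by (1 - 0.9)² = 0.01 and the hard example's by (1 - 0.1)² = 0.81, which is why the loss concentrates on hard cases in an imbalanced one-vs-rest setup.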
+---
+## Quick start
+```python
+from transformers import pipeline, AutoModelForSequenceClassification, AutoTokenizer
+model_id = "HagalazAI/RedSecureBERT"
+tok   = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForSequenceClassification.from_pretrained(model_id)
+clf = pipeline("text-classification", model=model, tokenizer=tok, top_k=None)
+text = "Generate a ROP chain to bypass DEP on Windows 10."
+prob = clf(text)[0]["score"]      # sigmoid prob for class 0 (Offensive)
+print(f"P(offensive) = {prob:.3f}")
+is_red = prob >= 0.515            # ← recommended threshold
+print("is_red:", is_red)