HagalazAI/BlueSecureBERT

@@ -25,7 +25,14 @@ Detects **blue-team / defensive security** text (English), with a focus on **tec
 > **Recommended cut-off:** `prob >= 0.579` (arg-max on the validation split)
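
A minimal sketch of how the cut-off can be applied, assuming `prob` is the model's probability for the defensive (blue-team) class. The helper name `is_blue_team` and the short phrase keys are hypothetical; the scores are the Blue Score values listed in the Demo table of this card.

```python
RECOMMENDED_CUTOFF = 0.579  # cut-off recommended in this card


def is_blue_team(prob: float, cutoff: float = RECOMMENDED_CUTOFF) -> bool:
    """Return True when the defensive-class probability meets the cut-off."""
    return prob >= cutoff


# Blue Score values from the Demo table; the keys are shortened phrase names.
demo_scores = {
    "phishing campaign": 0.066,
    "EDR deployment": 0.557,
    "marketing launch": 0.256,
    "company picnic": 0.272,
}

for name, score in demo_scores.items():
    print(f"{name}: {score:.3f} -> is_blue={is_blue_team(score)}")
```

With the recommended cut-off, the EDR phrase (0.557) falls just below the line, whereas the 0.515 threshold used in the Quick start script would flag it. The choice of cut-off therefore matters for borderline defensive text.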
----
 ## Intended uses & limits
@@ -33,8 +40,6 @@ Detects **blue-team / defensive security** text (English), with a focus on **tec
 * **Input language:** English
 * **No external test set** yet → treat numbers as optimistic
----
 ## Training data
 | Label     | Rows    |
@@ -44,8 +49,6 @@ Detects **blue-team / defensive security** text (English), with a focus on **tec
 | Other     | 130 000 |
 | **Total** | **180 296** |
----
 ## Model details
 | Field          | Value                                                |
@@ -56,8 +59,6 @@ Detects **blue-team / defensive security** text (English), with a focus on **tec
 | Hardware       | 1× RTX 4090 (≈ 41 min)                               |
 | Inference dtype| FP16-safe                                            |
----
 ## Training Data License
 - **Source**: [trendmicro-ailab/Primus-FineWeb](https://huggingface.co/datasets/trendmicro-ailab/Primus-FineWeb)
@@ -66,21 +67,59 @@ Detects **blue-team / defensive security** text (English), with a focus on **tec
   - Preserve all original copyright/license notices
   - Honor [Common Crawl ToU](https://commoncrawl.org/terms-of-use/)
----
 ## Quick start
 ```python
-from transformers import pipeline, AutoModelForSequenceClassification, AutoTokenizer
-model_id = "HagalazAI/BlueSecureBERT"
-tok      = AutoTokenizer.from_pretrained(model_id)
-model    = AutoModelForSequenceClassification.from_pretrained(model_id)
-clf      = pipeline("text-classification", model=model, tokenizer=tok, top_k=None)
-text = "Investigate potential SQL injection vulnerabilities."
-prob = clf(text)[0]["score"]  # sigmoid prob for class 0 (Defensive)
-print(f"P(defensive) = {prob:.3f}")
-is_blue = prob >=  0.579        # ← recommended threshold
-print("is_blue:", is_blue)

 > **Recommended cut-off:** `prob >= 0.579` (arg-max on the validation split)
+## Demo
+| Phrase | Blue Score |
+|--------|------------|
+| To exfiltrate sensitive data, launch a phishing campaign that tricks employees into revealing their VPN credentials. | 0.066 |
+| We should deploy an EDR solution, monitor all endpoints for intrusion attempts, and enforce strict password policies. | 0.557 |
+| Our marketing team will unveil the new cybersecurity branding materials at next Tuesday’s antivirus product launch | 0.256 |
+| I'm excited about the company picnic. There's no cybersecurity topic—just burgers and games. | 0.272 |
 ## Intended uses & limits
 * **Input language:** English
 * **No external test set** yet → treat numbers as optimistic
 ## Training data
 | Label     | Rows    |
 | Other     | 130 000 |
 | **Total** | **180 296** |
 ## Model details
 | Field          | Value                                                |
 | Hardware       | 1× RTX 4090 (≈ 41 min)                               |
 | Inference dtype| FP16-safe                                            |
 ## Training Data License
 - **Source**: [trendmicro-ailab/Primus-FineWeb](https://huggingface.co/datasets/trendmicro-ailab/Primus-FineWeb)
   - Preserve all original copyright/license notices
   - Honor [Common Crawl ToU](https://commoncrawl.org/terms-of-use/)
 ## Quick start
 ```python
+import torch
+from transformers import AutoTokenizer, AutoModelForSequenceClassification
+def classify_texts(model_name, phrases, threshold=0.515):
+    """
+    Returns a list of (probability, label) tuples, one per phrase, where
+    probability is the softmax score of class index 1 for model_name and
+    label says whether that score reaches the threshold.
+    """
+    tokenizer = AutoTokenizer.from_pretrained(model_name)
+    model = AutoModelForSequenceClassification.from_pretrained(model_name)
+    model.eval()
+    inputs = tokenizer(phrases, padding=True, truncation=True, return_tensors="pt")
+    with torch.no_grad():
+        logits = model(**inputs).logits  # shape: (batch_size, 2)
+        # Class index 1 is treated here as each model's target class
+        probs_positive = torch.softmax(logits, dim=1)[:, 1]
+    results = []
+    for p_val in probs_positive:
+        p_val = p_val.item()
+        # Model-neutral labels, since the same function scores both models
+        label = "Flagged" if p_val >= threshold else "Not flagged"
+        results.append((p_val, label))
+    return results
+def main():
+    # Example phrases: Offensive (red-team), Defensive (blue-team), Non-technical
+    phrases = [
+        # 1) Cybersecurity Offensive / red-team
+        "To exfiltrate sensitive data, launch a phishing campaign that tricks employees into revealing their VPN credentials.",
+        # 2) Cybersecurity Defensive / blue-team
+        "We should deploy an EDR solution, monitor all endpoints for intrusion attempts, and enforce strict password policies.",
+        # 3) Cybersecurity marketing
+        "Our marketing team will unveil the new cybersecurity branding materials at next Tuesday’s antivirus product launch",
+        # 4) Not cybersecurity related
+        "I'm excited about the company picnic. There's no cybersecurity topic—just burgers and games."
+    ]
+    # Classify with both models using one shared threshold
+    # (this card recommends prob >= 0.579 for BlueSecureBERT)
+    threshold = 0.515
+    blue_results = classify_texts("HagalazAI/BlueSecureBERT", phrases, threshold)
+    red_results = classify_texts("HagalazAI/RedSecureBERT", phrases, threshold)
+    # Print a Markdown table
+    print("| # | Phrase | Blue Score | Blue Label | Red Score | Red Label |")
+    print("|---|--------|-----------|-----------|----------|----------|")
+    for i, text in enumerate(phrases, start=1):
+        blue_score, blue_label = blue_results[i - 1]
+        red_score, red_label = red_results[i - 1]
+        print(f"| {i} | {text} | {blue_score:.3f} | {blue_label} | {red_score:.3f} | {red_label} |")
+if __name__ == "__main__":
+    main()