metadata
tags:
  - text-classification
  - security
  - blue-team
  - roberta
license: odc-by
datasets:
  - trendmicro-ailab/Primus-FineWeb
metrics:
  - precision
  - recall
  - f1
pipeline_tag: text-classification
library_name: transformers

BlueSecureBERT 🟦🛡️

Detects blue-team / defensive security text (English), with a focus on technical detection-engineering workflows (SIEM ingestion, Sigma rules, alert-tuning).

| Split | Precision | Recall | F1 | F₂ | CE-loss | Threshold |
|---|---|---|---|---|---|---|
| Validation | 0.949 | 0.991 | 0.969 | 0.982 | 0.011 | 0.579 |

Recommended cut-off: prob >= 0.579 (arg-max on the validation split)
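
The F1 and F₂ columns follow from precision and recall through the standard F-beta formula. The sketch below recomputes them from the rounded validation figures as a sanity check. `f_beta` is a helper name introduced here for illustration, not part of the model's code, and last-digit differences are expected because the inputs are already rounded.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Standard F-beta score; beta > 1 weights recall more heavily than precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Rounded validation figures reported in this card.
precision, recall = 0.949, 0.991

f1 = f_beta(precision, recall, beta=1.0)
f2 = f_beta(precision, recall, beta=2.0)
print(f"F1 = {f1:.4f}, F2 = {f2:.4f}")
```

Because recall is higher than precision here, F₂ (which favors recall) comes out above F1.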

Demo

| Phrase | Blue Score |
|---|---|
| To exfiltrate sensitive data, launch a phishing campaign that tricks employees into revealing their VPN credentials. | 0.066 |
| We should deploy an EDR solution, monitor all endpoints for intrusion attempts, and enforce strict password policies. | 0.557 |
| Our marketing team will unveil the new cybersecurity branding materials at next Tuesday’s antivirus product launch | 0.256 |
| I'm excited about the company picnic. There's no cybersecurity topic—just burgers and games. | 0.272 |

Intended uses & limits

  • Triage large corpora for technical detection-engineering content: Sysmon, Sigma, SIEM, and indicator-of-compromise (IoC) data.
  • Input language: English
  • No external test set yet → treat numbers as optimistic
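
A corpus-triage pass can be sketched as a batched filter that keeps only texts scoring at or above the cut-off. This is a minimal sketch, not the author's pipeline: `batched`, `triage`, and `toy_score_fn` are hypothetical names, and the scorer is injected as a callable so that a real model wrapper can be swapped in for the keyword placeholder used here.

```python
from typing import Callable, Iterable, Iterator, List, Sequence, Tuple


def batched(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` items."""
    batch: List[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final partial batch
        yield batch


def triage(
    texts: Iterable[str],
    score_fn: Callable[[Sequence[str]], Sequence[float]],
    threshold: float = 0.579,  # recommended cut-off from this card
    batch_size: int = 32,
) -> Iterator[Tuple[str, float]]:
    """Yield (text, score) for every text whose score reaches the threshold."""
    for batch in batched(texts, batch_size):
        for text, score in zip(batch, score_fn(batch)):
            if score >= threshold:
                yield text, score


# Placeholder scorer so the sketch runs without downloading the model.
# It is NOT the classifier, just a keyword check returning made-up scores.
def toy_score_fn(batch: Sequence[str]) -> List[float]:
    return [0.9 if "sigma" in text.lower() else 0.1 for text in batch]


docs = ["Tuning a Sigma rule for noisy Sysmon events", "Company picnic schedule"]
kept = list(triage(docs, toy_score_fn))
print(kept)
```

Streaming through a generator keeps memory use flat on large corpora, since only one batch is held at a time.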

Training data

| Label | Rows |
|---|---|
| Offensive | 30 746 |
| Defensive | 19 550 |
| Other | 130 000 |
| Total | 180 296 |
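
The row counts imply a fairly imbalanced binary task. The sketch below works out the class shares, under the assumption (not stated explicitly in this card) that Defensive rows are the positives and Offensive plus Other rows are the negatives for this model.

```python
# Row counts from the training-data table.
rows = {"Offensive": 30_746, "Defensive": 19_550, "Other": 130_000}

total = sum(rows.values())
shares = {label: count / total for label, count in rows.items()}

# Assumption: Defensive is the positive class, everything else is negative.
positive_share = shares["Defensive"]
negatives_per_positive = (total - rows["Defensive"]) / rows["Defensive"]

for label, share in shares.items():
    print(f"{label:<10} {share:6.1%}")
print(f"negatives per positive: {negatives_per_positive:.1f}")
```

Under that assumption, positives make up roughly one row in nine.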

Model details

| Field | Value |
|---|---|
| Base encoder | ehsanaghaei/SecureBERT (RoBERTa-base, 125 M) |
| Objective | One-vs-rest, focal-loss (γ = 2) |
| Training | 3 epochs · micro-batch 16 · LR 2e-5 |
| Hardware | 1× RTX 4090 (≈ 41 min) |
| Inference dtype | FP16-safe |
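
To illustrate what the focal-loss objective with γ = 2 does, the sketch below implements the standard per-example definition, FL = -(1 - p_t)^γ · log(p_t), in plain Python. This is the textbook form only: the card does not say whether an α class weight was used or how the training code implemented the loss, so `focal_loss` here is a hypothetical helper, not the author's implementation.

```python
import math


def focal_loss(p_true: float, gamma: float = 2.0) -> float:
    """Focal loss for one example, given the probability assigned to its true class."""
    p_true = min(max(p_true, 1e-12), 1.0)  # clamp to avoid log(0)
    return -((1.0 - p_true) ** gamma) * math.log(p_true)


# With gamma = 0 the focal loss reduces to ordinary cross-entropy.
for p in (0.95, 0.6, 0.2):
    ce = focal_loss(p, gamma=0.0)
    fl = focal_loss(p, gamma=2.0)
    print(f"p_true={p:.2f}  cross-entropy={ce:.4f}  focal(gamma=2)={fl:.4f}")
```

The (1 - p_t)^γ factor shrinks the loss of confidently correct examples much more than that of hard ones, so training focuses on the examples the model still gets wrong.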

Training Data License

Quick start

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

def classify_texts(model_name, phrases, threshold=0.579, positive_label="Defensive"):
    """
    Returns a list of (probability_positive, label) tuples, one per phrase.

    probability_positive is the softmax probability of class index 1, which
    this snippet treats as the model's target class (Defensive for
    BlueSecureBERT, Offensive for RedSecureBERT).
    """
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()

    inputs = tokenizer(phrases, padding=True, truncation=True, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (batch_size, 2)
        probs_positive = torch.softmax(logits, dim=1)[:, 1]  # probability of class index 1

    results = []
    for p_val in probs_positive.tolist():
        label = positive_label if p_val >= threshold else f"Not {positive_label}"
        results.append((p_val, label))
    return results

def main():
    # Example phrases: offensive (red-team), defensive (blue-team), marketing, non-technical
    phrases = [
        # 1) Cybersecurity offensive / red-team
        "To exfiltrate sensitive data, launch a phishing campaign that tricks employees into revealing their VPN credentials.",
        # 2) Cybersecurity defensive / blue-team
        "We should deploy an EDR solution, monitor all endpoints for intrusion attempts, and enforce strict password policies.",
        # 3) Cybersecurity marketing
        "Our marketing team will unveil the new cybersecurity branding materials at next Tuesday’s antivirus product launch",
        # 4) Not cybersecurity-related
        "I'm excited about the company picnic. There's no cybersecurity topic—just burgers and games."
    ]

    # Classify with both models. 0.579 is the recommended cut-off for
    # BlueSecureBERT (see above); 0.515 is applied to RedSecureBERT.
    blue_results = classify_texts("HagalazAI/BlueSecureBERT", phrases, threshold=0.579, positive_label="Defensive")
    red_results = classify_texts("HagalazAI/RedSecureBERT", phrases, threshold=0.515, positive_label="Offensive")

    # Print a Markdown table
    print("| # | Phrase | Blue Score | Blue Label | Red Score | Red Label |")
    print("|---|--------|-----------|-----------|----------|----------|")
    for i, text in enumerate(phrases, start=1):
        blue_score, blue_label = blue_results[i - 1]
        red_score, red_label = red_results[i - 1]
        print(f"| {i} | {text} | {blue_score:.3f} | {blue_label} | {red_score:.3f} | {red_label} |")

if __name__ == "__main__":
    main()