Dataset used to train: sarahwei/cyber_MITRE_tactic_CTI_dataset_v16
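To inspect the training data, the dataset can be loaded with the `datasets` library; a minimal sketch, where the default configuration and the `"train"` split name are assumptions, not documented facts:

```python
from datasets import load_dataset

# Default configuration and split names are assumptions; check the dataset card for the actual schema.
ds = load_dataset("sarahwei/cyber_MITRE_tactic_CTI_dataset_v16")
print(ds)
print(ds["train"][0])
```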
How to use sarahwei/MITRE-v16-tactic-bert-case-based with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="sarahwei/MITRE-v16-tactic-bert-case-based")

# Load the model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("sarahwei/MITRE-v16-tactic-bert-case-based")
model = AutoModelForSequenceClassification.from_pretrained("sarahwei/MITRE-v16-tactic-bert-case-based")
```

This model is fine-tuned from mitre-bert-base-cased on the MITRE ATT&CK version 16 procedure dataset.
You can use the fine-tuned model for text classification. It identifies which tactic in the MITRE ATT&CK framework a sentence belongs to; a sentence or an attack may fall under several tactics.
Note that this model is fine-tuned primarily for cybersecurity text classification. It may not perform well on sentences that are not related to attacks.
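Because the task is multi-label, the pipeline can be asked to return a score for every tactic. In the minimal sketch below, `top_k=None`, `function_to_apply="sigmoid"`, and the 0.5 threshold are choices made for this example rather than settings documented for the model:

```python
from transformers import pipeline

# Return a score for every tactic label; force a sigmoid so scores are
# independent per label (multi-label) rather than a softmax over tactics.
pipe = pipeline(
    "text-classification",
    model="sarahwei/MITRE-v16-tactic-bert-case-based",
    top_k=None,
    function_to_apply="sigmoid",
)

results = pipe(["An attacker performs a SQL injection."])
# results[0] is a list of {"label", "score"} dicts, one per tactic
predicted = [s["label"] for s in results[0] if s["score"] >= 0.5]
print(predicted)
```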
You can also use the model directly with PyTorch:
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import numpy as np

model_id = "sarahwei/MITRE-v16-tactic-bert-case-based"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
)

question = "An attacker performs a SQL injection."

# Tokenize the sentence and run it through the classifier
input_ids = tokenizer(question, return_tensors="pt")
outputs = model(**input_ids)
logits = outputs.logits

# Multi-label prediction: apply a sigmoid to each logit and keep every
# tactic whose probability reaches 0.5
sigmoid = torch.nn.Sigmoid()
probs = sigmoid(logits.squeeze().cpu().float())  # cast to float32 for numpy interop
predictions = np.zeros(probs.shape)
predictions[np.where(probs >= 0.5)] = 1
predicted_labels = [model.config.id2label[idx] for idx, label in enumerate(predictions) if label == 1.0]
print(predicted_labels)
```
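The same steps can be wrapped up to classify several sentences at once. This is a minimal sketch that reuses the `tokenizer`, `model`, and 0.5 threshold from the snippet above; the helper name and example sentences are illustrative:

```python
def predict_tactics(sentences, threshold=0.5):
    """Return the predicted MITRE ATT&CK tactic labels for each sentence."""
    inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    # Sigmoid per label, then keep every tactic above the threshold
    probs = torch.sigmoid(logits).float().cpu().numpy()
    return [
        [model.config.id2label[i] for i, p in enumerate(row) if p >= threshold]
        for row in probs
    ]

print(predict_tactics([
    "An attacker performs a SQL injection.",
    "The malware creates a scheduled task to run at startup.",
]))
```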
| Step | Training Loss | Validation Loss | Accuracy |
|---|---|---|---|
| 500 | 0.149800 | 0.061355 | 0.986081 |
| 1000 | 0.043700 | 0.046901 | 0.988223 |
| 1500 | 0.027700 | 0.043031 | 0.988707 |
Base model: google-bert/bert-base-cased
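For reference, a comparable fine-tune from the base model could be set up with the Trainer API roughly as below. This is only a sketch: the column names, label format, number of labels, and every hyperparameter are assumptions, not the settings used to produce the published checkpoint or the results table above.

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Assumes the dataset exposes a "text" column and a multi-hot float "labels" column.
dataset = load_dataset("sarahwei/cyber_MITRE_tactic_CTI_dataset_v16")
tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-cased")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

tokenized = dataset.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "google-bert/bert-base-cased",
    num_labels=14,                              # assumed: 14 Enterprise ATT&CK tactics
    problem_type="multi_label_classification",  # sigmoid activations with BCE loss
)

args = TrainingArguments(
    output_dir="mitre-tactic-bert",
    eval_strategy="steps",
    eval_steps=500,              # matches the 500-step evaluation interval in the table
    learning_rate=2e-5,          # assumed
    per_device_train_batch_size=16,
    num_train_epochs=3,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],       # split names are assumptions
    eval_dataset=tokenized["validation"],
)
trainer.train()
```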