---
datasets:
- sarahwei/cyber_MITRE_tactic_CTI_dataset_v16
language:
- en
metrics:
- accuracy
base_model:
- bencyc1129/mitre-bert-base-cased
pipeline_tag: text-classification
library_name: transformers
---
## MITRE-v16-tactic-bert-case-based

This model is fine-tuned from [mitre-bert-base-cased](https://huggingface.co/bencyc1129/mitre-bert-base-cased) on the MITRE ATT&CK version 16 procedure dataset.
## Intended uses & limitations

You can use the fine-tuned model for text classification. It identifies which tactic in the MITRE ATT&CK framework a sentence belongs to; a sentence or an attack may fall under several tactics, so the task is multi-label.

Note that this model is fine-tuned primarily for classifying cybersecurity text. It may not perform well on sentences unrelated to attacks.
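Because the task is multi-label, prediction applies a sigmoid to each tactic's logit independently and keeps every tactic scoring above a threshold, rather than taking a single argmax. A minimal, framework-free sketch of that thresholding step (the logits and tactic names below are illustrative, not actual model output):

```python
import math

def predict_tactics(logits, labels, threshold=0.5):
    """Sigmoid each logit independently and keep labels scoring above the threshold."""
    probs = [1.0 / (1.0 + math.exp(-x)) for x in logits]
    return [label for label, p in zip(labels, probs) if p >= threshold]

# Hypothetical logits for three tactics (illustrative values only).
tactics = ["initial-access", "execution", "exfiltration"]
print(predict_tactics([2.0, -1.0, 0.3], tactics))  # ['initial-access', 'exfiltration']
```

Note that with this scheme a sentence can receive zero, one, or several tactic labels.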
## How to use

You can use the model with PyTorch, for example:
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import numpy as np
import torch

model_id = "sarahwei/MITRE-v16-tactic-bert-case-based"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
)

question = "An attacker performs a SQL injection."
inputs = tokenizer(question, return_tensors="pt")
outputs = model(**inputs)
logits = outputs.logits

# Multi-label prediction: sigmoid each logit and keep every tactic above 0.5.
sigmoid = torch.nn.Sigmoid()
probs = sigmoid(logits.squeeze().cpu())
predictions = np.zeros(probs.shape)
predictions[np.where(probs >= 0.5)] = 1
predicted_labels = [model.config.id2label[idx] for idx, label in enumerate(predictions) if label == 1.0]
```
## Training procedure

### Training parameters

- learning_rate: 2e-5
- train_batch_size: 32
- eval_batch_size: 32
- seed: 0
- num_epochs: 5
- warmup_ratio: 0.01
- weight_decay: 0.001
- optim: adamw_8bit
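If you want to reproduce the fine-tuning with the `transformers` `Trainer`, the hyperparameters above map onto `TrainingArguments` roughly as below. This is a sketch, not the published training script: `output_dir` is a placeholder, and the 8-bit AdamW optimizer is named `adamw_bnb_8bit` in `transformers` (it requires `bitsandbytes`).

```python
from transformers import TrainingArguments

# Sketch: the listed hyperparameters expressed as TrainingArguments.
# "output_dir" is a placeholder; the original training script is not published.
training_args = TrainingArguments(
    output_dir="mitre-v16-tactic",
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=0,
    num_train_epochs=5,
    warmup_ratio=0.01,
    weight_decay=0.001,
    optim="adamw_bnb_8bit",  # 8-bit AdamW via bitsandbytes
)
```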
### Training results

- global_step: 1755
- train_runtime: 315.2685
- train_samples_per_second: 177.722
- train_steps_per_second: 5.567
- total_flos: 7371850396784640.0
- train_loss: 0.06630994546787013

| Step | Training Loss | Validation Loss | Accuracy |
|:----:|:-------------:|:---------------:|:--------:|
| 500  | 0.149800      | 0.061355        | 0.986081 |
| 1000 | 0.043700      | 0.046901        | 0.988223 |
| 1500 | 0.027700      | 0.043031        | 0.988707 |