---
datasets:
- sarahwei/cyber_MITRE_tactic_CTI_dataset_v16
language:
- en
metrics:
- accuracy
base_model:
- bencyc1129/mitre-bert-base-cased
pipeline_tag: text-classification
library_name: transformers
---
## MITRE-v16-tactic-bert-case-based

This model is fine-tuned from [mitre-bert-base-cased](https://huggingface.co/bencyc1129/mitre-bert-base-cased) on the MITRE ATT&CK version 16 procedure dataset.


## Intended uses & limitations
You can use the fine-tuned model for text classification: it identifies the MITRE ATT&CK tactic(s) that a sentence describes. Because a sentence or an attack may fall under several tactics at once (for example, the ATT&CK technique Valid Accounts appears under Initial Access, Persistence, Privilege Escalation, and Defense Evasion), the model performs multi-label classification.

Note that this model is fine-tuned primarily on cybersecurity text. It may not perform well on sentences that are unrelated to attacks.

## How to use
You can use the model with PyTorch.
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import numpy as np

model_id = "sarahwei/MITRE-v16-tactic-bert-case-based"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
)

question = 'An attacker performs a SQL injection.'
inputs = tokenizer(question, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits

# Multi-label classification: score each tactic independently with a
# sigmoid and keep every tactic whose probability reaches 0.5.
probs = torch.sigmoid(logits.squeeze().float().cpu())
predictions = np.zeros(probs.shape)
predictions[np.where(probs >= 0.5)] = 1
predicted_labels = [model.config.id2label[idx] for idx, label in enumerate(predictions) if label == 1.0]
print(predicted_labels)
```
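
As an alternative, here is a minimal sketch using the high-level `pipeline` API (an illustration assuming the standard `transformers` text-classification pipeline; `top_k=None` returns a score for every tactic, and `function_to_apply="sigmoid"` forces the multi-label activation in case the model config defaults to softmax):

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="sarahwei/MITRE-v16-tactic-bert-case-based",
    top_k=None,  # return a score for every tactic, not just the top one
)

# Score each tactic independently and keep those above the 0.5 threshold.
scores = classifier(
    "An attacker performs a SQL injection.",
    function_to_apply="sigmoid",
)[0]
predicted = [s["label"] for s in scores if s["score"] >= 0.5]
print(predicted)
```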

## Training procedure
### Training parameters
- learning_rate: 2e-5
- train_batch_size: 32
- eval_batch_size: 32
- seed: 0
- num_epochs: 5
- warmup_ratio: 0.01
- weight_decay: 0.001
- optim: adamw_8bit
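
For orientation, these settings map onto Hugging Face `TrainingArguments` roughly as follows (a sketch assuming a standard `Trainer` setup, not the authors' actual script; `output_dir` is a placeholder and dataset loading is omitted):

```python
from transformers import TrainingArguments

# Sketch only: reproduces the hyperparameters listed above.
# output_dir is a placeholder; dataset and Trainer wiring are not shown.
training_args = TrainingArguments(
    output_dir="mitre-v16-tactic-bert",  # hypothetical path
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=0,
    num_train_epochs=5,
    warmup_ratio=0.01,
    weight_decay=0.001,
    optim="adamw_8bit",  # 8-bit AdamW; requires the bitsandbytes package
)
```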

### Training results
- global_step=1755
- train_runtime: 315.2685 s
- train_samples_per_second: 177.722
- train_steps_per_second: 5.567
- total_flos: 7371850396784640.0
- train_loss: 0.06630994546787013


| Step | Training Loss | Validation Loss | Accuracy |
|:----:|:-------------:|:---------------:|:--------:|
| 500  | 0.149800      | 0.061355        | 0.986081 |
| 1000 | 0.043700      | 0.046901        | 0.988223 |
| 1500 | 0.027700      | 0.043031        | 0.988707 |