sarahwei/MITRE-v16-tactic-bert-case-based

Text Classification

text-embeddings-inference


sarahwei committed on Feb 7, 2025 · Commit 3ff2afa (verified) · 1 Parent(s): b24969a

Create README.md

Files changed (1)

README.md +72 -0

README.md ADDED

	@@ -0,0 +1,72 @@

+---
+datasets:
+- sarahwei/cyber_MITRE_tactic_CTI_dataset_v16
+language:
+- en
+metrics:
+- accuracy
+base_model:
+- bencyc1129/mitre-bert-base-cased
+pipeline_tag: text-classification
+library_name: transformers
+---
+## MITRE-v16-tactic-bert-case-based
+This model is fine-tuned from [mitre-bert-base-cased](https://huggingface.co/bencyc1129/mitre-bert-base-cased) on the MITRE ATT&CK version 16 procedure dataset.
+## Intended uses & limitations
+Use this model for text classification: given a sentence, it identifies the tactic or tactics in the MITRE ATT&CK framework that the sentence belongs to.
+A sentence or an attack may fall into several tactics, so the task is multi-label.
+Note that the model is fine-tuned primarily for text classification in the cybersecurity domain.
+It may not perform well on sentences unrelated to attacks.
+## How to use
+You can use the model with PyTorch.
+```python
+from transformers import AutoTokenizer, AutoModelForSequenceClassification
+import numpy as np
+import torch
+model_id = "sarahwei/MITRE-v16-tactic-bert-case-based"
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForSequenceClassification.from_pretrained(
+    model_id,
+    torch_dtype=torch.bfloat16,
+)
+model.eval()  # disable dropout for inference
+question = 'An attacker performs a SQL injection.'
+inputs = tokenizer(question, return_tensors="pt")
+# No gradients are needed for inference
+with torch.no_grad():
+    outputs = model(**inputs)
+logits = outputs.logits
+# Multi-label task: apply a sigmoid to each logit independently.
+# Cast to float32 first, because NumPy has no bfloat16 dtype.
+probs = torch.sigmoid(logits.squeeze().float().cpu()).numpy()
+# Mark every tactic whose probability is at least 0.5
+predictions = np.zeros(probs.shape)
+predictions[np.where(probs >= 0.5)] = 1
+predicted_labels = [model.config.id2label[idx] for idx, label in enumerate(predictions) if label == 1.0]
+```
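The thresholding step in the snippet above can be isolated into a small helper. This is a minimal sketch rather than part of the model card: the name `decode_tactics`, the toy label mapping, and the example logits are hypothetical, and only the 0.5 threshold is taken from the snippet. It expects a flat list of floats, which could be obtained with something like `outputs.logits.squeeze().float().tolist()`.

```python
import math

def decode_tactics(logits, id2label, threshold=0.5):
    """Return (label, probability) pairs whose sigmoid score meets the threshold."""
    selected = []
    for idx, logit in enumerate(logits):
        # Numerically stable sigmoid, applied to each label independently
        if logit >= 0:
            prob = 1.0 / (1.0 + math.exp(-logit))
        else:
            e = math.exp(logit)
            prob = e / (1.0 + e)
        if prob >= threshold:
            selected.append((id2label[idx], prob))
    # Highest-scoring tactics first
    return sorted(selected, key=lambda pair: pair[1], reverse=True)

# Hypothetical three-label mapping and logits, for illustration only
toy_id2label = {0: "Initial Access", 1: "Execution", 2: "Persistence"}
toy_logits = [2.0, -1.0, 0.0]
selected = decode_tactics(toy_logits, toy_id2label)
```

Because each label is scored independently, the helper can return zero, one, or several tactics for the same sentence. Raising `threshold` trades recall for precision.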
+## Training procedure
+### Training parameters
+- learning_rate: 2e-5
+- train_batch_size: 32
+- eval_batch_size: 32
+- seed: 0
+- num_epochs: 5
+- warmup_ratio: 0.01
+- weight_decay: 0.001
+- optim: adamw_8bit
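The parameter names above resemble Hugging Face `TrainingArguments` fields. Assuming the model was trained with the `Trainer` API on a single device (the card does not say so explicitly), the list could map onto keyword arguments roughly as follows. The dictionary name `training_kwargs` and the output path in the usage comment are hypothetical.

```python
# Card hyperparameters expressed with `TrainingArguments` field names.
# Assumption: batch sizes are per device and training ran on one device.
training_kwargs = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 32,
    "per_device_eval_batch_size": 32,
    "seed": 0,
    "num_train_epochs": 5,
    "warmup_ratio": 0.01,
    "weight_decay": 0.001,
    "optim": "adamw_8bit",  # 8-bit AdamW; needs the bitsandbytes package
}

# Usage (requires transformers; the output directory is a placeholder):
# from transformers import TrainingArguments
# args = TrainingArguments(output_dir="./mitre-tactic-bert", **training_kwargs)
```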
+### Training results
+- global_step: 1755
+- train_runtime: 315.2685
+- train_samples_per_second: 177.722
+- train_steps_per_second: 5.567
+- total_flos: 7371850396784640.0
+- train_loss: 0.06630994546787013
+|Step|Training Loss|Validation Loss|Accuracy|
+|:--------:|:------------:|:----------:|:------------:|
+|500|0.149800|0.061355|0.986081|
+|1000|0.043700|0.046901|0.988223|
+|1500|0.027700|0.043031|0.988707|