How to use rootxhacker/mistralai-7B-attack2ttp with PEFT:
from peft import PeftModel
from transformers import AutoModelForCausalLM
base_model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
model = PeftModel.from_pretrained(base_model, "rootxhacker/mistralai-7B-attack2ttp")
This model is built on Mistral-7B. It takes a description of an attack scenario as input and outputs the techniques used by the attacker.
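Because the model returns free text, downstream code usually needs to pull the technique identifiers out of it. The sketch below assumes the output mentions MITRE ATT&CK technique IDs in their standard form (`T1566`, or `T1566.001` for a sub-technique); the card does not document the output format, so treat that as an assumption. The helper name `find_technique_ids` is hypothetical and not part of this model or any library.

```python
import re

# Assumption: techniques appear as MITRE ATT&CK IDs, e.g. T1059 or T1566.001.
TECHNIQUE_ID = re.compile(r"\bT\d{4}(?:\.\d{3})?\b")


def find_technique_ids(text: str) -> list[str]:
    """Return the distinct ATT&CK-style IDs found in text, in order of first appearance."""
    seen: list[str] = []
    for match in TECHNIQUE_ID.findall(text):
        if match not in seen:
            seen.append(match)
    return seen
```

If the model answers with technique names only, this helper returns an empty list and a name-based lookup would be needed instead.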
[More Information Needed]
[More Information Needed]
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

peft_model_id = "rootxhacker/mistralai-7B-attack2ttp"
config = PeftConfig.from_pretrained(peft_model_id)

# Load the base model in 4-bit precision; this needs bitsandbytes and a CUDA GPU.
model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path, return_dict=True, load_in_4bit=True, device_map='auto')
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

def get_completion(query: str, model, tokenizer) -> str:
    # Assumes the model sits on the first CUDA device.
    device = "cuda:0"
    # The prompt text is kept verbatim from the original card, spelling included.
    prompt_template = """
here is intruction you need to map Attack scenario with TTPs
### Question:
{query}
### Answer:
"""
    prompt = prompt_template.format(query=query)
    encodeds = tokenizer(prompt, return_tensors="pt", add_special_tokens=True)
    model_inputs = encodeds.to(device)
    # Sampling is enabled, so repeated calls can return different answers.
    generated_ids = model.generate(**model_inputs, max_new_tokens=1000, do_sample=True, pad_token_id=tokenizer.eos_token_id)
    # The decoded string contains the prompt followed by the generated answer.
    decoded = tokenizer.batch_decode(generated_ids)
    return decoded[0]

# Attach the adapter weights to the base model.
model = PeftModel.from_pretrained(model, peft_model_id)
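`get_completion` returns the full decoded sequence, so the prompt is echoed back ahead of the answer. A minimal sketch of trimming it follows, assuming the answer begins after the `### Answer:` marker from the prompt template and ends at the tokenizer's end-of-sequence token (`</s>` for Mistral tokenizers; pass `tokenizer.eos_token` if unsure). The helper name `extract_answer` is hypothetical.

```python
def extract_answer(decoded: str, marker: str = "### Answer:", eos_token: str = "</s>") -> str:
    """Return only the text generated after the answer marker."""
    # Everything before and including the first marker is the echoed prompt.
    _, found, tail = decoded.partition(marker)
    if not found:
        # Marker missing: fall back to the whole string rather than guessing.
        tail = decoded
    # Drop the end-of-sequence token and anything generated after it.
    return tail.split(eos_token, 1)[0].strip()
```

Typical use would be `extract_answer(get_completion(query, model, tokenizer))`, where `query` is the attack scenario text.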
[More Information Needed]
https://huggingface.co/datasets/tumeteor/Security-TTP-Mapping
[More Information Needed]
@inproceedings{nguyen-srndic-neth-ttpm,
title = "Noise Contrastive Estimation-based Matching Framework for Low-resource Security Attack Pattern Recognition",
author = "Nguyen, Tu and Šrndić, Nedim and Neth, Alexander",
booktitle = "Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics",
month = mar,
year = "2024",
publisher = "Association for Computational Linguistics",
abstract = "Tactics, Techniques and Procedures (TTPs) represent sophisticated attack patterns in the cybersecurity domain, described encyclopedically in textual knowledge bases. Identifying TTPs in cybersecurity writing, often called TTP mapping, is an important and challenging task. Conventional learning approaches often target the problem in the classical multi-class or multilabel classification setting. This setting hinders the learning ability of the model due to a large number of classes (i.e., TTPs), the inevitable skewness of the label distribution and the complex hierarchical structure of the label space. We formulate the problem in a different learning paradigm, where the assignment of a text to a TTP label is decided by the direct semantic similarity between the two, thus reducing the complexity of competing solely over the large labeling space. To that end, we propose a neural matching architecture with an effective sampling-based learn-to-compare mechanism, facilitating the learning process of the matching model despite constrained resources.",
}
Base model
mistralai/Mistral-7B-Instruct-v0.2