---
license: apache-2.0
base_model: deepseek-ai/DeepSeek-R1-0528-Qwen3-8B
tags:
- cybersecurity
- security
- deepseek
- fine-tuned
- merged
- text-generation
datasets:
- Trendyol/Trendyol-Cybersecurity-Instruction-Tuning-Dataset
- AlicanKiraz0/Cybersecurity-Dataset-Fenrir-v2.0
language:
- en
pipeline_tag: text-generation
library_name: transformers
inference: true
---

# DeepSeek-R1-Cybersecurity-8B-Merged

This is the **merged** version of [sainikhiljuluri/DeepSeek-R1-Cybersecurity-8B](https://huggingface.co/sainikhiljuluri/DeepSeek-R1-Cybersecurity-8B),
where the LoRA adapter has been merged into the base model for easier deployment.

## Model Description

A fine-tune of **deepseek-ai/DeepSeek-R1-0528-Qwen3-8B** specialized for **cybersecurity** tasks.
Because the adapter weights are already merged into the base model, it loads directly with `transformers`; PEFT is not required.

## Training Details

| Parameter | Value |
|-----------|-------|
| Base Model | deepseek-ai/DeepSeek-R1-0528-Qwen3-8B |
| Training Samples | ~50,000 |
| Epochs | 2 |
| LoRA Rank | 16 |
| LoRA Alpha | 32 |
| Learning Rate | 2e-4 |

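For readers wondering what "merging" a LoRA adapter means numerically, here is a minimal sketch. It assumes the standard LoRA formulation, in which the low-rank update `B @ A` is scaled by `alpha / rank` and added to the frozen weight; with the rank and alpha listed above, that scale would be 2.0. The helper names and the toy matrices are illustrative only and are not taken from the training code.

```python
# Dependency-free sketch of folding a LoRA update into a base weight.
# Shapes and values are toy examples, not taken from the actual model.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def merge_lora(w, lora_a, lora_b, rank, alpha):
    """Return W + (alpha / rank) * (B @ A), the merged weight matrix."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # (out x r) @ (r x in) -> (out x in)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

# Scaling factor implied by the training table: alpha=32, rank=16.
scale = 32 / 16

# Toy 2x2 weight with a rank-1 adapter, merged with the same 2.0 scale.
w = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [2.0]]   # out x r
lora_a = [[0.5, 0.25]]    # r x in
merged = merge_lora(w, lora_a, lora_b, rank=1, alpha=2)
print(merged)
```

After this step the adapter matrices are no longer needed at inference time, which is why the merged checkpoint can be used without PEFT.
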
## Usage

### Direct Loading
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the merged weights in bfloat16; device_map="auto" places them on the available devices.
model = AutoModelForCausalLM.from_pretrained(
    "sainikhiljuluri/DeepSeek-R1-Cybersecurity-8B-Merged",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(
    "sainikhiljuluri/DeepSeek-R1-Cybersecurity-8B-Merged",
    trust_remote_code=True
)

prompt = "Explain how to detect SQL injection attacks."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# do_sample=True is needed for temperature to take effect; otherwise generate() decodes greedily.
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

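The base model is a reasoning model, so generated text may contain a reasoning section before the final answer. The sketch below shows one way to separate the two. It assumes the reasoning is delimited by `<think>` and `</think>` tags, as in the DeepSeek-R1 family; whether this fine-tune keeps that format, and whether the tags survive decoding, has not been verified here. `split_reasoning` is a hypothetical helper, not part of `transformers`.

```python
# Assumption: reasoning is wrapped in <think>...</think>, as in DeepSeek-R1 models.
# split_reasoning is a hypothetical helper written for this example.

def split_reasoning(text):
    """Return (reasoning, answer) from a generated string.

    If no closing </think> tag is present, the whole text is treated as the answer.
    An opening <think> tag is optional, since some chat templates place it in the prompt.
    """
    head, sep, tail = text.partition("</think>")
    if not sep:
        return "", text.strip()
    # Keep only what follows the last opening tag, dropping any echoed prompt text.
    reasoning = head.rpartition("<think>")[2].strip()
    return reasoning, tail.strip()

reasoning, answer = split_reasoning(
    "<think>Look for anomalous query patterns.</think>Monitor logs for UNION-based payloads."
)
print(answer)
```

In practice the function would be applied to the string returned by `tokenizer.decode(...)`.
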
### Via Inference API
```python
import requests

API_URL = "https://api-inference.huggingface.co/models/sainikhiljuluri/DeepSeek-R1-Cybersecurity-8B-Merged"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}

response = requests.post(API_URL, headers=headers, json={
    "inputs": "What are the indicators of a ransomware attack?",
    "parameters": {"max_new_tokens": 256, "temperature": 0.7}
})
print(response.json())
```

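The JSON returned by the request above is not always the generated text. As a rough sketch, assuming the usual text-generation response shape (a list of objects with a `generated_text` field on success, and an object with an `error` field on failure, for example while the model is still loading), the payload could be unpacked as follows. `extract_generated_text` is a hypothetical helper, and the response shapes should be checked against the current API documentation.

```python
# Assumed response shapes (verify against the Inference API docs):
#   success: [{"generated_text": "..."}]
#   failure: {"error": "...", ...}

def extract_generated_text(payload):
    """Pull the generated string out of a decoded JSON response."""
    if isinstance(payload, dict) and "error" in payload:
        raise RuntimeError(f"Inference API error: {payload['error']}")
    if (
        isinstance(payload, list)
        and payload
        and isinstance(payload[0], dict)
        and "generated_text" in payload[0]
    ):
        return payload[0]["generated_text"]
    raise ValueError(f"Unexpected response shape: {payload!r}")

# Example with a hand-written payload instead of a live request.
sample = [{"generated_text": "Common indicators include mass file renames."}]
print(extract_generated_text(sample))
```

With a live request, the helper would be called as `extract_generated_text(response.json())`.
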
## Cybersecurity Capabilities

- 🔍 Threat analysis and classification
- 🚨 Security alert triage
- 📋 Incident response guidance
- 🦠 Malware analysis
- 📊 MITRE ATT&CK mapping
- 🔐 Vulnerability assessment