# CyberSec-Qwen: Domain-Specialized LLM for Cybersecurity Intelligence

## Overview
CyberSec-Qwen is a domain-adapted large language model fine-tuned on cybersecurity question-answering tasks. Built on top of Qwen2.5-1.5B-Instruct, this model is optimized to deliver accurate, structured, and context-aware responses to cybersecurity-related queries.
This model is designed for:
- Security analysts
- Students & learners in cybersecurity
- AI-powered security assistants
- SOC automation workflows
## Model Details
| Attribute | Value |
|---|---|
| Base Model | Qwen2.5-1.5B-Instruct |
| Architecture | Transformer (Decoder-only) |
| Fine-tuning Method | QLoRA (4-bit) |
| Parameters | 1.5B |
| Trainable Params | |
| Context Length | 512 tokens |
| Precision | BF16 |
| Framework | Transformers + TRL + PEFT |
## Training Pipeline

This model was fine-tuned with a memory-efficient QLoRA pipeline, designed to run on limited hardware (a single T4 GPU).
Key techniques:
- 4-bit quantization (NF4)
- LoRA adaptation on attention + MLP layers
- Gradient checkpointing for memory optimization
- Cosine learning rate scheduling
- Early stopping for generalization control
- WandB tracking for experiment monitoring
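The quantization and adapter settings listed above can be sketched as follows. This is an illustrative configuration, not the exact one used in training: the LoRA rank, alpha, dropout, and target-module list are assumptions.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization with BF16 compute, the core of QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

# LoRA on attention + MLP projection layers; r and alpha are illustrative.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",   # attention
        "gate_proj", "up_proj", "down_proj",      # MLP
    ],
    task_type="CAUSAL_LM",
)
```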
Training setup:
- Epochs: 3
- Learning Rate: 2e-4
- Effective Batch Size: 8
- Sequence Length: 512
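Under those hyperparameters, the trainer configuration might look like the sketch below. The per-device batch size / accumulation split, evaluation cadence, and early-stopping patience are assumptions; only the epochs, learning rate, scheduler, effective batch size, and checkpointing/tracking choices come from the card.

```python
from transformers import TrainingArguments, EarlyStoppingCallback

training_args = TrainingArguments(
    output_dir="cybersec-qwen-qlora",
    num_train_epochs=3,
    learning_rate=2e-4,
    per_device_train_batch_size=2,     # effective batch size 8 via
    gradient_accumulation_steps=4,     # 2 x 4 accumulation (assumed split)
    lr_scheduler_type="cosine",
    gradient_checkpointing=True,
    bf16=True,
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,       # needed for early stopping
    report_to="wandb",
)

# Early stopping for generalization control (patience is an assumption).
early_stop = EarlyStoppingCallback(early_stopping_patience=2)
```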
Dataset:
- Cybersecurity QA dataset
- Structured in instruction → response format
- Converted into chat template for alignment with Qwen architecture
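A minimal sketch of that conversion, assuming the raw dataset uses `instruction` and `response` column names (the actual field names may differ):

```python
def to_chat(example):
    """Map one instruction -> response pair to the chat-message
    format consumed by Qwen's chat template."""
    return {
        "messages": [
            {"role": "user", "content": example["instruction"]},
            {"role": "assistant", "content": example["response"]},
        ]
    }

sample = {
    "instruction": "What does CVE stand for?",
    "response": "Common Vulnerabilities and Exposures.",
}
print(to_chat(sample)["messages"][1]["role"])  # assistant
```

With `datasets`, this function would typically be applied via `dataset.map(to_chat)` before tokenization.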
## Model Merging
After fine-tuning:
- LoRA adapters were merged into the base model
- Resulting in a standalone inference-ready model
This eliminates dependency on PEFT during deployment.
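A sketch of that merge step using PEFT. The adapter path is a placeholder, and loading the base model requires downloading the 1.5B checkpoint:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-1.5B-Instruct", torch_dtype=torch.bfloat16
)

# Attach the trained LoRA adapters, then fold them into the base weights.
model = PeftModel.from_pretrained(base, "path/to/lora-adapters")
merged = model.merge_and_unload()

# Save a standalone checkpoint that no longer needs PEFT at load time.
merged.save_pretrained("cybersec-qwen-merged")
AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct").save_pretrained(
    "cybersec-qwen-merged"
)
```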
## Inference Example
```python
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="niranjan2777/cybersec-qwen",
    tokenizer="niranjan2777/cybersec-qwen",
    device_map="auto",
)

# Chat-style input: the pipeline applies the model's chat template.
response = pipe(
    [{"role": "user", "content": "What is a zero-day exploit?"}],
    max_new_tokens=200,
)

# For chat inputs, "generated_text" holds the full message list,
# with the assistant's reply appended as the last message.
print(response[0]["generated_text"][-1]["content"])
```
## ⚠️ Limitations
- Limited to cybersecurity QA domain (may hallucinate outside domain)
- Trained on relatively small dataset
- Not suitable for real-time threat detection decisions
- Requires human validation for critical security operations
## Safety Considerations
This model may generate:
- Security-related explanations that could be misused
- General vulnerability insights
Users must ensure:
- Ethical usage
- No malicious exploitation
- Deployment with proper safeguards