
CyberSec-Qwen: Domain-Specialized LLM for Cybersecurity Intelligence

Overview

CyberSec-Qwen is a domain-adapted large language model fine-tuned on cybersecurity question-answering tasks. Built on top of Qwen2.5-1.5B-Instruct, this model is optimized to deliver accurate, structured, and context-aware responses to cybersecurity-related queries.

This model is designed for:

  • Security analysts
  • Students & learners in cybersecurity
  • AI-powered security assistants
  • SOC automation workflows

Model Details

Attribute            Value
-------------------  --------------------------
Base Model           Qwen2.5-1.5B-Instruct
Architecture         Transformer (decoder-only)
Fine-tuning Method   QLoRA (4-bit)
Parameters           1.5B
Trainable Params     18M (1.2%)
Context Length       512 tokens
Precision            BF16
Framework            Transformers + TRL + PEFT

Training Pipeline

This model was fine-tuned using an optimized QLoRA pipeline for memory efficiency and scalability on limited hardware (T4 GPU).

Key techniques:

  • 4-bit quantization (NF4)
  • LoRA adaptation on attention + MLP layers
  • Gradient checkpointing for memory optimization
  • Cosine learning rate scheduling
  • Early stopping for generalization control
  • WandB tracking for experiment monitoring

Training setup:

  • Epochs: 3
  • Learning Rate: 2e-4
  • Effective Batch Size: 8
  • Sequence Length: 512
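
To make the schedule and batch arithmetic above concrete, here is a stdlib-only sketch of a cosine learning-rate decay from the 2e-4 peak, plus one way the effective batch size of 8 can be reached on a T4 via gradient accumulation. The total step count and the per-device/accumulation split are assumptions for illustration; warmup is omitted.

```python
import math

BASE_LR = 2e-4      # from the training setup above
TOTAL_STEPS = 1000  # assumed; the real count depends on dataset size

def cosine_lr(step: int, base_lr: float = BASE_LR, total: int = TOTAL_STEPS) -> float:
    """Cosine decay from base_lr at step 0 down to 0 at the final step."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * step / total))

# Effective batch size 8 = per-device batch x gradient accumulation steps
# (the exact split used in training is an assumption).
per_device_batch = 2
grad_accum_steps = 4
assert per_device_batch * grad_accum_steps == 8

print(cosine_lr(0))            # peak LR at the start (2e-4)
print(cosine_lr(TOTAL_STEPS))  # decays to ~0 at the end
```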

Dataset:

  • Cybersecurity QA dataset
  • Structured in instruction → response format
  • Converted into chat template for alignment with Qwen architecture
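
Qwen chat models use a ChatML-style template. As an illustration of the instruction → response conversion, here is a hand-rolled version of that layout; in a real pipeline the tokenizer's chat-template machinery does this, and the sample pair below is invented, not taken from the dataset.

```python
def to_chatml(instruction: str, response: str) -> str:
    """Render one QA pair in the ChatML layout used by Qwen chat models."""
    return (
        f"<|im_start|>user\n{instruction}<|im_end|>\n"
        f"<|im_start|>assistant\n{response}<|im_end|>\n"
    )

# Illustrative example pair (not from the actual training data)
sample = {
    "instruction": "What is phishing?",
    "response": "Phishing is a social-engineering attack that tricks users "
                "into revealing credentials or installing malware.",
}
print(to_chatml(sample["instruction"], sample["response"]))
```

Training on text in this exact layout is what keeps the fine-tuned model aligned with the base model's chat format.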

Model Merging

After fine-tuning:

  • LoRA adapters were merged back into the base model
  • The result is a standalone, inference-ready checkpoint

This eliminates the dependency on PEFT during deployment.
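
The merge step typically looks like the sketch below (the adapter and output paths are placeholders, and it requires downloading the base weights, so it is not meant to run as-is). Merging is done against the full-precision BF16 base model, not the 4-bit quantized copy used during training.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the BF16 base model (merging needs full-precision weights).
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-1.5B-Instruct", torch_dtype=torch.bfloat16
)

# Attach the trained adapter (placeholder path), fold the LoRA deltas
# into the base weights, and drop the PEFT wrapper.
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")
merged = model.merge_and_unload()

merged.save_pretrained("cybersec-qwen-merged")
AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct").save_pretrained(
    "cybersec-qwen-merged"
)
```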


Inference Example

from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="niranjan2777/cybersec-qwen",
    tokenizer="niranjan2777/cybersec-qwen",
    device_map="auto"
)

response = pipe(
    [{"role": "user", "content": "What is a zero-day exploit?"}],
    max_new_tokens=200
)

# With chat-style input, recent transformers versions return the whole
# conversation (prompt included); the assistant's reply is the last message.
print(response[0]["generated_text"][-1]["content"])

⚠️ Limitations

  • Limited to cybersecurity QA domain (may hallucinate outside domain)
  • Trained on relatively small dataset
  • Not suitable for real-time threat detection decisions
  • Requires human validation for critical security operations

Safety Considerations

This model may generate:

  • Security-related explanations that could be misused
  • General vulnerability insights

Users must ensure:

  • Ethical usage
  • No malicious exploitation
  • Deployment with proper safeguards
