
CyberSec-Qwen: Domain-Specialized LLM for Cybersecurity Intelligence

Overview

CyberSec-Qwen is a domain-adapted large language model fine-tuned on cybersecurity question-answering tasks. Built on top of Qwen2.5-1.5B-Instruct, this model is optimized to deliver accurate, structured, and context-aware responses to cybersecurity-related queries.

This model is designed for:

  • Security analysts
  • Students & learners in cybersecurity
  • AI-powered security assistants
  • SOC automation workflows

Model Details

Attribute            Value
-------------------  --------------------------
Base Model           Qwen2.5-1.5B-Instruct
Architecture         Transformer (decoder-only)
Fine-tuning Method   QLoRA (4-bit)
Parameters           1.5B
Trainable Params     18M (1.2%)
Context Length       512 tokens
Precision            BF16
Framework            Transformers + TRL + PEFT

Training Pipeline

This model was fine-tuned using an optimized QLoRA pipeline for memory efficiency and scalability on limited hardware (T4 GPU).

Key techniques:

  • 4-bit quantization (NF4)
  • LoRA adaptation on attention + MLP layers
  • Gradient checkpointing for memory optimization
  • Cosine learning rate scheduling
  • Early stopping for generalization control
  • WandB tracking for experiment monitoring

Training setup:

  • Epochs: 3
  • Learning Rate: 2e-4
  • Effective Batch Size: 8
  • Sequence Length: 512
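
To make the schedule and batch arithmetic above concrete, here is a stdlib-only sketch of a cosine learning-rate decay from the 2e-4 peak, plus one way the effective batch size of 8 can be reached on a T4 via gradient accumulation. The total step count and the per-device/accumulation split are assumptions for illustration; warmup is omitted.

```python
import math

BASE_LR = 2e-4      # from the training setup above
TOTAL_STEPS = 1000  # assumed; the real count depends on dataset size

def cosine_lr(step: int, base_lr: float = BASE_LR, total: int = TOTAL_STEPS) -> float:
    """Cosine decay from base_lr at step 0 down to 0 at the final step."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * step / total))

# Effective batch size 8 = per-device batch x gradient accumulation steps
# (the exact split used in training is an assumption).
per_device_batch = 2
grad_accum_steps = 4
assert per_device_batch * grad_accum_steps == 8

print(cosine_lr(0))            # peak LR at the start (2e-4)
print(cosine_lr(TOTAL_STEPS))  # decays to ~0 at the end
```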

Dataset:

  • Cybersecurity QA dataset
  • Structured in instruction → response format
  • Converted into chat template for alignment with Qwen architecture
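
Qwen chat models use a ChatML-style template. As an illustration of the instruction → response conversion, here is a hand-rolled version of that layout; in a real pipeline the tokenizer's chat-template machinery does this, and the sample pair below is invented, not taken from the dataset.

```python
def to_chatml(instruction: str, response: str) -> str:
    """Render one QA pair in the ChatML layout used by Qwen chat models."""
    return (
        f"<|im_start|>user\n{instruction}<|im_end|>\n"
        f"<|im_start|>assistant\n{response}<|im_end|>\n"
    )

# Illustrative example pair (not from the actual training data)
sample = {
    "instruction": "What is phishing?",
    "response": "Phishing is a social-engineering attack that tricks users "
                "into revealing credentials or installing malware.",
}
print(to_chatml(sample["instruction"], sample["response"]))
```

Training on text in this exact layout is what keeps the fine-tuned model aligned with the base model's chat format.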

Model Merging

After fine-tuning:

  • LoRA adapters were merged back into the base model
  • The result is a standalone, inference-ready checkpoint

This eliminates the dependency on PEFT during deployment.
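
The merge step typically looks like the sketch below (the adapter and output paths are placeholders, and it requires downloading the base weights, so it is not meant to run as-is). Merging is done against the full-precision BF16 base model, not the 4-bit quantized copy used during training.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the BF16 base model (merging needs full-precision weights).
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-1.5B-Instruct", torch_dtype=torch.bfloat16
)

# Attach the trained adapter (placeholder path), fold the LoRA deltas
# into the base weights, and drop the PEFT wrapper.
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")
merged = model.merge_and_unload()

merged.save_pretrained("cybersec-qwen-merged")
AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct").save_pretrained(
    "cybersec-qwen-merged"
)
```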


Inference Example

from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="niranjan2777/cybersec-qwen",
    tokenizer="niranjan2777/cybersec-qwen",
    device_map="auto"
)

response = pipe(
    [{"role": "user", "content": "What is a zero-day exploit?"}],
    max_new_tokens=200
)

# With chat-style input, recent transformers versions return the whole
# conversation (prompt included); the assistant's reply is the last message.
print(response[0]["generated_text"][-1]["content"])

⚠️ Limitations

  • Limited to cybersecurity QA domain (may hallucinate outside domain)
  • Trained on relatively small dataset
  • Not suitable for real-time threat detection decisions
  • Requires human validation for critical security operations

Safety Considerations

This model may generate:

  • Security-related explanations that could be misused
  • General vulnerability insights

Users must ensure:

  • Ethical usage
  • No malicious exploitation
  • Deployment with proper safeguards
