🛡️ kwCyber-AI-Agent

The First Arabic-English Cybersecurity AI Agent

Built by Nalzankii 🇰🇼

Specialized AI Agent for Cybersecurity Education, Penetration Testing Guidance, and Threat Analysis

🎯 Model Overview

kwCyber-AI-Agent is a purpose-built cybersecurity AI agent designed to serve as an intelligent assistant for security professionals, students, and enthusiasts in the Arab world and beyond. Unlike general-purpose LLMs, it is fine-tuned specifically on cybersecurity knowledge, with the goal of domain-level expertise. The model is still in development, so the capabilities described below are design targets.

Key Highlights

🌐 Bilingual: Native Arabic and English cybersecurity expertise
🔧 Tool-Use: Can execute security tools (Nmap, Nikto, CVE lookup, etc.)
🎓 Educational: Personalized cybersecurity learning paths
🏴 CTF Expert: Challenge walkthroughs and guidance
🔍 Threat Analysis: Real-time vulnerability assessment
🇰🇼 Made in Kuwait: Built for the Kuwait Cyber platform

🧠 Model Details

Attribute	Details
Developer	Nalzankii (Kuwait Cyber)
Base Architecture	Llama 4 (Transformer)
Parameters	17B active, 16 experts (per the Llama-4-Scout-17B-16E base)
Context Length	8,192 tokens
Languages	Arabic (ar), English (en)
License	Llama 4 Community License
Domain	Cybersecurity
Training Stage	🚧 In Development

Architecture

kwCyber-AI-Agent
├── Base: Llama 4 Architecture
├── Custom Tokenizer (64K vocab, Arabic-optimized)
├── Fine-tuned on Cybersecurity Corpus
├── Tool-Use Layer (Function Calling)
├── Safety Guardrails (Content Filtering)
└── RAG Integration (Knowledge Base)

🎓 Intended Uses

Primary Use Cases

Cybersecurity Q&A — Ask security questions in Arabic or English
Vulnerability Analysis — Analyze CVEs, understand attack vectors
CTF Assistance — Get hints and walkthroughs for CTF challenges
Learning Mentor — Personalized study plans for security certifications
Tool Guidance — Learn how to use security tools effectively
Threat Intelligence — Stay updated on latest threats and vulnerabilities

Example Interactions

Arabic:

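User: ما هو هجوم SQL Injection وكيف أحمي تطبيقي منه؟
(English: What is a SQL Injection attack, and how do I protect my application from it?)

kwCyber-AI-Agent: هجوم SQL Injection هو نوع من الهجمات يستغل ثغرات في التطبيقات 
التي تتعامل مع قواعد البيانات. المهاجم يقوم بحقن أوامر SQL خبيثة من خلال حقول 
الإدخال...
(English: SQL Injection is a type of attack that exploits vulnerabilities in applications that interact with databases. The attacker injects malicious SQL commands through input fields...)

للحماية:
1. استخدم Prepared Statements (استعلامات محضّرة)
2. فعّل Input Validation
3. طبّق مبدأ Least Privilege على قاعدة البيانات
4. استخدم WAF (Web Application Firewall)
(English: To protect against it: 1. Use prepared statements. 2. Enable input validation. 3. Apply the principle of least privilege to the database. 4. Use a WAF, a Web Application Firewall.)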
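The first mitigation in the answer above, prepared statements, can be illustrated with Python's standard sqlite3 module. This is only a minimal sketch: the users table, its columns, and the find_user helper are hypothetical names chosen for the example.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, username: str) -> list:
    """Look up a user with a parameterized query (hypothetical schema)."""
    # The ? placeholder passes the value separately from the SQL text,
    # so user input is treated as data and never parsed as SQL.
    cur = conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    )
    return cur.fetchall()


# In-memory database with two sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany("INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)])

print(find_user(conn, "alice"))          # [(1, 'alice')]
print(find_user(conn, "' OR '1'='1"))    # [] (the payload matches no username)
```

With string concatenation the second call would return every row; with the placeholder it is compared as a literal username and matches nothing.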

English:

User: How do I perform a basic Nmap scan on a target?

kwCyber-AI-Agent: Here's how to use Nmap for reconnaissance:

Basic scan: nmap -sV -sC target_ip
- -sV: Version detection
- -sC: Default scripts

⚠️ Important: Only scan targets you have explicit authorization to test.
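The authorization warning above can be enforced in code before Nmap is ever invoked. The sketch below only builds the command from the example; the AUTHORIZED_TARGETS allowlist and both helper functions are hypothetical and not part of the model or of Nmap.

```python
import ipaddress
import re

# Hypothetical allowlist: targets you hold written authorization to test.
AUTHORIZED_TARGETS = {"192.0.2.10", "scanme.example.org"}

# Hostname labels: letters, digits, hyphens; no leading or trailing hyphen.
_HOSTNAME_RE = re.compile(
    r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)(\.(?!-)[A-Za-z0-9-]{1,63}(?<!-))*$"
)


def is_valid_target(target: str) -> bool:
    """Accept a literal IP address or a syntactically valid hostname."""
    try:
        ipaddress.ip_address(target)
        return True
    except ValueError:
        return bool(_HOSTNAME_RE.match(target))


def build_nmap_command(target: str) -> list:
    """Return the argument list for `nmap -sV -sC <target>`."""
    if not is_valid_target(target):
        raise ValueError(f"not a valid IP address or hostname: {target!r}")
    if target not in AUTHORIZED_TARGETS:
        raise PermissionError(f"no authorization on record for {target!r}")
    # A list (not a shell string) keeps the target from being shell-interpreted.
    return ["nmap", "-sV", "-sC", target]


print(build_nmap_command("192.0.2.10"))
```

The returned list can be passed to subprocess.run(..., capture_output=True, text=True) on a machine where Nmap is installed.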

Tool Use:

User: Scan example.com for open ports

kwCyber-AI-Agent: [Calling: scan_target]
{
  "target": "example.com",
  "scan_type": "port",
  "intensity": "normal"
}

Results: Found 3 open ports...
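A host application has to extract the tool name and arguments from a reply like the one above before it can run anything. The following is a rough sketch that assumes the reply contains a "[Calling: name]" marker followed directly by a JSON object, as in the example; the parse_tool_call helper is hypothetical and the final output format of the model may differ.

```python
import json
import re

# Matches the "[Calling: tool_name]" marker and any whitespace after it.
_CALL_RE = re.compile(r"\[Calling:\s*(?P<name>[A-Za-z_]\w*)\]\s*")


def parse_tool_call(text: str):
    """Return (tool_name, arguments), or None if the text has no call marker."""
    match = _CALL_RE.search(text)
    if match is None:
        return None
    # raw_decode parses exactly one JSON value starting at the given index
    # and ignores any text that follows it (for example a results summary).
    args, _end = json.JSONDecoder().raw_decode(text, match.end())
    if not isinstance(args, dict):
        raise ValueError("tool arguments must be a JSON object")
    return match.group("name"), args


reply = """[Calling: scan_target]
{
  "target": "example.com",
  "scan_type": "port",
  "intensity": "normal"
}
"""

print(parse_tool_call(reply))
```

If the marker is present but no valid JSON follows, json.JSONDecodeError is raised, which lets the caller reject a malformed call instead of guessing.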

🏋️ Training

Training Stages

Stage	Description	Data Size	Status
Stage 1	Continued Pre-training on Cybersecurity Corpus	100GB+ text	🔜 Planned
Stage 2	Supervised Fine-Tuning (SFT)	200K+ Q&A pairs	🔜 Planned
Stage 3	DPO Alignment	10K+ preference pairs	🔜 Planned
Stage 4	Tool-Use Training	20K+ function calls	🔜 Planned

Training Data Sources

MITRE ATT&CK — Adversary tactics and techniques
NVD/CVE Database — Vulnerability records
OWASP — Web application security
CWE — Common weakness patterns
CTF Writeups — HackTheBox, TryHackMe, PicoCTF
Security Research Papers — IEEE, ACM, arXiv
Custom Arabic Dataset — Translated and original Arabic security content

Training Configuration

training:
  base_model: meta-llama/Llama-4-Scout-17B-16E-Instruct
  method: QLoRA
  lora_r: 64
  lora_alpha: 128
  learning_rate: 2e-5
  batch_size: 16
  gradient_accumulation: 4
  epochs: 3
  warmup_ratio: 0.1
  optimizer: adamw_torch
  scheduler: cosine
  max_seq_length: 8192
  precision: bf16
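A few derived quantities follow from this configuration. The sketch below works them out under stated assumptions that are not in the config itself: batch_size is taken as per-device, training runs on a single device, the dataset is the 200K lower bound quoted for Stage 2, and warmup steps are rounded up.

```python
import math

# Values copied from the training configuration above.
config = {
    "lora_r": 64,
    "lora_alpha": 128,
    "batch_size": 16,
    "gradient_accumulation": 4,
    "epochs": 3,
    "warmup_ratio": 0.1,
}

NUM_EXAMPLES = 200_000  # assumption: Stage 2 lower bound ("200K+ Q&A pairs")
NUM_DEVICES = 1         # assumption: single device

# Examples consumed per optimizer step.
effective_batch = config["batch_size"] * config["gradient_accumulation"] * NUM_DEVICES

# Standard LoRA scales the low-rank update by alpha / r.
lora_scale = config["lora_alpha"] / config["lora_r"]

steps_per_epoch = math.ceil(NUM_EXAMPLES / effective_batch)
total_steps = steps_per_epoch * config["epochs"]
warmup_steps = math.ceil(total_steps * config["warmup_ratio"])

print(effective_batch)   # 64
print(lora_scale)        # 2.0
print(steps_per_epoch)   # 3125
print(total_steps)       # 9375
print(warmup_steps)      # 938
```

With more devices or a larger dataset the step counts change proportionally, while the LoRA scale of 2.0 depends only on lora_alpha and lora_r.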

🔧 Agent Capabilities

Supported Tools

Tool	Capability	Integration
🔍 Nmap	Port scanning & service detection	CLI Wrapper
🌐 Nikto	Web vulnerability scanning	CLI Wrapper
🗄️ SQLmap	SQL injection testing	Sandboxed
🦠 VirusTotal	Malware analysis	REST API
🌍 Shodan	Internet-wide scanning	REST API
📋 CVE API	Vulnerability lookup	REST API
🔎 WHOIS	Domain information	Python Lib
📡 DNS	DNS reconnaissance	Python Lib
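As an illustration of a REST-based integration, the sketch below looks up a CVE record. The table does not say which backend the CVE API row uses; this example assumes the public NVD CVE API 2.0, and the helper names are hypothetical.

```python
import json
import re
import urllib.parse
import urllib.request
from typing import Optional

# Assumption: NVD CVE API 2.0 as the lookup backend.
NVD_URL = "https://services.nvd.nist.gov/rest/json/cves/2.0"
_CVE_ID_RE = re.compile(r"^CVE-\d{4}-\d{4,}$")


def build_cve_url(cve_id: str) -> str:
    """Validate the CVE identifier and build the query URL."""
    if not _CVE_ID_RE.match(cve_id):
        raise ValueError(f"not a CVE identifier: {cve_id!r}")
    return f"{NVD_URL}?{urllib.parse.urlencode({'cveId': cve_id})}"


def english_description(payload: dict) -> Optional[str]:
    """Pull the first English description out of an NVD-style response."""
    for item in payload.get("vulnerabilities", []):
        for desc in item.get("cve", {}).get("descriptions", []):
            if desc.get("lang") == "en":
                return desc.get("value")
    return None


def lookup_cve(cve_id: str, timeout: float = 10.0) -> Optional[str]:
    """Fetch a CVE record over HTTPS and return its English description."""
    with urllib.request.urlopen(build_cve_url(cve_id), timeout=timeout) as resp:
        return english_description(json.load(resp))


print(build_cve_url("CVE-2021-44228"))
```

Calling lookup_cve("CVE-2021-44228") performs the network request; validating the identifier first keeps arbitrary model output out of the URL.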

Function Calling Format

{
  "name": "scan_target",
  "description": "Perform a security scan on target",
  "parameters": {
    "target": {
      "type": "string",
      "description": "IP address or domain"
    },
    "scan_type": {
      "type": "string",
      "enum": ["port", "vuln", "web", "dns", "full"]
    },
    "intensity": {
      "type": "string",
      "enum": ["light", "normal", "aggressive"],
      "default": "normal"
    }
  }
}
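Arguments produced by the model should be checked against this definition before a tool runs. The sketch below is one possible validator; it assumes that any parameter without a "default" is required, which the definition above does not state, and validate_arguments is a hypothetical helper.

```python
# Parameter definitions copied from the scan_target format above.
SCAN_TARGET_PARAMS = {
    "target": {"type": "string"},
    "scan_type": {
        "type": "string",
        "enum": ["port", "vuln", "web", "dns", "full"],
    },
    "intensity": {
        "type": "string",
        "enum": ["light", "normal", "aggressive"],
        "default": "normal",
    },
}


def validate_arguments(args: dict, params: dict = SCAN_TARGET_PARAMS) -> dict:
    """Return a cleaned copy of args with defaults applied, or raise."""
    unknown = set(args) - set(params)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    cleaned = {}
    for name, spec in params.items():
        if name in args:
            value = args[name]
        elif "default" in spec:
            value = spec["default"]
        else:
            # Assumption: no default means the parameter is required.
            raise ValueError(f"missing required parameter: {name}")
        if spec["type"] == "string" and not isinstance(value, str):
            raise TypeError(f"{name} must be a string")
        if "enum" in spec and value not in spec["enum"]:
            raise ValueError(f"{name} must be one of {spec['enum']}")
        cleaned[name] = value
    return cleaned


print(validate_arguments({"target": "example.com", "scan_type": "port"}))
```

Here the missing intensity is filled in with its default, "normal", while an unexpected key or an out-of-enum value is rejected.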

🚀 Quick Start

Installation

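# Requires: pip install transformers accelerate torch
# (accelerate is needed for device_map="auto")
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Nalzankii/kwCyber-AI-Agent"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",  # use the dtype stored in the checkpoint
    device_map="auto"    # place layers on the available devices
)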

Basic Chat

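messages = [
    # System prompt (Arabic): "You are kwCyber-AI-Agent, a cybersecurity expert who speaks Arabic and English."
    {"role": "system", "content": "أنت kwCyber-AI-Agent، خبير أمن سيبراني يتحدث العربية والإنجليزية."},
    # User prompt (Arabic): "What is the best way to scan a network?"
    {"role": "user", "content": "ما هي أفضل طريقة لفحص شبكة؟"}
]

# add_generation_prompt=True appends the assistant turn marker so the model answers
# rather than continuing the user message; return_dict=True also returns the attention mask.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)
# temperature only takes effect when sampling is enabled.
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, not the echoed prompt.
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(response)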

API Usage (Hugging Face Inference)

from huggingface_hub import InferenceClient

client = InferenceClient("Nalzankii/kwCyber-AI-Agent")

response = client.chat_completion(
    messages=[
        {"role": "system", "content": "You are kwCyber-AI-Agent, a cybersecurity expert."},
        {"role": "user", "content": "Explain XSS attacks"}
    ],
    max_tokens=512
)

print(response.choices[0].message.content)

Using with vLLM (Production)

pip install vllm

python -m vllm.entrypoints.openai.api_server \
    --model Nalzankii/kwCyber-AI-Agent \
    --port 8000 \
    --max-model-len 8192
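Once the server above is running, it exposes an OpenAI-compatible chat endpoint. The sketch below queries it with the Python standard library only; the localhost address is an assumption matching --port 8000, and build_chat_request and ask are hypothetical helpers.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # assumption: server runs on this machine
MODEL = "Nalzankii/kwCyber-AI-Agent"   # must match the --model value


def build_chat_request(prompt: str, max_tokens: int = 512) -> urllib.request.Request:
    """Build a POST request for the /chat/completions endpoint."""
    body = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, timeout: float = 60.0) -> str:
    """Send the prompt to the running server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(prompt), timeout=timeout) as resp:
        reply = json.load(resp)
    return reply["choices"][0]["message"]["content"]


# Example (needs the server to be running):
#   print(ask("Explain XSS attacks"))
```

Any OpenAI-compatible client library should work the same way when pointed at the same base URL.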

📊 Benchmarks

⏳ Benchmarks will be published after training completion.

Planned Evaluations

Benchmark	Description
CyberBench	Cybersecurity knowledge assessment
SecQA	Security question answering
CTF-Eval	Capture The Flag problem solving
Arabic-NLU	Arabic language understanding
Tool-Use Accuracy	Function calling correctness
Safety Score	Harmful content resistance

⚠️ Limitations & Ethical Use

Limitations

The model is specialized in cybersecurity and may underperform on general topics
Arabic cybersecurity terminology is still evolving; some terms may vary
Tool execution requires proper sandboxing and authorization
Not a replacement for professional security audits

Ethical Guidelines

⚠️ IMPORTANT: This model is designed for DEFENSIVE and EDUCATIONAL purposes only.

✅ DO: Use for learning, authorized testing, CTF competitions, security research
❌ DON'T: Use for unauthorized access, creating malware, attacking systems without permission
⚖️ COMPLY: Follow Kuwait Cybercrime Law No. 63/2015 and all applicable regulations
🛡️ RESPONSIBLE: Always obtain proper authorization before any security testing

Safety Measures

Built-in content filtering for harmful requests
Requires target authorization for scanning operations
Logging and audit trail for all agent actions
Rate limiting to prevent abuse
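Two of the measures above, rate limiting and audit logging, can be sketched in a few lines. This is an illustrative sketch rather than the project's implementation: the sliding-window policy, the RateLimiter class, and the audit record fields are all assumptions.

```python
import json
import time
from collections import defaultdict, deque


class RateLimiter:
    """Sliding window: at most max_calls per window_seconds for each user."""

    def __init__(self, max_calls: int, window_seconds: float, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self._calls = defaultdict(deque)

    def allow(self, user: str) -> bool:
        now = self.clock()
        calls = self._calls[user]
        # Forget calls that have aged out of the window.
        while calls and now - calls[0] >= self.window_seconds:
            calls.popleft()
        if len(calls) >= self.max_calls:
            return False
        calls.append(now)
        return True


def audit_record(user: str, tool: str, target: str, allowed: bool, timestamp: float) -> str:
    """Serialize one agent action as a JSON line for an append-only log."""
    return json.dumps(
        {"ts": timestamp, "user": user, "tool": tool, "target": target, "allowed": allowed},
        sort_keys=True,
    )


limiter = RateLimiter(max_calls=2, window_seconds=60)
for _ in range(3):
    ok = limiter.allow("student-1")
    print(audit_record("student-1", "scan_target", "example.com", ok, time.time()))
```

The third call inside the same minute is denied, and the denial itself is written to the audit trail.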

🏗️ Project Roadmap

Project planning & architecture design
Hugging Face repository setup
Data collection & processing pipeline
Custom tokenizer training
Continued pre-training
Supervised fine-tuning (SFT)
DPO alignment
Tool-use training
Multi-platform deployment
Beta testing
Public release v1.0

🤝 Contributing

We welcome contributions! Areas where help is needed:

Arabic cybersecurity content — Translations and original content
Dataset contributions — Q&A pairs, CTF writeups
Tool integrations — New security tool wrappers
Testing & feedback — Bug reports and suggestions

📬 Contact

Hugging Face: @Nalzankii
Project: Kuwait Cyber Platform

📜 License

This model is released under the Llama 4 Community License.

Attribution

Base architecture by Meta AI
Cybersecurity knowledge from open-source databases (MITRE, NVD, OWASP)
Built with ❤️ in Kuwait 🇰🇼

kwCyber-AI-Agent — Securing the digital future, one query at a time 🛡️

Made with 💚 by Nalzankii | Kuwait Cyber

