🛡️ kwCyber-AI-Agent

The First Arabic-English Cybersecurity AI Agent


Built by Nalzankii 🇰🇼

Specialized AI Agent for Cybersecurity Education, Penetration Testing Guidance, and Threat Analysis


🎯 Model Overview

kwCyber-AI-Agent is a purpose-built cybersecurity AI agent designed to serve as an intelligent assistant for security professionals, students, and enthusiasts in the Arab world and beyond. Unlike general-purpose LLMs, it is being fine-tuned specifically on cybersecurity material, with the goal of domain-level expertise in both Arabic and English.

Key Highlights

  • 🌐 Bilingual: Native Arabic and English cybersecurity expertise
  • 🔧 Tool-Use: Can execute security tools (Nmap, Nikto, CVE lookup, etc.)
  • 🎓 Educational: Personalized cybersecurity learning paths
  • 🏴 CTF Expert: Challenge walkthroughs and guidance
  • 🔍 Threat Analysis: Real-time vulnerability assessment
  • 🇰🇼 Made in Kuwait: Built for the Kuwait Cyber platform

🧠 Model Details

| Attribute | Details |
|---|---|
| Developer | Nalzankii (Kuwait Cyber) |
| Base Architecture | Llama 4 (Transformer) |
| Parameters | ~7B |
| Context Length | 8,192 tokens |
| Languages | Arabic (ar), English (en) |
| License | Llama 4 Community License |
| Domain | Cybersecurity |
| Training Stage | 🚧 In Development |

Architecture

kwCyber-AI-Agent
├── Base: Llama 4 Architecture
├── Custom Tokenizer (64K vocab, Arabic-optimized)
├── Fine-tuned on Cybersecurity Corpus
├── Tool-Use Layer (Function Calling)
├── Safety Guardrails (Content Filtering)
└── RAG Integration (Knowledge Base)

🎓 Intended Uses

Primary Use Cases

  1. Cybersecurity Q&A: Ask security questions in Arabic or English
  2. Vulnerability Analysis: Analyze CVEs, understand attack vectors
  3. CTF Assistance: Get hints and walkthroughs for CTF challenges
  4. Learning Mentor: Personalized study plans for security certifications
  5. Tool Guidance: Learn how to use security tools effectively
  6. Threat Intelligence: Stay updated on latest threats and vulnerabilities

Example Interactions

Arabic (shown here in English translation):

User: What is an SQL Injection attack, and how do I protect my application from it?

kwCyber-AI-Agent: An SQL Injection attack is a type of attack that exploits vulnerabilities in applications
that interact with databases. The attacker injects malicious SQL commands through input
fields...

To protect against it:
1. Use Prepared Statements (parameterized queries)
2. Enable Input Validation
3. Apply the principle of Least Privilege to the database
4. Use a WAF (Web Application Firewall)
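
The first recommendation in the reply above, prepared statements, can be illustrated with a minimal sketch using Python's standard `sqlite3` module. The `users` table and the `find_user` helper are hypothetical and exist only for this example; they are not part of kwCyber-AI-Agent.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, username: str):
    """Look up a user with a parameterized query (the ? placeholder)."""
    # The driver passes the value separately from the SQL text, so quotes
    # or SQL keywords inside `username` are treated as plain data.
    cur = conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    )
    return cur.fetchall()


# Hypothetical in-memory table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany("INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)])

print(find_user(conn, "alice"))        # the single matching row
print(find_user(conn, "' OR '1'='1"))  # no rows: the payload is compared as a literal
```

Building the same query through string concatenation would let the second input rewrite the WHERE clause, which is the behavior the prepared statement avoids.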

English:

User: How do I perform a basic Nmap scan on a target?

kwCyber-AI-Agent: Here's how to use Nmap for reconnaissance:

Basic scan: nmap -sV -sC target_ip
- -sV: Version detection
- -sC: Default scripts

⚠️ Important: Only scan targets you have explicit authorization to test.

Tool Use:

User: Scan example.com for open ports

kwCyber-AI-Agent: [Calling: scan_target]
{
  "target": "example.com",
  "scan_type": "port",
  "intensity": "normal"
}

Results: Found 3 open ports...

🏋️ Training

Training Stages

| Stage | Description | Data Size | Status |
|---|---|---|---|
| Stage 1 | Continued Pre-training on Cybersecurity Corpus | 100GB+ text | 🔜 Planned |
| Stage 2 | Supervised Fine-Tuning (SFT) | 200K+ Q&A pairs | 🔜 Planned |
| Stage 3 | DPO Alignment | 10K+ preference pairs | 🔜 Planned |
| Stage 4 | Tool-Use Training | 20K+ function calls | 🔜 Planned |
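
Stage 3 names DPO alignment. As a rough illustration of what that objective measures, the standard per-pair DPO loss can be sketched in plain Python. The function name, argument names, and `beta=0.1` are assumptions made for this example, not settings taken from this project, and a real training run would compute this over batches with a training library.

```python
import math


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-pair DPO loss: -log(sigmoid(beta * margin)).

    Each argument is the summed log-probability of a full response
    under either the policy being trained or the frozen reference model.
    """
    # How much more the policy favors the chosen answer over the rejected
    # one, relative to the reference model.
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    z = beta * margin
    # Numerically stable form of log(1 + exp(-z)).
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


# When the policy matches the reference, the margin is zero and the loss is log(2).
print(dpo_loss(-1.0, -1.0, -1.0, -1.0))
# When the policy prefers the chosen answer more than the reference does, the loss drops.
print(dpo_loss(-1.0, -3.0, -2.0, -2.0))
```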

Training Data Sources

  • MITRE ATT&CK: Adversary tactics and techniques
  • NVD/CVE Database: Vulnerability records
  • OWASP: Web application security
  • CWE: Common weakness patterns
  • CTF Writeups: HackTheBox, TryHackMe, PicoCTF
  • Security Research Papers: IEEE, ACM, arXiv
  • Custom Arabic Dataset: Translated and original Arabic security content

Training Configuration

training:
  base_model: meta-llama/Llama-4-Scout-17B-16E-Instruct
  method: QLoRA
  lora_r: 64
  lora_alpha: 128
  learning_rate: 2e-5
  batch_size: 16
  gradient_accumulation: 4
  epochs: 3
  warmup_ratio: 0.1
  optimizer: adamw_torch
  scheduler: cosine
  max_seq_length: 8192
  precision: bf16
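
To make the configuration above more concrete, the short script below works out a few quantities it implies. It assumes a single device, that `batch_size` is per device, and that Stage 2 uses exactly 200,000 examples (the lower bound in the stages table); none of these are stated in the config, so treat the numbers as a sketch.

```python
import math

# Values copied from the training configuration above.
batch_size = 16
gradient_accumulation = 4
epochs = 3
warmup_ratio = 0.1
lora_r = 64
lora_alpha = 128

# Assumptions, not stated in the config.
num_devices = 1
num_examples = 200_000

# Examples consumed per optimizer step.
effective_batch = batch_size * gradient_accumulation * num_devices
steps_per_epoch = math.ceil(num_examples / effective_batch)
total_steps = steps_per_epoch * epochs
# Number of steps spent ramping the learning rate up before cosine decay.
warmup_steps = math.ceil(total_steps * warmup_ratio)
# Standard LoRA scaling factor applied to the low-rank update.
lora_scale = lora_alpha / lora_r

print(f"effective batch size: {effective_batch}")
print(f"optimizer steps:      {total_steps}")
print(f"warmup steps:         {warmup_steps}")
print(f"LoRA scale (alpha/r): {lora_scale}")
```

Under these assumptions the script reports an effective batch size of 64, 9,375 optimizer steps, 938 warmup steps, and a LoRA scale of 2.0.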

🔧 Agent Capabilities

Supported Tools

| Tool | Capability | Integration |
|---|---|---|
| 🔍 Nmap | Port scanning & service detection | CLI Wrapper |
| 🌐 Nikto | Web vulnerability scanning | CLI Wrapper |
| 🗄️ SQLmap | SQL injection testing | Sandboxed |
| 🦠 VirusTotal | Malware analysis | REST API |
| 🌐 Shodan | Internet-wide scanning | REST API |
| 📋 CVE API | Vulnerability lookup | REST API |
| 🔎 WHOIS | Domain information | Python Lib |
| 📡 DNS | DNS reconnaissance | Python Lib |
Function Calling Format

{
  "name": "scan_target",
  "description": "Perform a security scan on target",
  "parameters": {
    "target": {
      "type": "string",
      "description": "IP address or domain"
    },
    "scan_type": {
      "type": "string",
      "enum": ["port", "vuln", "web", "dns", "full"]
    },
    "intensity": {
      "type": "string",
      "enum": ["light", "normal", "aggressive"],
      "default": "normal"
    }
  }
}
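
Before a call in this format is dispatched to a tool, its arguments can be checked against the definition. The sketch below hand-rolls that check for the `scan_target` definition above. It assumes that any parameter without a `default` is required, which the format does not state, and `validate_call` is a hypothetical helper name.

```python
# The scan_target definition from above, as a Python dict.
SCAN_TARGET = {
    "name": "scan_target",
    "description": "Perform a security scan on target",
    "parameters": {
        "target": {"type": "string", "description": "IP address or domain"},
        "scan_type": {
            "type": "string",
            "enum": ["port", "vuln", "web", "dns", "full"],
        },
        "intensity": {
            "type": "string",
            "enum": ["light", "normal", "aggressive"],
            "default": "normal",
        },
    },
}


def validate_call(arguments: dict, definition: dict) -> dict:
    """Return arguments with defaults filled in, or raise ValueError."""
    params = definition["parameters"]
    unknown = set(arguments) - set(params)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    validated = {}
    for name, spec in params.items():
        if name in arguments:
            value = arguments[name]
        elif "default" in spec:
            value = spec["default"]
        else:
            # Assumption: no default means the parameter is required.
            raise ValueError(f"missing required parameter: {name}")
        if spec.get("type") == "string" and not isinstance(value, str):
            raise ValueError(f"{name} must be a string")
        if "enum" in spec and value not in spec["enum"]:
            raise ValueError(f"{name} must be one of {spec['enum']}")
        validated[name] = value
    return validated


print(validate_call({"target": "example.com", "scan_type": "port"}, SCAN_TARGET))
```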

🚀 Quick Start

Installation

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Nalzankii/kwCyber-AI-Agent"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

Basic Chat

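messages = [
    # Prompts can be written in Arabic or English; these two were originally in Arabic.
    {"role": "system", "content": "You are kwCyber-AI-Agent, a cybersecurity expert who speaks Arabic and English."},
    {"role": "user", "content": "What is the best way to scan a network?"}
]

# add_generation_prompt=True appends the assistant turn marker so the model replies
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
# temperature only takes effect when sampling is enabled
outputs = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, not the echoed prompt
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)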

API Usage (Hugging Face Inference)

from huggingface_hub import InferenceClient

client = InferenceClient("Nalzankii/kwCyber-AI-Agent")

response = client.chat_completion(
    messages=[
        {"role": "system", "content": "You are kwCyber-AI-Agent, a cybersecurity expert."},
        {"role": "user", "content": "Explain XSS attacks"}
    ],
    max_tokens=512
)

print(response.choices[0].message.content)

Using with vLLM (Production)

pip install vllm

python -m vllm.entrypoints.openai.api_server \
    --model Nalzankii/kwCyber-AI-Agent \
    --port 8000 \
    --max-model-len 8192
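
Once the server above is running, it exposes an OpenAI-compatible chat endpoint. The sketch below queries it using only the Python standard library. The host and port match the command above, while the helper names `build_chat_request` and `chat` are illustrative.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, messages: list,
                       max_tokens: int = 512) -> urllib.request.Request:
    """Build a POST request for the /v1/chat/completions endpoint."""
    payload = {"model": model, "messages": messages, "max_tokens": max_tokens}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(base_url: str, model: str, messages: list) -> str:
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(base_url, model, messages)
    with urllib.request.urlopen(request, timeout=60) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Example call (needs the vLLM server from the command above to be running):
# print(chat(
#     "http://localhost:8000",
#     "Nalzankii/kwCyber-AI-Agent",
#     [{"role": "user", "content": "Explain XSS attacks"}],
# ))
```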

📊 Benchmarks

⏳ Benchmarks will be published after training completion.

Planned Evaluations

| Benchmark | Description |
|---|---|
| CyberBench | Cybersecurity knowledge assessment |
| SecQA | Security question answering |
| CTF-Eval | Capture The Flag problem solving |
| Arabic-NLU | Arabic language understanding |
| Tool-Use Accuracy | Function calling correctness |
| Safety Score | Harmful content resistance |
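
The table does not define how Tool-Use Accuracy will be computed. One simple possibility is exact-match scoring, sketched below: a predicted call counts as correct only if both the function name and every argument match the reference. This is an illustrative definition, not the project's published metric, and `tool_call_accuracy` is a hypothetical helper.

```python
def tool_call_accuracy(predicted: list, expected: list) -> float:
    """Fraction of calls whose name and arguments both match exactly."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        return 0.0
    correct = sum(
        1
        for p, e in zip(predicted, expected)
        if p["name"] == e["name"] and p["arguments"] == e["arguments"]
    )
    return correct / len(expected)


expected = [
    {"name": "scan_target", "arguments": {"target": "example.com", "scan_type": "port"}},
    {"name": "scan_target", "arguments": {"target": "example.com", "scan_type": "dns"}},
]
predicted = [
    {"name": "scan_target", "arguments": {"target": "example.com", "scan_type": "port"}},
    {"name": "scan_target", "arguments": {"target": "example.com", "scan_type": "web"}},
]
print(tool_call_accuracy(predicted, expected))  # one of two calls matches
```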

⚠️ Limitations & Ethical Use

Limitations

  • Model is specialized in cybersecurity; may underperform on general topics
  • Arabic cybersecurity terminology is still evolving; some terms may vary
  • Tool execution requires proper sandboxing and authorization
  • Not a replacement for professional security audits

Ethical Guidelines

⚠️ IMPORTANT: This model is designed for DEFENSIVE and EDUCATIONAL purposes only.

  • ✅ DO: Use for learning, authorized testing, CTF competitions, security research
  • ❌ DON'T: Use for unauthorized access, creating malware, attacking systems without permission
  • ⚖️ COMPLY: Follow Kuwait Cybercrime Law No. 63/2015 and all applicable regulations
  • 🛡️ RESPONSIBLE: Always obtain proper authorization before any security testing

Safety Measures

  • Built-in content filtering for harmful requests
  • Requires target authorization for scanning operations
  • Logging and audit trail for all agent actions
  • Rate limiting to prevent abuse
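
How these measures are implemented is not described here. As a sketch under stated assumptions, the class below combines three of them in one gate: a target allowlist, a sliding-window rate limit, and an in-memory audit log. The name `ScanGate`, its parameters, and the log format are all hypothetical.

```python
import time
from collections import deque


class ScanGate:
    """Decide whether a scan request may proceed, and record the decision."""

    def __init__(self, authorized_targets, max_calls, window_s, clock=time.monotonic):
        self.authorized = set(authorized_targets)
        self.max_calls = max_calls
        self.window_s = window_s
        self.clock = clock          # injectable so the limiter can be tested
        self.calls = deque()        # timestamps of recently allowed scans
        self.audit_log = []         # every decision, allowed or denied

    def check(self, user: str, target: str) -> bool:
        now = self.clock()
        # Drop timestamps that have aged out of the rate-limit window.
        while self.calls and now - self.calls[0] >= self.window_s:
            self.calls.popleft()
        if target not in self.authorized:
            decision = "denied: target not authorized"
        elif len(self.calls) >= self.max_calls:
            decision = "denied: rate limit exceeded"
        else:
            self.calls.append(now)
            decision = "allowed"
        self.audit_log.append(
            {"time": now, "user": user, "target": target, "decision": decision}
        )
        return decision == "allowed"


gate = ScanGate({"192.0.2.10"}, max_calls=2, window_s=60)
print(gate.check("analyst", "192.0.2.10"))    # authorized and under the limit
print(gate.check("analyst", "198.51.100.7"))  # not on the allowlist
```

A production deployment would persist the audit log and tie the allowlist to a signed authorization record; the in-memory structures here only show the control flow.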

🏗️ Project Roadmap

  • Project planning & architecture design
  • Hugging Face repository setup
  • Data collection & processing pipeline
  • Custom tokenizer training
  • Continued pre-training
  • Supervised fine-tuning (SFT)
  • DPO alignment
  • Tool-use training
  • Multi-platform deployment
  • Beta testing
  • Public release v1.0

🤝 Contributing

We welcome contributions! Areas where help is needed:

  • Arabic cybersecurity content: Translations and original content
  • Dataset contributions: Q&A pairs, CTF writeups
  • Tool integrations: New security tool wrappers
  • Testing & feedback: Bug reports and suggestions

📬 Contact


📜 License

This model is released under the Llama 4 Community License.

Attribution

  • Base architecture by Meta AI
  • Cybersecurity knowledge from open-source databases (MITRE, NVD, OWASP)
  • Built with ❤️ in Kuwait 🇰🇼

kwCyber-AI-Agent: Securing the digital future, one query at a time 🛡️

Made with 💚 by Nalzankii | Kuwait Cyber
