Tini-Cybersec-8B-A1B 🛡️🧠

Tini-Cybersec-8B-A1B is a fine-tuned model based on LiquidAI/LFM2.5-8B-A1B. It is specialized for cybersecurity work, including security analysis, threat modeling, and vulnerability assessment, while preserving and enhancing the base model's reasoning and Chain-of-Thought (CoT) capabilities.

The model was trained with supervised fine-tuning (SFT) on a curated mix of 185,002 records that combines in-depth security knowledge with structured, step-by-step reasoning paths.


📊 Dataset & Mixture Distribution

The SFT training data combines domain-specific cybersecurity instruction datasets with a smaller share of general reasoning (CoT) datasets. Empty and zero-token records were filtered out.

1. Dataset Components

| Dataset Source | Category | Records | Share | Description |
| --- | --- | --- | --- | --- |
| AlicanKiraz0/Cybersecurity-Dataset-Heimdall-v1.1 | Cybersecurity | 21,257 | 11.49% | High-quality offensive/defensive cybersecurity instructions. |
| AlicanKiraz0/Cybersecurity-Dataset-Fenrir-v2.1 | Cybersecurity | 99,870 | 53.98% | Large-scale cybersecurity instruction dataset. |
| Trendyol/Trendyol-Cybersecurity-Instruction-Tuning-Dataset | Cybersecurity | 53,201 | 28.76% | Tailored cybersecurity instructions and tasks. |
| nohurry/Opus-4.6-Reasoning-3000x-filtered | Reasoning CoT | 2,326 | 1.26% | High-quality step-by-step logical reasoning. |
| Jackrong/DeepSeek-V4-Distill-8000x | Reasoning CoT | 7,716 | 4.17% | Distilled reasoning paths from DeepSeek-V4. |
| Jackrong/Qwen3.5-reasoning-700x | Reasoning CoT | 633 | 0.34% | Specialized logical/reasoning instructions. |
| Total (Filtered Superset) | Combined | 185,002 | 100% | |

2. Domain Composition

  • Cybersecurity Core: 94.23% (~174,328 records)
  • Pure Reasoning & Chain-of-Thought (CoT): 5.77% (~10,674 records)
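
The domain split above follows from the per-source record counts. As a minimal sketch of that arithmetic, using only figures stated in this card (the variable and function names are illustrative, not part of any training script):

```python
# Record counts for the three cybersecurity sources, as listed in this card.
cyber_counts = {
    "Heimdall-v1.1": 21_257,
    "Fenrir-v2.1": 99_870,
    "Trendyol": 53_201,
}
TOTAL_RECORDS = 185_002  # reported size of the filtered mix


def share(count: int, total: int = TOTAL_RECORDS) -> float:
    """Percentage share of `count` in `total`, rounded to two decimals."""
    return round(100 * count / total, 2)


cyber_total = sum(cyber_counts.values())
# The remainder of the reported total is the reasoning/CoT portion.
reasoning_total = TOTAL_RECORDS - cyber_total

print(cyber_total, share(cyber_total))          # 174328 94.23
print(reasoning_total, share(reasoning_total))  # 10674 5.77
```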

📈 Dataset Token Statistics

Calculated using the LiquidAI/LFM2.5-8B-A1B tokenizer:

  • Total Records: 185,002
  • Total Tokens: 159,858,904 tokens
  • Average Token Length: 864.09 tokens per record
  • Min Token Length: 54 tokens
  • Max Token Length: 78,313 tokens

Token length distribution:

  < 1,000 tokens        : 162,036 records (87.59%) █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █
  1,000 - 10,000        : 22,632 records (12.23%)  █ █ █
  10,000 - 20,000       : 182 records (0.10%)
  20,000 - 30,000       : 77 records (0.04%)
  30,000 - 50,000       : 63 records (0.03%)
  50,000 - 100,000      : 12 records (0.01%)
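
Statistics of this kind can be derived from a list of per-record token counts. The sketch below assumes those counts have already been computed (for example, by tokenizing each record with the tokenizer named above) and that each bucket includes its lower edge and excludes its upper edge; the card does not state the boundary convention, and the helper names and toy data are hypothetical.

```python
from bisect import bisect_right

# Bucket edges taken from the distribution listed in this card.
BUCKET_EDGES = [1_000, 10_000, 20_000, 30_000, 50_000, 100_000]
BUCKET_LABELS = [
    "< 1,000", "1,000 - 10,000", "10,000 - 20,000",
    "20,000 - 30,000", "30,000 - 50,000", "50,000 - 100,000",
]


def summarize(lengths: list[int]) -> dict:
    """Return count, total, mean, min, and max of per-record token lengths."""
    return {
        "records": len(lengths),
        "total_tokens": sum(lengths),
        "avg_tokens": round(sum(lengths) / len(lengths), 2),
        "min_tokens": min(lengths),
        "max_tokens": max(lengths),
    }


def bucketize(lengths: list[int]) -> dict[str, int]:
    """Count records per bucket; each bucket is treated as [low, high)."""
    counts = {label: 0 for label in BUCKET_LABELS}
    for n in lengths:
        idx = bisect_right(BUCKET_EDGES, n)
        if idx >= len(BUCKET_LABELS):
            raise ValueError(f"length {n} exceeds the last bucket edge")
        counts[BUCKET_LABELS[idx]] += 1
    return counts


# Toy lengths for illustration only; these are not records from the dataset.
toy_lengths = [54, 800, 999, 1_000, 9_500, 25_000]
print(summarize(toy_lengths))
print(bucketize(toy_lengths))

# Cross-check of the reported average from the reported totals.
print(round(159_858_904 / 185_002, 2))  # 864.09
```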

🏆 Evaluation Results (CS-Eval Benchmark)

Tini-Cybersec-8B-A1B has been evaluated on the CS-Eval Benchmark (a comprehensive cybersecurity evaluation benchmark for Large Language Models) and is published on the CS-Eval Leaderboard (under submission name DungNVT-ISELAB).

The model achieved a Comprehensive Score of 76.65%. Per-category scores range from 67.33% (Business Continuity and Emergency Response Recovery) to 86.05% (Supply Chain Security):

| Evaluation Domain / Category | Score (%) |
| --- | --- |
| Comprehensive Average (Comprehensive Score) | 76.65 |
| Supply Chain Security | 86.05 |
| AI and Network Security | 83.17 |
| Infrastructure Security | 78.04 |
| English Tasks | 77.40 |
| Data Security and Privacy Protection | 76.79 |
| Chinese Task | 76.60 |
| Vulnerability Management and Penetration Testing | 76.54 |
| Access Control and Identity Management | 76.44 |
| Threat Detection and Prevention | 75.28 |
| Encryption Technology and Key Management | 75.18 |
| Security Architecture Design | 75.12 |
| Fundamentals of System Security and Software Security | 74.67 |
| Business Continuity and Emergency Response Recovery | 67.33 |

⚙️ Training Hyperparameters (SFT)

The model was SFT-trained using Unsloth and Hugging Face Trainer with sequence packing to optimize throughput:

| Parameter | Configuration Value | Detail / Notes |
| --- | --- | --- |
| Base Model | LiquidAI/LFM2.5-8B-A1B | Liquid Foundation Model |
| Max Sequence Length | 8,192 | With packing (blocks of 8,192 tokens) |
| Data Precision | bfloat16 (BF16) | Native training precision |
| LoRA Rank (r) | 64 | Broad PEFT adapter matrices |
| LoRA Alpha | 128 | Scaling factor |
| LoRA Targets | q_proj, k_proj, v_proj, out_proj, in_proj, w1, w2, w3 | Attention & LIV projections |
| Batch Size per Device | 1 | Sequence packed |
| Gradient Accumulation | 32 | Effective batch size of 32 blocks (262,144 tokens) |
| Learning Rate | 5e-5 | Recommended sweet spot for wide LoRA SFT |
| Learning Rate Scheduler | cosine | Cosine annealing for smooth convergence |
| Warmup Steps | 10% of total steps | Linear warmup |
| Optimizer | adamw_8bit | Memory efficient 8-bit AdamW |
| Weight Decay | 0.01 | Regularization |
| Max Gradient Norm | 1.0 | Gradient clipping |
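
The effective batch size quoted above can be reproduced with a short calculation. This is a sketch under two assumptions that the card does not state: a single training device (`num_devices=1`, which is consistent with the reported 32 blocks per step) and the standard LoRA scaling of alpha divided by rank (rank-stabilized LoRA would scale differently). The `RunConfig` class is hypothetical and exists only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunConfig:
    """Subset of the hyperparameters listed in this card."""
    max_seq_length: int = 8_192
    per_device_batch_size: int = 1
    grad_accum_steps: int = 32
    num_devices: int = 1  # assumption: the device count is not stated in the card
    lora_r: int = 64
    lora_alpha: int = 128

    @property
    def blocks_per_step(self) -> int:
        """Packed blocks consumed per optimizer step."""
        return self.per_device_batch_size * self.grad_accum_steps * self.num_devices

    @property
    def tokens_per_step(self) -> int:
        """Upper bound on tokens per optimizer step (fully packed blocks)."""
        return self.blocks_per_step * self.max_seq_length

    @property
    def lora_scale(self) -> float:
        """Standard LoRA scaling factor, alpha / r."""
        return self.lora_alpha / self.lora_r


cfg = RunConfig()
print(cfg.blocks_per_step)  # 32
print(cfg.tokens_per_step)  # 262144
print(cfg.lora_scale)       # 2.0
```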

💬 Prompt Format & Templates

This model follows the ChatML format. In the assistant turn, the reasoning is wrapped in <think> ... </think> tags and precedes the final response.

Template Structure:

<|im_start|>system
You are a helpful and knowledgeable cybersecurity expert assistant. You answer all user queries step by step with reasoning.<|im_end|>
<|im_start|>user
[Your cybersecurity query / task here]<|im_end|>
<|im_start|>assistant
<think>
[Step-by-step thinking process / Chain-of-Thought (CoT)]
</think>
[Detailed response / action plan / explanation]
<|im_end|>

🚀 How to Load and Use

To load this model with Hugging Face's transformers library:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "iselabvn/Tini-Cybersec-8B-A1B"  # Hub repo id; a local checkpoint directory also works

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Inference example
messages = [
    {"role": "system", "content": "You are a cybersecurity expert assistant."},
    {"role": "user", "content": "What is SQL injection, and how can it be prevented?"}
]

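# Render the chat template to text, then tokenize it. The rendered text is
# assumed to already contain any special tokens, so none are added again here.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)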

outputs = model.generate(
    **inputs,
    max_new_tokens=2048,
    temperature=0.6,
    top_p=0.9,
    do_sample=True
)

response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
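
Because the response may begin with a reasoning block, it can be useful to separate the reasoning from the final answer. The sketch below assumes the <think> tags survive decoding as plain text (that is, they are not stripped as special tokens) and that at most one reasoning block appears. The helper `split_reasoning` is hypothetical.

```python
import re

# Non-greedy match so only the first <think>...</think> block is captured.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Split decoded model output into (reasoning, answer).

    If no complete <think>...</think> block is found (for example, when
    generation was cut off before the closing tag), the reasoning is
    returned empty and the whole text is treated as the answer.
    """
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer


sample = "<think>\nCheck input handling first.\n</think>\nUse parameterized queries."
reasoning, answer = split_reasoning(sample)
print(reasoning)  # Check input handling first.
print(answer)     # Use parameterized queries.
```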

📄 License & Attribution

  • Base Model: Licensed under the Apache-2.0 license by LiquidAI.
  • Fine-tuned Weights: Apache-2.0 License.
  • Dataset Attribution: Please credit the original authors of AlicanKiraz0/Cybersecurity-Dataset-Heimdall-v1.1, AlicanKiraz0/Cybersecurity-Dataset-Fenrir-v2.1, Trendyol/Trendyol-Cybersecurity-Instruction-Tuning-Dataset, nohurry/Opus-4.6-Reasoning-3000x-filtered, Jackrong/DeepSeek-V4-Distill-8000x, and Jackrong/Qwen3.5-reasoning-700x.