# Ethical Hacking LLM Fine-Tuning on Free Kaggle/Colab

## ✅ Updated notebooks

The notebooks keep Unsloth for low-VRAM training where applicable, and the no-Unsloth fallback has all dataset options built in directly.

## Main notebooks

| Notebook | Model | Best use |
| --- | --- | --- |
| `EthicalHacking_LFM2.5_Ultimate_Colab.ipynb` | `unsloth/LFM2.5-1.2B-Instruct` | Recommended first: fastest, lowest VRAM |
| `EthicalHacking_Qwen3-4B_Ultimate_Colab.ipynb` | `unsloth/Qwen3-4B-Instruct-2507-unsloth-bnb-4bit` | Stronger model, tighter T4 VRAM |
| `EthicalHacking_Gemma4_E2B_Colab.ipynb` | `unsloth/gemma-4-E2B-it-unsloth-bnb-4bit` | Experimental; use LFM2.5 first |
| `EthicalHacking_Stable_QLoRA_ManualLoop_NO_UNSLOTH.ipynb` | Vanilla PEFT QLoRA | Backup; all dataset options included directly |
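
For orientation, a minimal sketch of the QLoRA setup the no-Unsloth notebook is built around, using `transformers`, `peft`, and `bitsandbytes`. The rank, alpha, and target modules below are illustrative; the notebook's config cell is authoritative.

```python
# Minimal QLoRA sketch (assumed shape of the no-Unsloth notebook).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

BASE_MODEL_ID = "LiquidAI/LFM2.5-1.2B-Instruct"

# 4-bit NF4 quantization keeps the base model within free-tier VRAM.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL_ID, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# LoRA adapter on the attention projections; module names depend on the
# architecture, so check the notebook for the real list.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```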

## Inference / chat after training

Standalone inference resources:

| File | Purpose |
| --- | --- |
| `INFERENCE_CELL.md` | Copy-paste final notebook cell for Kaggle/Colab |
| `inference_adapter_chat.py` | Standalone Python chat script |

Example adapter paths after training:

```python
# No-Unsloth notebook default
BASE_MODEL_ID = "LiquidAI/LFM2.5-1.2B-Instruct"
ADAPTER_PATH = "./lfm25-stable-qlora-cybersecurity-adapter"

# Unsloth LFM2.5 notebook
BASE_MODEL_ID = "unsloth/LFM2.5-1.2B-Instruct"
ADAPTER_PATH = "./lfm25-lora-adapter"

# Unsloth Qwen3 notebook
BASE_MODEL_ID = "unsloth/Qwen3-4B-Instruct-2507-unsloth-bnb-4bit"
ADAPTER_PATH = "./qwen3-lora-adapter"

# Unsloth Gemma notebook
BASE_MODEL_ID = "unsloth/gemma-4-E2B-it-unsloth-bnb-4bit"
ADAPTER_PATH = "./gemma4-lora-adapter"
```

If you pushed the adapter to the Hub:

ADAPTER_PATH = "your-username/your-adapter-repo"

## Dataset choices in the no-Unsloth notebook

In the config cell, set exactly one of:

DATASET_CHOICE = "cybersecurity"  # Fenrir + Trendyol
DATASET_CHOICE = "code_corpus"    # krystv/code-corpus-llm-training
DATASET_CHOICE = "ultrachat"      # HuggingFaceH4/ultrachat_200k
DATASET_CHOICE = "openhermes"     # teknium/OpenHermes-2.5
DATASET_CHOICE = "sharegpt_en"    # deepmage121/ShareGPT_multilingual English
DATASET_CHOICE = "sharegpt_de"    # German translated ShareGPT
DATASET_CHOICE = "sharegpt_hi"    # Hindi translated ShareGPT
DATASET_CHOICE = "custom_mix"     # mix multiple sources

Recommended starting settings for `code_corpus`:

```python
DATASET_CHOICE = "code_corpus"
MAX_SEQ_LENGTH = 2048
SAMPLE_SIZE = 10000
MAX_STEPS = 500
```
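
Internally, each choice maps to a `datasets.load_dataset` call. A rough sketch of the dispatch, using the dataset IDs from the list above (split names and sampling logic are illustrative; the notebook's data cell is authoritative):

```python
# Rough shape of the dataset dispatch (illustrative). Split names vary
# per dataset, e.g. ultrachat_200k ships "train_sft" rather than "train".
from datasets import load_dataset

DATASET_CHOICE = "code_corpus"
SAMPLE_SIZE = 10000

DATASET_IDS = {
    "code_corpus": "krystv/code-corpus-llm-training",
    "ultrachat": "HuggingFaceH4/ultrachat_200k",
    "openhermes": "teknium/OpenHermes-2.5",
    "sharegpt_en": "deepmage121/ShareGPT_multilingual",
    # "cybersecurity" merges two sources and is handled separately.
}

raw = load_dataset(DATASET_IDS[DATASET_CHOICE], split="train")
raw = raw.shuffle(seed=42).select(range(min(SAMPLE_SIZE, len(raw))))
```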

## Dataset guide

See `DATASET_OPTIONS.md`.

## Important Unsloth run instructions

For the Unsloth notebooks:

  1. Run cell 1.
  2. It installs/updates Unsloth and intentionally restarts the kernel once (roughly as sketched below).
  3. After restart, run all cells from the top again.
  4. Do not skip cell 1.
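
Cell 1 is roughly equivalent to the sketch below; the exact install pin and restart mechanism in the actual notebooks may differ.

```python
# Rough equivalent of cell 1 (illustrative). Installing Unsloth into a
# live kernel leaves stale modules imported, hence the one-time restart.
%pip install --upgrade unsloth

import os
os.kill(os.getpid(), 9)  # hard-kills the kernel; Colab/Kaggle restarts it
```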

## Fast smoke test

Before full training:

```python
SAMPLE_SIZE = 1000
MAX_STEPS = 10
```

If that trains without errors, increase the settings.
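
These two knobs typically feed straight into the dataset slice and the trainer's step cap, along these lines (variable names and dataset are illustrative):

```python
# Illustrative wiring of the smoke-test knobs (assumed variable names).
from datasets import load_dataset
from transformers import TrainingArguments

SAMPLE_SIZE = 1000
MAX_STEPS = 10

dataset = load_dataset("krystv/code-corpus-llm-training", split="train")
dataset = dataset.select(range(min(SAMPLE_SIZE, len(dataset))))  # small slice

training_args = TrainingArguments(
    output_dir="./smoke-test",
    max_steps=MAX_STEPS,  # overrides the epoch count; stops after 10 steps
    per_device_train_batch_size=1,
)
```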

## Safety

Use this repository only for ethical, defensive, and authorized cybersecurity education and research.
