# Ethical Hacking LLM Fine-Tuning on Free Kaggle/Colab

## Updated notebooks
The notebooks keep Unsloth for low-VRAM training where applicable, and the no-Unsloth fallback has all dataset options built in directly.
### Main notebooks
| Notebook | Model | Best use |
|---|---|---|
| EthicalHacking_LFM2.5_Ultimate_Colab.ipynb | unsloth/LFM2.5-1.2B-Instruct | Recommended first; fastest and lowest VRAM |
| EthicalHacking_Qwen3-4B_Ultimate_Colab.ipynb | unsloth/Qwen3-4B-Instruct-2507-unsloth-bnb-4bit | Stronger model, tighter T4 VRAM |
| EthicalHacking_Gemma4_E2B_Colab.ipynb | unsloth/gemma-4-E2B-it-unsloth-bnb-4bit | Experimental; use LFM2.5 first |
| EthicalHacking_Stable_QLoRA_ManualLoop_NO_UNSLOTH.ipynb | Vanilla PEFT QLoRA | Backup; all dataset options directly included |
## Inference / chat after training
Standalone inference resources:
| File | Purpose |
|---|---|
| INFERENCE_CELL.md | Copy-paste final notebook cell for Kaggle/Colab |
| inference_adapter_chat.py | Standalone Python chat script |
Example adapter paths after training:
```python
# No-Unsloth notebook default
BASE_MODEL_ID = "LiquidAI/LFM2.5-1.2B-Instruct"
ADAPTER_PATH = "./lfm25-stable-qlora-cybersecurity-adapter"

# Unsloth LFM2.5 notebook
BASE_MODEL_ID = "unsloth/LFM2.5-1.2B-Instruct"
ADAPTER_PATH = "./lfm25-lora-adapter"

# Unsloth Qwen3 notebook
BASE_MODEL_ID = "unsloth/Qwen3-4B-Instruct-2507-unsloth-bnb-4bit"
ADAPTER_PATH = "./qwen3-lora-adapter"

# Unsloth Gemma notebook
BASE_MODEL_ID = "unsloth/gemma-4-E2B-it-unsloth-bnb-4bit"
ADAPTER_PATH = "./gemma4-lora-adapter"
```
If you pushed the adapter to the Hub:

```python
ADAPTER_PATH = "your-username/your-adapter-repo"
```
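As a rough sketch of how these pieces fit together (this is an assumption, not the contents of inference_adapter_chat.py), the adapter can be loaded on top of the base model with `transformers` and `peft`; the IDs below are the no-Unsloth defaults from above, and the prompt is only illustrative:

```python
# Minimal inference sketch: load the base model, attach the trained LoRA adapter, chat once.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_MODEL_ID = "LiquidAI/LFM2.5-1.2B-Instruct"
ADAPTER_PATH = "./lfm25-stable-qlora-cybersecurity-adapter"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL_ID, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, ADAPTER_PATH)
model.eval()

messages = [{"role": "user", "content": "What does an authorized penetration test scope document cover?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(inputs, max_new_tokens=256)

# Print only the newly generated tokens, not the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```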
## Dataset choices in the no-Unsloth notebook
Inside the config cell, set:
```python
DATASET_CHOICE = "cybersecurity"  # Fenrir + Trendyol
DATASET_CHOICE = "code_corpus"    # krystv/code-corpus-llm-training
DATASET_CHOICE = "ultrachat"      # HuggingFaceH4/ultrachat_200k
DATASET_CHOICE = "openhermes"     # teknium/OpenHermes-2.5
DATASET_CHOICE = "sharegpt_en"    # deepmage121/ShareGPT_multilingual, English
DATASET_CHOICE = "sharegpt_de"    # German-translated ShareGPT
DATASET_CHOICE = "sharegpt_hi"    # Hindi-translated ShareGPT
DATASET_CHOICE = "custom_mix"     # mix multiple sources
```
For `code_corpus`, recommended starting settings:

```python
DATASET_CHOICE = "code_corpus"
MAX_SEQ_LENGTH = 2048
SAMPLE_SIZE = 10000
MAX_STEPS = 500
```
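For orientation only, here is a hypothetical loader showing how `DATASET_CHOICE` and `SAMPLE_SIZE` could map onto `datasets.load_dataset` calls. The notebook's config cell may wire this up differently; the dataset IDs come from the list above, but the split names and the helper itself are assumptions:

```python
from datasets import load_dataset

def load_training_data(choice: str, sample_size: int):
    # Split names are assumptions; check each dataset card before relying on them.
    if choice == "code_corpus":
        ds = load_dataset("krystv/code-corpus-llm-training", split="train")
    elif choice == "ultrachat":
        ds = load_dataset("HuggingFaceH4/ultrachat_200k", split="train_sft")
    elif choice == "openhermes":
        ds = load_dataset("teknium/OpenHermes-2.5", split="train")
    else:
        raise ValueError(f"Choice not covered by this sketch: {choice}")
    # Subsample so a smoke test or a constrained free GPU stays cheap.
    return ds.shuffle(seed=42).select(range(min(sample_size, len(ds))))

train_data = load_training_data("code_corpus", sample_size=10_000)
```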
See the repo's dataset guide for more detail on each option.
## Important Unsloth run instructions
For the Unsloth notebooks:
- Run cell 1 first. It installs/updates Unsloth and intentionally restarts the kernel once (see the sketch below).
- After the restart, run all cells from the top again.
- Do not skip cell 1.
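For context, cell 1 typically amounts to something like the following sketch. This is an assumption about the install/restart pattern, not the notebooks' exact code:

```python
# Install/upgrade Unsloth, then force a kernel restart so the new packages are picked up.
import os
import subprocess
import sys

subprocess.run([sys.executable, "-m", "pip", "install", "--upgrade", "unsloth"], check=True)

# Killing the process makes Colab/Kaggle restart the kernel; rerun all cells afterwards.
os.kill(os.getpid(), 9)
```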
## Fast smoke test
Before a full training run, set:

```python
SAMPLE_SIZE = 1000
MAX_STEPS = 10
```

If that run completes cleanly, scale the settings back up for full training.
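As a hedged illustration of where those values end up, here is a minimal `transformers.TrainingArguments` sketch; the notebooks' Unsloth and manual QLoRA loops may pass these settings differently:

```python
from transformers import TrainingArguments

SAMPLE_SIZE = 1000  # rows kept from the chosen dataset (see loader sketch above)
MAX_STEPS = 10      # optimizer steps for the smoke test

args = TrainingArguments(
    output_dir="./smoke-test",
    max_steps=MAX_STEPS,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    logging_steps=1,
    report_to="none",
)
```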
## Safety
Use only for ethical, defensive, authorized cybersecurity education and research.