Senior Project Notice

This repository was created for a senior project in ENGT 375 Applied Machine Learning at Old Dominion University. It is provided for educational and research demonstration purposes only. It is not intended for production use, security filtering, or making real-world spam/phishing decisions. Always use established security tools for operational email protection.


library_name: transformers tags: - spam-detection - liquid-ai - lora - peft - apple-silicon - nlp - text-classification license: mit base_model: LiquidAI/LFM2.5-1.2B-Instruct datasets: - VoltageVagabond/spam-email-dataset pipeline_tag: text-generation

Spam Classifier — Liquid AI LFM2.5-1.2B LoRA Fine-Tune

ENGT 375 — Applied Machine Learning | Spring 2026 | ODU Liquid AI's LFM2.5-1.2B-Instruct model fine-tuned with LoRA adapters using HuggingFace Transformers + PEFT for spam email classification.

Model Details

  • Base model: LiquidAI/LFM2.5-1.2B-Instruct
  • Fine-tuning: LoRA (rank 8, alpha 16, dropout 0.1)
  • Framework: HuggingFace Transformers + PEFT + TRL
  • Hardware: Apple Silicon (M-series)
  • Task: Classify emails as spam or ham

LoRA Target Modules

w1, w2, in_proj, out_proj, v_proj, k_proj, q_proj, w3

Training Details

Hyperparameter Value
Training examples ~8,000 (fast) / ~16,000 (full) — 3-class Spam/Ham/Phishing
Test examples ~20% holdout from the retrain split
Epochs 3
Batch size 1 (effective 4 with gradient accumulation steps = 4)
Learning rate 2e-4
Max sequence length 256
Optimizer adamw_torch (bitsandbytes 8-bit not supported on MPS)
Weight dtype bfloat16
Device MPS (Apple Silicon)
Gradient checkpointing Enabled (use_reentrant=False)
Max gradient norm 0.3
LoRA rank 8
LoRA alpha 16
LoRA dropout 0.1
Target modules 8 (q_proj, k_proj, v_proj, out_proj, w1, w2, w3, in_proj)
Training time ~1–1.5 hours (per fine_tune.py; earlier docs listed ~2–2.5 hours before the v0.4.3 memory optimization)

Hardware

  • Device: Apple Silicon (M-series)
  • Backend: PyTorch MPS (Metal Performance Shaders)

Dataset

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = AutoModelForCausalLM.from_pretrained("LiquidAI/LFM2.5-1.2B-Instruct")
model = PeftModel.from_pretrained(base_model, "adapters")
tokenizer = AutoTokenizer.from_pretrained("LiquidAI/LFM2.5-1.2B-Instruct")

Gradio Interface

pip install -r requirements.txt
python app.py

Files

  • adapters/ — LoRA adapter weights + config
  • fine_tune.py — Training script
  • app.py — Gradio web interface
  • training_data/ — Training dataset

Intended Use

This model is an educational demonstration of LLM fine-tuning with HuggingFace PEFT, created as part of a university course project. It is suitable for:

  • Learning how LoRA fine-tuning works with the HuggingFace ecosystem (Transformers + PEFT + TRL)
  • Exploring Liquid AI's novel architecture for text classification
  • Comparing different LLM fine-tuning frameworks (MLX vs. HuggingFace)

It is not intended for production spam filtering.

Limitations

  • May misclassify legitimate marketing emails as spam
  • Trained on English emails only — not suitable for other languages
  • Training set (~8K fast / ~16K full) is modest compared to production spam filters — generalization may be limited

Note: Three-class classification (SPAM / HAM / PHISHING) is supported as of v0.4.0 — earlier versions were binary. The model is deployed as a HuggingFace Space (see Space header above).

Related Models

Model Description Link
spam-classifier-mlx Qwen 3.5 0.8B MLX LoRA fine-tune VoltageVagabond/spam-classifier-mlx
spam-xai-model sklearn voting ensemble (RF + LR + SVM) with LIME/SHAP/ELI5 explainability VoltageVagabond/spam-xai-model
spam-xai-classifier (Space) Live Gradio web app for the sklearn classifier VoltageVagabond/spam-xai-classifier

Citation

@misc{voltagevagabond2026spamliquid,
  title={Spam Classifier — Liquid AI LFM2.5-1.2B LoRA Fine-Tune},
  author={VoltageVagabond},
  year={2026},
  howpublished={\url{https://huggingface.co/VoltageVagabond/spam-classifier-liquid}},
  note={ENGT 375 — Applied Machine Learning, Old Dominion University, Spring 2026}
}
Downloads last month
-
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for VoltageVagabond/spam-classifier-liquid

Adapters
1 model

Dataset used to train VoltageVagabond/spam-classifier-liquid

Spaces using VoltageVagabond/spam-classifier-liquid 2