LFM2.5-VL-1.6B UCF Crime — LoRA Adapters
Base model: LiquidAI/LFM2.5-VL-1.6B fine-tuned on the UCF Crime dataset for surveillance crime detection.
Fine-tuned 2× faster with Unsloth on ~26k surveillance images — entirely on a free Google Colab T4 GPU.
🔓 Training notebook (free Colab): Open in Colab — reproduce this fine-tune yourself, no paid GPU needed.
About this model
This repo contains LoRA adapter weights for LFM2.5-VL-1.6B, trained to analyze CCTV/surveillance images and detect harmful or criminal activity across 14 UCF Crime categories (13 anomaly classes plus Normal):
Abuse · Arrest · Arson · Assault · Burglary · Explosion · Fighting · Robbery · Shooting · Shoplifting · Stealing · Vandalism · Road Accident · Normal
The model targets a structured JSON output:
{
"isHarm": true,
"descriptionIfHarm": "The image depicts fighting."
}
When no harmful activity is detected:
{
"isHarm": false
}
⚠️ Output format note: The model was trained toward JSON output but does not produce it by default. It works best when you explicitly instruct it to respond in JSON via the system prompt. The real gain is improved domain understanding of UCF Crime CCTV imagery — see evaluation below.
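Because the model does not always emit clean JSON, downstream code may want to parse its reply defensively. The following is a minimal sketch, not part of the original training or evaluation code: the helper name `parse_harm_output` is hypothetical, and it assumes the flat schema shown above (it also accepts a plain `description` key, in case the prompt asked for that name instead).

```python
import json
import re
from typing import Optional


def parse_harm_output(text: str) -> Optional[dict]:
    """Extract the first flat JSON object from a model reply.

    Hypothetical helper: returns None when no usable object is found,
    so the caller can decide how to treat unparseable replies.
    """
    # The schema is flat, so a non-greedy match up to the first "}" is enough.
    # (A description containing "}" would be cut short and rejected.)
    match = re.search(r"\{.*?\}", text, flags=re.DOTALL)
    if match is None:
        return None
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or not isinstance(data.get("isHarm"), bool):
        return None
    # Prefer the trained key; fall back to a plain "description" key.
    description = data.get("descriptionIfHarm", data.get("description"))
    return {
        "isHarm": data["isHarm"],
        "descriptionIfHarm": description if isinstance(description, str) else None,
    }


reply = 'Sure. {"isHarm": true, "descriptionIfHarm": "The image depicts fighting."}'
print(parse_harm_output(reply))
```

Returning `None` rather than raising keeps the decision about unparseable replies (retry, flag for review, treat as harmful) with the caller.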
Training Details
| Parameter | Value |
|---|---|
| Hardware | Google Colab T4 (free tier) |
| Training time | ~10 hours |
| Eval time | ~3 hours |
| Dataset split | 80/20 |
| Training samples | ~26,000 images |
| Eval samples | ~5,600 images |
| Epochs | 1 |
| LoRA rank (r) | 16 |
| LoRA alpha | 16 |
| Learning rate | 2e-4 |
| Batch size | 2 (grad accum: 4) |
| Optimizer | adamw_8bit |
| Max seq length | 2048 |
| Vision layers | Frozen |
| Language layers | Fine-tuned |
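The batch and LoRA settings in the table imply an effective batch size, a LoRA scaling factor, and an approximate step count. The sketch below is back-of-the-envelope arithmetic only: it assumes a single GPU, standard LoRA scaling (alpha / r, no rsLoRA), and takes the approximate sample count and training time from the table at face value.

```python
import math

# Values copied from the training table above.
per_device_batch_size = 2
gradient_accumulation_steps = 4
lora_rank = 16
lora_alpha = 16
train_samples = 26_000   # approximate ("~26,000 images")
training_hours = 10      # approximate ("~10 hours")

# Samples consumed per optimizer step (single-GPU assumption).
effective_batch_size = per_device_batch_size * gradient_accumulation_steps

# Standard LoRA scales the adapter update by alpha / r.
lora_scaling = lora_alpha / lora_rank

# One epoch over the training samples.
steps_per_epoch = math.ceil(train_samples / effective_batch_size)

# Rough wall-clock cost per optimizer step.
seconds_per_step = training_hours * 3600 / steps_per_epoch

print(f"effective batch size: {effective_batch_size}")
print(f"LoRA scaling: {lora_scaling}")
print(f"optimizer steps per epoch: {steps_per_epoch}")
print(f"approx. seconds per step: {seconds_per_step:.1f}")
```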
The full UCF Crime dataset has ~600k images (1,900+ CCTV videos). Training on the full set would take weeks on a free T4, so a balanced subset of 26k images (1,000 per crime class + equal normal samples) was used.
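The balanced subset described above could be drawn roughly as follows. This is an illustrative sketch, not the notebook's actual preprocessing: the function name `balanced_subset`, the `(path, label)` input format, and the label spellings are assumptions.

```python
import random
from collections import defaultdict

# The 13 crime classes listed in this card (label spellings are assumed).
CRIME_CLASSES = [
    "Abuse", "Arrest", "Arson", "Assault", "Burglary", "Explosion", "Fighting",
    "Robbery", "Shooting", "Shoplifting", "Stealing", "Vandalism", "Road Accident",
]


def balanced_subset(samples, per_class=1000, normal_label="Normal", seed=0):
    """Pick up to `per_class` items per crime class plus an equal number of normals.

    `samples` is a list of (path, label) pairs (hypothetical input format).
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append(path)

    subset = []
    for label in CRIME_CLASSES:
        paths = by_label.get(label, [])
        chosen = rng.sample(paths, min(per_class, len(paths)))
        subset.extend((path, label) for path in chosen)

    # Match the total number of crime samples with normal samples.
    n_crime = len(subset)
    normals = by_label.get(normal_label, [])
    chosen_normals = rng.sample(normals, min(n_crime, len(normals)))
    subset.extend((path, normal_label) for path in chosen_normals)

    rng.shuffle(subset)
    return subset
```

With the defaults and enough source images per class, this yields 13 × 1,000 crime samples plus 13,000 normal samples, i.e. 26,000 images in total.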
Evaluation (5,200 samples)
Evaluated against the base model using an LLM judge on a held-out test set.
| Model | Accuracy |
|---|---|
| Base model (LFM2.5-VL-1.6B, not fine-tuned) | 35.2% |
| LoRA fine-tuned (this model) | 44.8% |
The fine-tuned model improves on the base model by 9.6 percentage points on UCF Crime CCTV imagery, which suggests that even a small 26k-image subset meaningfully improves domain understanding, all from a free Colab session.
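To keep the absolute and relative views of this gain apart, the arithmetic can be checked directly from the two accuracies in the table. This is a small worked example using only the reported numbers:

```python
base_accuracy = 35.2        # base model accuracy, in percent (from the table)
finetuned_accuracy = 44.8   # fine-tuned accuracy, in percent (from the table)

# Absolute gain, in percentage points.
absolute_gain_pp = round(finetuned_accuracy - base_accuracy, 1)

# Relative gain, as a percentage of the base model's accuracy.
relative_gain_pct = round(100 * (finetuned_accuracy - base_accuracy) / base_accuracy, 1)

print(f"absolute gain: {absolute_gain_pp} percentage points")
print(f"relative gain: {relative_gain_pct}% over the base model")
```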
Usage
Load with Unsloth (recommended)
from unsloth import FastVisionModel
from PIL import Image
model, tokenizer = FastVisionModel.from_pretrained(
model_name="rajofearth/lfm-ucf-unsloth",
max_seq_length=2048,
load_in_4bit=False,
attn_implementation="eager",
)
FastVisionModel.for_inference(model)
image = Image.open("your_surveillance_image.jpg").convert("RGB")
system_prompt = """Analyze this frame with extreme caution. Detect ANY potential harm.
If ANY doubt, flag as harmful. Reply ONLY in strict JSON:
{"isHarm": true or false, "descriptionIfHarm": "brief exact reason, only when isHarm is true"}. No explanations outside JSON."""
messages = [
{
"role": "user",
"content": [
{"type": "text", "text": system_prompt},
{"type": "image", "image": image}
]
}
]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(text=prompt, images=image, return_tensors="pt").to("cuda")
# temperature only takes effect when sampling is enabled
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.2)
input_len = inputs["input_ids"].shape[1]
print(tokenizer.decode(outputs[0][input_len:], skip_special_tokens=True))
Tip: Always include the system prompt instructing JSON output — the model is trained for it but won't default to it without being told.
Load with Transformers + PEFT
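from transformers import AutoModelForImageTextToText, AutoProcessor
from peft import PeftModel
# Load the base vision-language model, then attach the LoRA adapters from this repo.
base = AutoModelForImageTextToText.from_pretrained("LiquidAI/LFM2.5-VL-1.6B")
model = PeftModel.from_pretrained(base, "rajofearth/lfm-ucf-unsloth")
# The processor (tokenizer + image preprocessing) comes from the base model.
processor = AutoProcessor.from_pretrained("LiquidAI/LFM2.5-VL-1.6B")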
Reproduce This Fine-Tune
Want to train this yourself or adapt it to your own surveillance dataset? The full training notebook is free and public. It covers:
- Setting up Unsloth + LFM2.5-VL-1.6B on a free T4
- Loading and preprocessing the UCF Crime dataset
- LoRA fine-tuning with vision layers frozen
- Evaluation using an LLM judge
- Exporting to GGUF for llama.cpp / Ollama
Related
| Resource | Link |
|---|---|
| Base model | LiquidAI/LFM2.5-VL-1.6B |
| GGUF version | rajofearth/lfm-ucf-gguf |
| Training notebook | Google Colab |
| Dataset | tanzzpatil/ucf-crime-small |
Developed by: rajofearth · Created with Unsloth + Google Colab (free tier).
