Running Qwen3-VL-2B-Instruct on real security camera feeds β impressive results at IQ2 quantization
Sharing some real-world results from running this model on live security camera footage via SharpAI Aegis + llama-server.
Setup: UD-IQ2_M quantization (0.7 GB) + mmproj-F16 (781 MB) on MacBook Air M3 24GB.
Input: A Blink battery camera mount at front door.
Output: "A mailman is delivering mail to a suburban house. The mailman is wearing a blue uniform and carrying a white mail bag. The house is white with a brown roof, and there's a driveway with a black car parked in front. The mailman is walking on a brick path surrounded by green bushes and trees."
For a 2B model at aggressive quantization, the scene comprehension is remarkably detailed β it correctly identifies the person's role (mailman), clothing, objects, the environment, and spatial relationships.
This is being used in a real product for continuous security camera analysis. The model runs comfortably on a Mac Mini with 8 GB RAM alongside other system tasks.
Great work on the GGUF conversion β the Unsloth chat template fixes are appreciated!
App: https://www.sharpai.org (free, Mac/Windows/Linux)