LocoTrainer-4B GGUF

This repository contains GGUF-format quantizations of LocoTrainer-4B, a distilled 4B-parameter reasoning model based on the Qwen3 architecture.

Available Quants

| File | Size | Description |
|---|---|---|
| LocoTrainer-4B-Q4_K_M.gguf | ~2.5 GB | 4-bit medium. Best balance of speed and quality for general use. |
| LocoTrainer-4B-Q5_K_M.gguf | ~2.8 GB | 5-bit medium. Higher quality with a slightly larger footprint. |
| LocoTrainer-4B-Q6_K.gguf | ~3.2 GB | 6-bit. Near-lossless quantization. |
| LocoTrainer-4B-Q8_0.gguf | ~4.2 GB | 8-bit. Maximum fidelity; output is virtually indistinguishable from the original weights. |
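As a rule of thumb, choose the largest quant that fits your memory budget. A minimal sketch of that selection logic (the `pick_quant` helper is illustrative, not part of any library; sizes are taken from the table above):

```python
# Approximate file sizes in GiB, from the quant table above.
QUANTS = {
    "LocoTrainer-4B-Q4_K_M.gguf": 2.5,
    "LocoTrainer-4B-Q5_K_M.gguf": 2.8,
    "LocoTrainer-4B-Q6_K.gguf": 3.2,
    "LocoTrainer-4B-Q8_0.gguf": 4.2,
}

def pick_quant(budget_gib: float) -> str:
    """Return the largest (highest-quality) quant that fits within budget_gib."""
    fitting = {name: size for name, size in QUANTS.items() if size <= budget_gib}
    if not fitting:
        raise ValueError("No quant fits; the smallest is Q4_K_M at ~2.5 GB")
    return max(fitting, key=fitting.get)

print(pick_quant(3.0))  # -> LocoTrainer-4B-Q5_K_M.gguf
```

Note that actual RAM/VRAM use will exceed the file size somewhat once the KV cache and compute buffers are allocated.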

Usage

llama.cpp

You can run these models with llama.cpp's `llama-cli`:

```sh
./llama-cli -m LocoTrainer-4B-Q4_K_M.gguf -p "What is the capital of France?" -n 128
```
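From Python, the same invocation can be scripted by shelling out to the binary. A hedged sketch (the `build_llama_cmd`/`generate` helpers are illustrative; it assumes `llama-cli` and the GGUF file sit in the working directory, and uses only the `-m`, `-p`, and `-n` flags shown above):

```python
import subprocess

def build_llama_cmd(model_path: str, prompt: str, n_predict: int = 128) -> list[str]:
    # Mirrors the llama-cli invocation shown above.
    return ["./llama-cli", "-m", model_path, "-p", prompt, "-n", str(n_predict)]

def generate(model_path: str, prompt: str, n_predict: int = 128) -> str:
    # Requires the llama-cli binary and the GGUF file to exist locally.
    cmd = build_llama_cmd(model_path, prompt, n_predict)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout

print(build_llama_cmd("LocoTrainer-4B-Q4_K_M.gguf", "What is the capital of France?"))
```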