# smirki/tessa-coder-9b-sft-v0-GGUF

GGUF quantizations of smirki/tessa-coder-9b-sft-v0.

Converted with a recent build of llama.cpp that includes Qwen 3.5 support.

## Available Quantizations

| Quant | Use Case |
|-------|----------|
| Q2_K | Extreme compression |
| Q3_K_S / Q3_K_M / Q3_K_L | Small footprint |
| Q4_0 / Q4_K_S / Q4_K_M | Recommended |
| Q5_0 / Q5_K_S / Q5_K_M | High quality |
| Q6_K | Near-lossless |
| Q8_0 | Highest-quality quant |
| bf16 | Full precision |

## Usage

```sh
# Install llama.cpp (macOS/Linux via Homebrew)
brew install llama.cpp

# One-shot generation, pulling the Q4_K_M file directly from the Hub
llama-cli --hf-repo smirki/tessa-coder-9b-sft-v0-GGUF \
  --hf-file tessa-coder-9b-sft-v0-q4_k_m.gguf \
  -p "Your prompt here"

# Serve the model with an 8192-token context window
llama-server --hf-repo smirki/tessa-coder-9b-sft-v0-GGUF \
  --hf-file tessa-coder-9b-sft-v0-q4_k_m.gguf \
  -c 8192
```
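Once running, llama-server exposes an OpenAI-compatible HTTP API (on port 8080 by default). A minimal client sketch, assuming the default host/port and llama.cpp's `/v1/chat/completions` endpoint — the prompt text and sampling parameters here are illustrative, not part of the model card:

```python
import json
import urllib.request

def build_chat_request(prompt: str, host: str = "http://localhost:8080") -> urllib.request.Request:
    """Build an OpenAI-style chat-completion request for a local llama-server."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,   # low temperature suits code generation
        "max_tokens": 512,
    }
    return urllib.request.Request(
        f"{host}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

# To actually send the request (requires llama-server to be running):
# req = build_chat_request("Write a FizzBuzz in Python")
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```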
## Model Details

- Format: GGUF
- Model size: 9B params
- Architecture: qwen35