Carnice-9b-GGUF

GGUF builds of Carnice-9b, a model built from Qwen/Qwen3.5-9B and fine-tuned specifically for the Hermes-Agent harness.

This repo contains three quantized variants:

  • Carnice-9b-Q4_K_M.gguf
  • Carnice-9b-Q6_K.gguf
  • Carnice-9b-Q8_0.gguf

Quantizations

File                     Quant   Size    Recommended use
Carnice-9b-Q4_K_M.gguf   4-bit   5.3 GB  fastest local testing
Carnice-9b-Q6_K.gguf     6-bit   6.9 GB  best quality/size balance
Carnice-9b-Q8_0.gguf     8-bit   8.9 GB  highest quality GGUF option
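
To fetch a single quant without cloning the whole repo, the Hugging Face CLI works; this is a minimal sketch assuming huggingface_hub is installed, using the Q6_K file as an example:

# Download one quant file from this repo (requires: pip install -U huggingface_hub)
huggingface-cli download kai-os/Carnice-9b-GGUF Carnice-9b-Q6_K.gguf --local-dir .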

Source model

Merged source model:

What it was trained for

Carnice-9b was trained for Hermes-Agent behavior rather than for generic chat polish. The training mixture emphasized:

  • Hermes-native terminal/file/browser trajectories
  • tool-oriented multi-turn agent behavior
  • reasoning-repair data to recover general reasoning after the first Hermes-specific tuning pass
  • a second Hermes refresh stage to pull the model back toward harness-native action formatting and tool usage

llama.cpp

Quick smoke test with llama-cli:

llama-cli -m Carnice-9b-Q6_K.gguf -p "Reply with exactly READY." -n 16
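
For longer sessions, llama-server from the same llama.cpp build exposes an OpenAI-compatible HTTP endpoint. A minimal sketch follows; the context size and port are illustrative choices, not repo recommendations:

# Serve the model over HTTP (llama.cpp's llama-server)
llama-server -m Carnice-9b-Q6_K.gguf -c 8192 --port 8080

# Query the OpenAI-compatible chat endpoint:
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Reply with exactly READY."}]}'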

Notes

These are GGUF exports of the merged standalone Carnice model, not PEFT adapters.
