---
license: apache-2.0
---
High-performance quantized GGUF builds of Cohere's North Code model.
Optimized for local inference via llama.cpp, LM Studio, and Ollama.
Search for "North Code Quant" in the LM Studio search bar, select your preferred quantization level from the sidebar, and click Download.
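For Ollama, a locally downloaded GGUF file can usually be registered through a Modelfile. The following is a minimal sketch, not an official recipe: it assumes the Q4_K_M file has been downloaded into the current directory under the same name used in the llama-cli example, and the model name `north-code-quant` is a hypothetical label chosen for illustration.

```bash
#!/usr/bin/env bash
set -eu

GGUF="north-code-quant-Q4_K_M.gguf"   # assumed local file name (matches the llama-cli example)
NAME="north-code-quant"               # hypothetical local model name

# Write a minimal Modelfile that points Ollama at the local GGUF file
# and sets the context window to 8192 tokens.
cat > Modelfile <<EOF
FROM ./$GGUF
PARAMETER num_ctx 8192
EOF

# Register and run the model only if Ollama and the GGUF file are present.
if command -v ollama >/dev/null 2>&1 && [ -f "$GGUF" ]; then
  ollama create "$NAME" -f Modelfile
  ollama run "$NAME" "def fibonacci(n):"
else
  echo "Modelfile written; install Ollama and download $GGUF to continue."
fi
```

The guard keeps the script safe to run before the download has finished; in that case only the Modelfile is written.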
```bash
./llama-cli -m north-code-quant-Q4_K_M.gguf \
  --ctx-size 8192 \
  --threads $(nproc) \
  --prompt "def fibonacci(n):"
```
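The command above relies on `nproc`, which ships with GNU coreutils and is not available by default on macOS. One possible workaround is sketched below: `detect_threads` is a hypothetical helper (not part of llama.cpp) that tries `nproc`, then `sysctl -n hw.ncpu`, and finally falls back to an assumed default of 4 threads.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print the number of logical CPUs on this machine.
detect_threads() {
  if command -v nproc >/dev/null 2>&1; then
    nproc                                   # Linux (GNU coreutils)
  elif command -v sysctl >/dev/null 2>&1 && sysctl -n hw.ncpu >/dev/null 2>&1; then
    sysctl -n hw.ncpu                       # macOS / BSD
  else
    echo 4                                  # assumed conservative fallback
  fi
}

MODEL="north-code-quant-Q4_K_M.gguf"        # same file name as the command above
THREADS="$(detect_threads)"

# Run llama-cli only if both the binary and the model file are present.
if [ -x ./llama-cli ] && [ -f "$MODEL" ]; then
  ./llama-cli -m "$MODEL" --ctx-size 8192 --threads "$THREADS" \
    --prompt "def fibonacci(n):"
else
  echo "Would run with --threads $THREADS (llama-cli or $MODEL not found here)"
fi
```

Using every logical core is not always fastest; on machines with hyper-threading or efficiency cores, a lower `--threads` value may be worth trying.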
Files are sorted by size and quality. Q4_K_M is recommended for most users as the best balance of speed and quality (as measured by perplexity).
| File Name | Quant Type | Size | Description |
|---|---|---|---|
| North-Code-Quant.gguf | Q8_0 | -- GB | Near-lossless. Best quality, higher VRAM/RAM requirement. |
These GGUF files were converted from the official Cohere North Code weights using llama.cpp with importance matrix calibration for optimal token-level precision retention.
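The exact commands and calibration data used for these files are not stated here. As a rough illustration only, an importance-matrix quantization with llama.cpp typically has three steps: convert the original weights to a full-precision GGUF, compute the importance matrix from a calibration text, then quantize with that matrix. In the sketch below every path and file name is a hypothetical placeholder, and it assumes the llama.cpp converter supports this architecture. The script prints the commands by default and executes them only when `DRY_RUN=0` is set.

```bash
#!/usr/bin/env bash
set -eu

# All paths below are hypothetical placeholders, not the ones used for this repository.
HF_DIR="./north-code-hf"              # directory with the original weights
F16="north-code-f16.gguf"             # intermediate full-precision GGUF
CALIB="calibration.txt"               # calibration text for the importance matrix
IMATRIX="imatrix.dat"                 # importance matrix output
OUT="north-code-quant-Q4_K_M.gguf"    # final quantized file

# Print each command; execute it only when DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# 1. Convert the original weights to a 16-bit GGUF file.
run python convert_hf_to_gguf.py "$HF_DIR" --outfile "$F16" --outtype f16
# 2. Compute the importance matrix from the calibration text.
run ./llama-imatrix -m "$F16" -f "$CALIB" -o "$IMATRIX"
# 3. Quantize, using the importance matrix to guide precision allocation.
run ./llama-quantize --imatrix "$IMATRIX" "$F16" "$OUT" Q4_K_M
```

Running it as-is only lists the three commands, which makes it easy to review paths before committing to a long conversion.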