GGUF importance matrix (imatrix) quants for https://huggingface.co/NousResearch/Nous-Capybara-34B
The importance matrix was computed over 100K tokens (200 batches of 512 tokens) using wiki.train.raw as calibration data.
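For reference, an imatrix like this is typically produced with llama.cpp's `llama-imatrix` tool. The sketch below shows an invocation matching the numbers above (200 chunks of 512 tokens); the source-model filename and output path are placeholders, not files shipped with this repo.

```shell
# Sketch: compute an importance matrix with llama.cpp (filenames are placeholders).
# -c 512 sets the chunk size in tokens; --chunks 200 gives 200 * 512 = ~100K tokens.
./llama-imatrix \
  -m Nous-Capybara-34B-f16.gguf \
  -f wiki.train.raw \
  -c 512 \
  --chunks 200 \
  -o imatrix.dat
```

The resulting `imatrix.dat` is then passed to `llama-quantize` to weight the quantization.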

Although this model is quite good, it is very sensitive to its prompt template: there must be no space after `ASSISTANT:`.

| Layers | Context | Template |
| --- | --- | --- |
| 60 | 200000 | `USER: {prompt}`<br>`ASSISTANT:{response}` |
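Since the template is whitespace-sensitive, a small helper makes it hard to get wrong. This is a minimal sketch (the function name `build_prompt` is ours, not part of the model's tooling); the key point is that the string ends with `ASSISTANT:` and no trailing space.

```python
def build_prompt(user_message: str) -> str:
    """Format a prompt for Nous-Capybara-34B.

    The model is sensitive to its template: there must be NO space
    (or newline) after "ASSISTANT:".
    """
    return f"USER: {user_message}\nASSISTANT:"

# Example:
print(build_prompt("What is GGUF?"))
# USER: What is GGUF?
# ASSISTANT:
```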
Model size: 34B params
Architecture: llama

