This repository contains GGUF-format files for Yi 34B "llamafied" to the standard LLaMA architecture, based on the work in the chargoddard/Yi-34B-Llama repository.

Following chargoddard's work:
- Tensors have been renamed to match the standard LLaMA naming scheme.
- The model can be loaded without `trust_remote_code`, but the tokenizer cannot.
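As a minimal sketch of what that means in practice (assuming the `transformers` library and the upstream `chargoddard/Yi-34B-Llama` repo id; not verified against a live download here):

```python
REPO_ID = "chargoddard/Yi-34B-Llama"  # upstream llamafied repository

def load_llamafied(repo_id: str = REPO_ID):
    """Load the llamafied model and tokenizer.

    Because the tensors are renamed to the standard LLaMA scheme, the
    model loads through the stock code path; only the tokenizer still
    ships custom code and therefore needs trust_remote_code=True.
    """
    # Imported inside the function so the module can be inspected
    # without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(repo_id)
    tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
    return model, tokenizer
```

Note this applies to the original safetensors/PyTorch weights; the GGUF files in this repo are consumed by llama.cpp-based runtimes instead.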
## Converted & Quantized Files

### Yi-34B-Llamafied Model Options
The following tables list the available Yi-34B-Llamafied model files with their respective quantization methods and characteristics.
Key:
- Size: File size relative to the original (unquantized) weights.
- Quality Loss: Approximate quality degradation introduced by quantization.
| Q-Method | File Name | Size | Quality Loss | Recommended |
|---|---|---|---|---|
| Q2 | Yi-34B-Llama_Q2_K | Smallest | Extreme (not recommended) | |
| Q3 | Yi-34B-Llama_Q3_K_S | Very Small | Very High | |
| Q3 | Yi-34B-Llama_Q3_K_M | Very Small | Very High | |
| Q3 | Yi-34B-Llama_Q3_K_L | Small | Substantial | |
| Q4 | Yi-34B-Llama_Q4_K_S | Small | Significant | |
| Q4 | Yi-34B-Llama_Q4_K_M | Medium | Balanced | Recommended |
| Q5 | Yi-34B-Llama_Q5_K_S | Large | Low | Recommended |
| Q5 | Yi-34B-Llama_Q5_K_M | Large | Very Low | Recommended |
| Q6 | Yi-34B-Llama_Q6_K | Very Large | Extremely Low | |
| Q8 | Yi-34B-Llama_Q8_0 | Very Large | Extremely Low (not recommended) | |
Please choose the model that best suits your needs based on the size and quality loss trade-offs.
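To make the size trade-off concrete, here is a rough back-of-the-envelope estimate of file sizes for a 34B-parameter model. The bits-per-weight figures are approximations I am assuming for each quantization format, not official numbers:

```python
# Approximate bits per weight for common GGUF quantization formats.
# These values are rough assumptions for illustration only.
BITS_PER_WEIGHT = {
    "Q2_K": 2.6,
    "Q3_K_M": 3.9,
    "Q4_K_M": 4.8,
    "Q5_K_M": 5.7,
    "Q6_K": 6.6,
    "Q8_0": 8.5,
}

def estimated_size_gb(n_params: float, quant: str) -> float:
    """Approximate GGUF file size in gigabytes for a given format."""
    bits = BITS_PER_WEIGHT[quant]
    return n_params * bits / 8 / 1e9  # bits -> bytes -> GB

for quant in BITS_PER_WEIGHT:
    print(f"{quant}: ~{estimated_size_gb(34e9, quant):.1f} GB")
```

The same calculation explains why Q4_K_M is the usual sweet spot: it roughly triples the compression of Q8_0 while keeping quality loss moderate.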