These are AMD GFX906-focused GGUF quantizations of Kimi-Linear-48B-A3B-Instruct.

For GFX906 users: Kimi-Linear support has been merged into the llama.cpp-gfx906 fork.

You can clone it and compile locally with the following commands:

git clone https://github.com/iacopPBK/llama.cpp-gfx906.git
cd llama.cpp-gfx906
./SCRIPT_compile_MI50.sh  # edit ROCM_PATH if not using /opt/rocm
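Once the build finishes, the resulting binaries can load the GGUF files from this repo. A minimal sketch, assuming the build output lands in build/bin and a 4-bit quant file named Kimi-Linear-48B-A3B-Instruct-Q4_K_M.gguf (the exact filename is an assumption; check this repo's file list for what you actually downloaded):

```shell
# Run a quick prompt against the quantized model.
# -m     points at the downloaded GGUF file (name assumed, see above)
# -ngl 99 offloads all layers to the GFX906 GPU
./build/bin/llama-cli \
  -m Kimi-Linear-48B-A3B-Instruct-Q4_K_M.gguf \
  -ngl 99 -p "Hello"
```

The same `-m`/`-ngl` flags work with `llama-server` if you prefer an OpenAI-compatible HTTP endpoint instead of the CLI.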

Full credit for the Kimi-Linear implementation goes to ymcki! See their GitHub repo here.

Format: GGUF
Model size: 49B params
Architecture: kimi-linear

Available quantizations: 4-bit, 16-bit


Model repository: Kamali-Lab/Kimi-Linear-48B-A3B-Instruct-GGUF