Use from the llama-cpp-python library
# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="hell0ks/MiMo-V2.5-GGUF-ko",
	filename="",  # set to one of the .gguf files in this repo
)
llm.create_chat_completion(
	messages=[
		{"role": "user", "content": "Hello"}
	]
)
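llama-cpp-python returns an OpenAI-style completion dict from `create_chat_completion`, with the reply text at `choices[0]["message"]["content"]`. A minimal sketch of extracting it, using a hypothetical sample response (no model download required):

```python
# Sketch: create_chat_completion returns an OpenAI-style completion dict;
# the assistant's reply sits at choices[0]["message"]["content"].
def extract_reply(response: dict) -> str:
    """Pull the assistant's text out of a chat-completion response."""
    return response["choices"][0]["message"]["content"]

# Hypothetical response shaped like the library's actual output:
sample = {
    "choices": [
        {"message": {"role": "assistant", "content": "안녕하세요!"}}
    ]
}
print(extract_reply(sample))  # → 안녕하세요!
```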

MiMo-V2.5-GGUF-ko

This is essentially the same as AesSedai/MiMo-V2.5-GGUF, but quantized with a Korean-focused imatrix calibration dataset.

How it was made

  • Generate a Korean-focused imatrix calibration dataset (2048 samples total; chat template applied for message-type data)

    | Dataset                            | Language | Type     | Topic          | Share, context length |
    |------------------------------------|----------|----------|----------------|-----------------------|
    | EleutherAI/the_pile_deduplicated   | English  | text     | English corpus | 15%, 16384            |
    | KORMo-Team/Cosmopedia-ko-synth     | Korean   | text     | Korean corpus  | 35%, 8192             |
    | Korean chat generated with Gemma 4 | Korean   | messages | Korean chat    | 35%, 8192             |
    | SWE-bench/SWE-smith-trajectories   | English  | messages | Tool calling   | 15%, 8192             |
  • Create the imatrix using llama-imatrix with the --parse-special flag

  • Quantize following AesSedai's method

export TYPE_FFN_UP_EXPS=IQ2_S
export TYPE_FFN_GATE_EXPS=IQ2_S
export TYPE_FFN_DOWN_EXPS=IQ3_S
export TYPE_DEFAULT=Q6_K

llama-quantize \
    --allow-requantize \
    --tensor-type ffn_up_exps=$TYPE_FFN_UP_EXPS \
    --tensor-type ffn_gate_exps=$TYPE_FFN_GATE_EXPS \
    --tensor-type ffn_down_exps=$TYPE_FFN_DOWN_EXPS \
    --imatrix $imatrix $gguf_path $TYPE_DEFAULT
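The imatrix step above names the tool and the flag but not the full invocation. A minimal command sketch, where `MiMo-V2.5-BF16.gguf`, `calibration.txt`, and `imatrix.gguf` are hypothetical placeholder filenames and flag spellings follow recent llama.cpp builds (check `llama-imatrix --help` for yours):

```shell
# Sketch: build the imatrix from the calibration set (filenames are placeholders).
# --parse-special keeps chat-template special tokens intact, as noted above.
llama-imatrix \
    -m MiMo-V2.5-BF16.gguf \
    -f calibration.txt \
    -o imatrix.gguf \
    --parse-special
```

The resulting file is what the `--imatrix $imatrix` argument in the quantize command above points to.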

Acknowledgement

Special thanks to AesSedai for implementing MiMo support in llama.cpp and for the quantization method.

Model details

  • Format: GGUF
  • Size: 310B params
  • Architecture: mimo2
  • Quantization: 3-bit
