MERaLiON-2-3B-RotorQuant-MLX-2bit / generation_config.json
Add MLX 2-bit quantized model with KV cache compression
0f832ed verified
197 Bytes
{
  "_from_model_config": true,
  "bos_token_id": 2,
  "cache_implementation": "hybrid",
  "eos_token_id": 1,
  "no_repeat_ngram_size": 6,
  "pad_token_id": 0,
  "transformers_version": "4.50.1"
}
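
The settings above can be sanity-checked without loading the model. The following is a minimal sketch using only the Python standard library: it parses a copy of the config, collects the three special token IDs, and illustrates what `no_repeat_ngram_size` is meant to rule out, namely the same n-gram of token IDs appearing twice in one sequence. The helper names `load_config` and `has_repeated_ngram` are hypothetical and are not part of `transformers` or MLX; the check is an illustration of the constraint, not the library's actual decoding logic.

```python
import json

# Copy of the generation_config.json shown above.
CONFIG_TEXT = """
{
  "_from_model_config": true,
  "bos_token_id": 2,
  "cache_implementation": "hybrid",
  "eos_token_id": 1,
  "no_repeat_ngram_size": 6,
  "pad_token_id": 0,
  "transformers_version": "4.50.1"
}
"""


def load_config(text):
    """Parse the JSON config text into a dict (hypothetical helper)."""
    return json.loads(text)


def has_repeated_ngram(token_ids, n):
    """Return True if any n-gram occurs more than once in token_ids.

    Hypothetical helper: it only illustrates the property that a
    no-repeat n-gram setting is meant to prevent during generation.
    """
    seen = set()
    for i in range(len(token_ids) - n + 1):
        gram = tuple(token_ids[i:i + n])
        if gram in seen:
            return True
        seen.add(gram)
    return False


config = load_config(CONFIG_TEXT)

# Gather the special token IDs so they can be checked for collisions.
special_ids = {
    name: config[name]
    for name in ("bos_token_id", "eos_token_id", "pad_token_id")
}

n = config["no_repeat_ngram_size"]
looping = [1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6]  # the same 6-gram twice
print(special_ids)
print(has_repeated_ngram(looping, n))
```

With `no_repeat_ngram_size` set to 6, repeats of shorter spans remain allowed; only an exact repeat of six consecutive token IDs would be flagged by this check.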