generation_config.json · majentik/MERaLiON-2-3B-RotorQuant-MLX-2bit at main

Add MLX 2-bit quantized model with KV cache compression

Commit 0f832ed (verified)

{
  "_from_model_config": true,
  "bos_token_id": 2,
  "cache_implementation": "hybrid",
  "eos_token_id": 1,
  "no_repeat_ngram_size": 6,
  "pad_token_id": 0,
  "transformers_version": "4.50.1"
}
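
The following is a minimal sketch of how this config could be parsed and sanity-checked using only the Python standard library. The helper name `check_generation_config` is hypothetical, and the rule that the three special-token ids must be distinct non-negative integers is an assumption made for illustration, not something the repository states.

```python
import json

# Contents of generation_config.json, copied from the listing above.
CONFIG_TEXT = """
{
  "_from_model_config": true,
  "bos_token_id": 2,
  "cache_implementation": "hybrid",
  "eos_token_id": 1,
  "no_repeat_ngram_size": 6,
  "pad_token_id": 0,
  "transformers_version": "4.50.1"
}
"""


def check_generation_config(text):
    """Hypothetical helper: parse the config and check the special-token ids."""
    config = json.loads(text)
    keys = ("bos_token_id", "eos_token_id", "pad_token_id")
    token_ids = [config[key] for key in keys]

    # Assumption for illustration: ids are non-negative integers.
    for key, token_id in zip(keys, token_ids):
        if not isinstance(token_id, int) or token_id < 0:
            raise ValueError(f"{key} must be a non-negative integer, got {token_id!r}")

    # Assumption for illustration: BOS, EOS, and PAD use different ids.
    if len(set(token_ids)) != len(token_ids):
        raise ValueError(f"special-token ids are not distinct: {token_ids}")

    return config


config = check_generation_config(CONFIG_TEXT)
print(config["cache_implementation"], config["no_repeat_ngram_size"])
```

The same helper could be pointed at a local copy of the file by passing in the result of reading it as text instead of `CONFIG_TEXT`.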