Request: GGUF / quantized weights for Intern-S1-Pro
#7 · opened by gileneo
Hi InternLM team & community,
Is there any plan to release GGUF or other quantized weights (INT4/INT8, AWQ, GPTQ, MLX) for Intern-S1-Pro?
Given that Intern-S1-Pro is a MoE model (~1T total params, ~22B active), a GGUF quant with proper expert routing would make it far more accessible for local inference and research, especially on large-memory systems (e.g. a 512 GB Mac Studio).
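To put rough numbers on the memory argument, here is a back-of-the-envelope sketch of the on-disk/in-RAM weight footprint at common GGUF quant levels. The bits-per-weight figures are approximate community averages for llama.cpp K-quants, not official values, and the ~1T parameter count is taken from the description above:

```python
def quantized_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight footprint in GB for a given quantization level."""
    return n_params * bits_per_weight / 8 / 1e9

N = 1e12  # ~1T total parameters (MoE total, not active)

# Approximate average bits-per-weight for common GGUF quants (assumption).
for name, bpw in [("Q8_0", 8.5), ("Q4_K_M", 4.8), ("Q3_K_M", 3.9), ("Q2_K", 2.6)]:
    print(f"{name}: ~{quantized_size_gb(N, bpw):.0f} GB")
```

Under these assumptions, a Q4-class quant (~600 GB) would still exceed 512 GB of unified memory, while a Q3-class quant (~490 GB) would just fit, which is why guidance on which quant levels preserve MoE expert quality would be so useful.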
Even an experimental / research-grade GGUF (or guidance on recommended quantization settings for MoE experts) would be extremely valuable for the community.
Thanks a lot for the great work on InternLM models!