Request: GGUF / quantized weights for Intern-S1-Pro

#7
by gileneo - opened

Hi InternLM team & community,

Is there any plan to release GGUF or other quantized weights (INT4/INT8, AWQ, GPTQ, MLX) for Intern-S1-Pro?

Given that Intern-S1-Pro is a MoE model (~1T total parameters, ~22B active), a GGUF quant with proper expert routing support would make it far more accessible for local inference and research, especially on large-memory systems (e.g. a Mac Studio with 512 GB of RAM).

Even an experimental / research-grade GGUF (or guidance on recommended quantization settings for MoE experts) would be extremely valuable for the community.
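For context, what I have in mind is the standard llama.cpp conversion/quantization pipeline. A rough sketch below; the paths are placeholders, and converter support for the Intern-S1-Pro architecture is an assumption on my part (it may not be recognized yet):

```shell
# Hypothetical sketch, assuming convert_hf_to_gguf.py recognizes the
# Intern-S1-Pro architecture (not confirmed) and that the local
# checkpoint lives at ./Intern-S1-Pro.

# 1. Convert the HF checkpoint to a full-precision GGUF
python convert_hf_to_gguf.py ./Intern-S1-Pro \
    --outfile intern-s1-pro-f16.gguf --outtype f16

# 2. Quantize; Q4_K_M is a common size/quality trade-off
./llama-quantize intern-s1-pro-f16.gguf intern-s1-pro-q4_k_m.gguf Q4_K_M
```

Since the expert FFN tensors dominate the parameter count in a MoE model, the quant type chosen for those tensors matters most, which is exactly where guidance from the team would help.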

Thanks a lot for the great work on InternLM models!
