https://huggingface.co/internlm/Intern-S2-Preview

#2395

by SkyMind - opened 11 days ago

Discussion

SkyMind

11 days ago

Looks quite interesting. A Qwen3.5 35B MoE model outperforming Gemini-3.1-Flash on several benchmarks.

https://huggingface.co/internlm/Intern-S2-Preview

They provide https://huggingface.co/internlm/Intern-S2-Preview-FP8 quants if that's useful.

This GGUF repo has notes that might be relevant: https://huggingface.co/crogers2287/Intern-S2-Preview-FP8-GGUF (it does not appear to have the Q4_K_M weights yet).

Thanks!

simonko912

11 days ago

It's queued!
You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#Intern-S2-Preview-GGUF for quants to appear.

SkyMind

4 days ago

Still not showing up on https://hf.tst.eu/model#Intern-S2-Preview-GGUF

RichardErkhov

4 days ago

ERROR:hf-to-gguf:Model InternS2PreviewForConditionalGeneration is not supported

Hey, the model doesn't seem to be supported by current llama.cpp, or at least it wasn't a week ago. Remind me in a few days; we will update llama.cpp and try again. Hopefully it will work =)

nicoboss

3 days ago

llama.cpp lacks support for the entire InternS2PreviewForConditionalGeneration architecture. We don't need to retry until the following search yields a result: https://github.com/search?q=repo%3Aggml-org%2Fllama.cpp+InternS2PreviewForConditionalGeneration&type=code
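The same check can be run offline against a local llama.cpp checkout. What follows is only a rough sketch, not how the queue actually decides anything: it assumes the converter script (`convert_hf_to_gguf.py`) names each supported architecture as a quoted string literal, and it reads the model's architecture from the standard `architectures` field of `config.json`. The function names and file paths are made up for illustration.

```python
import json
from pathlib import Path


def model_architectures(config_text: str) -> list[str]:
    """Return the architecture names listed in a Hugging Face config.json."""
    return list(json.loads(config_text).get("architectures", []))


def unsupported_architectures(config_text: str, converter_source: str) -> list[str]:
    """Return architectures that never appear as a quoted string in the converter.

    Assumption: the converter registers architectures by their quoted name,
    so a plain substring search is a good enough signal.
    """
    return [
        arch
        for arch in model_architectures(config_text)
        if f'"{arch}"' not in converter_source and f"'{arch}'" not in converter_source
    ]


def check_files(config_path: str, converter_path: str) -> list[str]:
    """Convenience wrapper for files on disk (both paths are hypothetical)."""
    return unsupported_architectures(
        Path(config_path).read_text(encoding="utf-8"),
        Path(converter_path).read_text(encoding="utf-8"),
    )
```

An empty list from `check_files("Intern-S2-Preview/config.json", "llama.cpp/convert_hf_to_gguf.py")` would suggest a retry is worthwhile; a non-empty list names what is still missing.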

Thanks for the recommendation. I'm extremely interested in trying this model. It is apparently supported by vLLM, so I will immediately run inference for it on Richard's supercomputer to try it out.

SkyMind

about 18 hours ago

edited about 18 hours ago

The conversion is described here:

https://huggingface.co/crogers2287/Intern-S2-Preview-FP8-GGUF/blob/main/docs/CONVERSION_AND_SERVING.md

and the patches are in https://huggingface.co/crogers2287/Intern-S2-Preview-FP8-GGUF/tree/main/patches

A higher level description is at the top of:
https://huggingface.co/crogers2287/Intern-S2-Preview-FP8-GGUF

RichardErkhov

about 18 hours ago

That's the problem: we don't work with forks of llama.cpp, so until the patches are merged into main, we will not be able to process it.
