This model is a Mixture-of-Experts (MoE) created by merging multiple Qwen3 1.7B variants, for a total of 32 experts. The experts span logic, reasoning, coding, mathematics, psychology, conversation, creativity, and roleplaying.
The model runs fairly fast: I get around 30 tokens/s when offloading entirely to CPU, though your results may vary depending on hardware.
As a Qwen3-based model, it can emit think blocks, which can be disabled by adding "/no_think" to the system prompt.
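For example, when talking to the model through a standard chat-messages interface, the tag is simply prepended to the system prompt. This is a minimal sketch; the helper function name and the serving backend are assumptions, but the messages structure is the usual chat format.

```python
# Minimal sketch: disabling Qwen3-style think blocks by placing
# "/no_think" in the system prompt. build_messages is a hypothetical
# helper; the serving backend (e.g. a llama.cpp server) is assumed.

def build_messages(system_prompt: str, user_prompt: str, thinking: bool = True):
    """Return a chat-messages list, prefixing "/no_think" when thinking is off."""
    if not thinking:
        system_prompt = "/no_think " + system_prompt
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

messages = build_messages("You are a helpful assistant.", "Hello!", thinking=False)
print(messages[0]["content"])  # → /no_think You are a helpful assistant.
```

The resulting `messages` list can be passed as-is to any OpenAI-compatible chat completion endpoint serving the model.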