This model is a v2 variant of our ~14B parameter upscale of Qwen/Qwen3.5-9B. It has been merged with the base model to boost knowledge while keeping instruction following, because the base model was actually trained with some instruct-like examples. (v1: https://huggingface.co/Pinkstackorg/Qwen-3.5-upscaled-14B-noft)
Compared to the first version, v2 has much better instruction following, fewer looping issues, and stronger overall performance and usability. If you plan to fine-tune, you should use this variant rather than v1.
You should fine-tune this model before it is fully usable. It retains its potential and all of its capabilities, including vision, but note that because it was merged with the base model, it may sometimes produce a different style of reasoning/CoT and differently styled outputs.
Because mergekit does not support Qwen3.5, we used a custom layer-wise merge method.
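The actual merge recipe is not published, but a layer-wise merge of this kind can be sketched as a per-parameter linear interpolation between the two checkpoints, keeping upscale-only layers untouched. The function name, the `alpha` weight, and the use of NumPy arrays as stand-ins for tensor state dicts are all illustrative assumptions, not the method used for this model:

```python
import numpy as np

def layerwise_merge(upscaled, base, alpha=0.5):
    """Merge two checkpoints layer by layer via linear interpolation.

    `upscaled` and `base` map parameter names to weight arrays;
    `alpha` is the weight given to the upscaled model. This is a
    hypothetical sketch -- the real merge recipe is unpublished.
    """
    merged = {}
    for name, w_up in upscaled.items():
        if name in base and base[name].shape == w_up.shape:
            # Shapes match: blend the two sets of weights.
            merged[name] = alpha * w_up + (1.0 - alpha) * base[name]
        else:
            # Layers added by the upscale have no base counterpart;
            # keep the upscaled weights unchanged.
            merged[name] = w_up
    return merged

# Toy example: one shared layer, one upscale-only layer.
up = {"layer.0.w": np.ones((2, 2)), "layer.1.w": np.full((2, 2), 3.0)}
bs = {"layer.0.w": np.zeros((2, 2))}
out = layerwise_merge(up, bs, alpha=0.5)
```

With `alpha=0.5`, the shared layer becomes the midpoint of the two checkpoints, while the upscale-only layer passes through unchanged.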