Qwen3.5-Step-3.5-Flash-Distilled or More?
#1, opened by Rebis
Hi,
Do you plan to train Qwen3.5-4B or 9B in the future on some of your datasets (Erudite, HexaDomain, Step-3.5-Flash-Distilled, UltraThinker, ...) while retaining its multimodal capabilities?
Thank you for everything.
While the datasets I make are text-only, I will try to create a multimodal dataset and train Qwen3.5-4B or 9B on it.