Great work
I really like both the 35B and 27B Darwin models, and I must say they perform better than both parents, just as your results suggest. What surprised me more is that the distills didn't work well for me, but these models do.
Have you considered contacting some renowned quantizers like @bartowski to provide high-quality GGUFs and maybe some specialized quants? I would also love to see mixed-precision quants from @ubergarm and @steampunque, because Qwen3.5-family Q4-Q6 quants in particular outperform traditional quants when the input/output layers are kept at higher precision.
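For reference, the mixed-precision approach described above can be sketched with llama.cpp's quantization tool. This is only an illustrative command, assuming the `llama-quantize` CLI with its per-tensor override options; the file names are placeholders, not real releases:

```shell
# Sketch: quantize the bulk of the model to Q4_K_M while keeping the
# token-embedding (input) and output-head tensors at q8_0, which is the
# "high precision input/output layers" idea mentioned above.
# Assumption: llama.cpp's llama-quantize binary; model paths are placeholders.
./llama-quantize \
  --token-embedding-type q8_0 \
  --output-tensor-type q8_0 \
  model-f16.gguf model-mixed-Q4_K_M.gguf Q4_K_M
```

The trade-off is a slightly larger file in exchange for better fidelity in the layers that tend to be most sensitive to quantization error.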
I would also like to point you to the QwenPaw Flash 9B model. It's a Qwen3.5 9B fine-tune that outperforms the base model by a wide margin and may be a nice model to work with in your mother/father scenarios. While using this model I stumbled upon an adapter that is supposed to perform very well, but I couldn't try it due to the lack of GGUFs: https://huggingface.co/jason1966/CoPaw-Flash-9B-DataAnalyst-LoRA
Thank you very much for the thoughtful feedback. I really appreciate it.
I am especially glad that the Darwin 35B and 27B models worked well for you, and it is very encouraging to hear that they felt stronger than the parent models in actual use.
Also, bartowski has already kindly created and released a full GGUF quant set for Darwin-35B-A3B-Opus, which I am very grateful for:
https://huggingface.co/bartowski/FINAL-Bench_Darwin-35B-A3B-Opus-GGUF
Your point about high-quality specialized quants and mixed-precision approaches is also very valuable. I agree that this direction is important, especially for models in the Qwen3.5 family.
And thank you as well for mentioning QwenPaw Flash 9B. It sounds very interesting, and I will definitely test it as a possible parent candidate in future Darwin experiments.
I truly appreciate your support, suggestions, and careful testing.