new models?

#5
by jacek2024 - opened

Hello, could you try your process with the new models, like https://huggingface.co/nvidia/Llama-3_3-Nemotron-Super-49B-v1_5 or some MoEs?

Yes, the Nemotrons are very interesting. I made an experimental tune of the 51B version here:
https://huggingface.co/Sicarius-Prototyping/Turbo_Grammar_51B_Alpha

I'm also currently tuning and testing a pruned Qwen3 MoE, which will take a few more days.
However, in initial testing the Qwen3 MoE showed some repetition issues. I'll try to see if those can be fixed.

Regarding the 49B, it has had my attention for quite some time, so I might eventually get to tuning it too.

Thank you for your interest :)

My point is that there are many 70B finetunes on HF but not many 49B ones, while the 49B is just an improved 70B :)
