Best Model Ever
#15, opened by yukiarimo
Yo! This is the best model ever! I’ve just finished full SFT on my dataset and it worked! For the Qwen 4 VL 4B, please:
- Add 48 kHz audio input
- Do not add image or audio output. But if you do add TTS, please make it non-diffusion and phoneme-based, and design a 48 kHz codec from scratch
- Still use full attention, but make it 10X faster on my MacBook
- Less RLHF. No NSFW restrictions or “I’m an AI” output bullshit.
- Try to add fewer custom tokens to the vocabulary. I will not use them, anyway!
- Japanese, Russian, English, German, and French must be the priority.
- Still keep releasing 4B and non-thinking ones. I hate CoT!
For robotics:
Here are the next steps for the Qwen team: build a humanoid robot that is indistinguishable from a real-life human and capable of all possible movements. The user experience should allow the following:
- Price must be <$1000
- When ordering, you can send a .blend file and they’ll apply it (along with custom scaling)
- When initializing the model, it must be possible to distill a trained Qwen model’s knowledge into the robot’s LAM
- After that, it must be dynamic enough to allow real-time learning and memory.
Thanks!