Best Model Ever

#15
by yukiarimo - opened

Yo! This is the best model ever! I’ve just finished full SFT on my dataset and it worked! For the Qwen 4 VL 4B, please:

  1. Add 48 kHz audio input
  2. Do not add image or audio output. But if you do a TTS, please make it non-diffusion and phoneme-based, and invent a 48 kHz codec from scratch
  3. Still use full attention, but make it 10X faster on my MacBook
  4. Less RLHF. No NSFW restrictions or “I’m an AI” output bullshit.
  5. Try to add fewer custom tokens to the vocabulary. I will not use them anyway!
  6. Japanese, Russian, English, German, and French must be the priority.
  7. Still keep releasing 4B and non-thinking ones. I hate CoT!

For robotics:

Here are the next steps for the Qwen team: build a humanoid robot that is indistinguishable from a real-life human, with all possible movements. The user experience should allow the following:

  1. Price must be <$1000
  2. When ordering, you can send a .blend file and they’ll apply it (with custom scaling)
  3. When initializing the model, it must be possible to distill a trained Qwen model’s knowledge into the robot’s LAM
  4. After that, it must be dynamic enough to allow real-time learning and memory.

Thanks!
