Request GGUF of the model

#11

by FlameF0X - opened 20 days ago

Discussion

FlameF0X

20 days ago

•

edited 20 days ago

Most people can't run MLX because they don't have an Apple device, and GGUF is universal.

rethinkNow

Nemo Station org 16 days ago

Fair point, GGUF does reach more people. The blocker is M-RoPE: llama.cpp doesn't scale the temporal M-RoPE positions by fps for video, so the model can't map frames to real time and the timestamps come out wrong. Both Marlin's find and caption outputs are timestamped, so that breaks the core feature for everyone on GGUF, not just edge cases. It's a llama.cpp runtime gap (it hits all Qwen3.5 VL GGUFs), and the fix has to land upstream. Until it does, grounding stays on MLX (Apple) or transformers/vLLM (any GPU/CPU). We'd rather not ship a universal GGUF that's universally wrong on the "when".
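To make the fps-scaling point concrete, here is a minimal sketch of the idea, not the actual llama.cpp or Marlin code. The function name `temporal_positions` and the `tokens_per_second` parameter (with its default of 2) are assumptions made for illustration. With scaling, a frame's temporal position tracks elapsed seconds; without it, the position is just the frame index, so the same index means a different moment at every sampling rate.

```python
def temporal_positions(num_frames, fps, tokens_per_second=2, scale=True):
    """Return one temporal position index per sampled video frame.

    Hypothetical illustration only: `tokens_per_second` is an assumed
    constant that converts elapsed seconds into position units.
    """
    if not scale:
        # Unscaled: position is the raw frame index, regardless of fps.
        return list(range(num_frames))
    # Scaled: frame i sits at i / fps seconds, converted to position units.
    return [int(i * tokens_per_second / fps) for i in range(num_frames)]


# Same four frames, sampled at different rates.
slow = temporal_positions(4, fps=1)               # frames 1 second apart
fast = temporal_positions(4, fps=4)               # frames 0.25 seconds apart
naive = temporal_positions(4, fps=4, scale=False)  # fps ignored
print(slow, fast, naive)
```

In the unscaled case the positions are identical for a 1 fps clip and a 4 fps clip, which is why timestamps derived from them drift from real time.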
