GGUF

#1
by nelenpt - opened

Hi! I’m trying to run this locally.
I can load the HF checkpoint, but when I convert it to GGUF with llama.cpp's convert_hf_to_gguf.py and run it with llama.cpp (including the Vulkan backend on AMD), the output degenerates into repeated tokens such as mamama / plex / rients.
This looks like a tokenizer/vocab mismatch in the GGUF conversion pipeline.
Could you please publish an official GGUF (F16 + Q4_K_M) or provide the exact conversion steps/tooling you used that produces a working GGUF for LuxLlama?
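For reference, here is a sketch of the conversion pipeline I tried, using llama.cpp's standard tools (the model directory and output filenames are placeholders, not the exact paths I used):

```shell
# Convert the HF checkpoint to an F16 GGUF (path/filenames are placeholders)
python convert_hf_to_gguf.py /path/to/LuxLlama \
    --outtype f16 \
    --outfile luxllama-f16.gguf

# Quantize the F16 GGUF down to Q4_K_M with llama.cpp's quantize tool
./llama-quantize luxllama-f16.gguf luxllama-Q4_K_M.gguf Q4_K_M

# Quick sanity check: generate a few tokens from the quantized model
./llama-cli -m luxllama-Q4_K_M.gguf -p "Moien!" -n 64
```

With this pipeline the F16 and Q4_K_M outputs both show the repetition issue, which is why I suspect the tokenizer/vocab step of the conversion rather than the quantization itself.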
Thanks!
