Tokenizer problems, or just quants?

#105
by Nesy1 - opened

Hi there! I am an avid user of Gemma4, especially its finetunes.

Across multiple finetunes, and even the base model, I notice Gemma outputting stray Korean and Chinese characters. It happens rarely, but often enough that it needs to be addressed. Is this due to a broken template or a bug, or is this a me problem?
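
To put a number on "rarely", something like the sketch below could scan saved outputs for Hangul or CJK characters. The name `find_stray_chars` and the chosen Unicode ranges are my own assumptions (the ranges are not exhaustive), and it only makes sense for chats that are supposed to be in English.

```python
import re

# Hangul (jamo, compatibility jamo, syllables) and the main CJK ideograph
# blocks. This is an illustrative subset, not a complete list of ranges.
STRAY_SCRIPT = re.compile(
    "[\u1100-\u11ff\u3130-\u318f\uac00-\ud7af"  # Hangul
    "\u3400-\u4dbf\u4e00-\u9fff]"               # CJK ideographs
)

def find_stray_chars(text):
    """Return a list of (index, char) for every Hangul/CJK character in text."""
    return [(m.start(), m.group()) for m in STRAY_SCRIPT.finditer(text)]

if __name__ == "__main__":
    sample = "The knight drew his 한 sword."
    print(find_stray_chars(sample))
```

Running this over a batch of generations for each quant or finetune would give a rough hit rate to compare between setups.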

Parameters:

  1. Q4_K_M quantization (imatrix). Mainly bartowski. https://huggingface.co/bartowski/google_gemma-4-31B-it-GGUF
  2. 65K context (F16) (SWA ON)
  3. KoboldCpp backend/front-end, combined with SillyTavern. Chat completions.
  4. Neutralized samplers, temperature 1, but min_p: 0.05 to combat the weird symbols. No rep_pen or DRY.
  5. Finetunes too, where I also see the issue, though rarely: https://huggingface.co/Nimbz/Gemma-4-Gembrain-31B-GGUF
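
For context on why min_p should help with the odd symbols: as it is commonly described, min_p drops every token whose probability is below `min_p` times the probability of the most likely token, then renormalizes. The sketch below illustrates that idea in plain Python. It is not taken from KoboldCpp's source, and `min_p_filter` is a hypothetical name.

```python
import math

def min_p_filter(logits, min_p=0.05):
    """Apply softmax, zero out tokens below min_p * max prob, renormalize."""
    # Numerically stable softmax over the raw logits.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Cutoff scales with the top token's probability.
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]

    # Renormalize the surviving tokens so they sum to 1.
    norm = sum(kept)
    return [p / norm for p in kept]

if __name__ == "__main__":
    # Two plausible tokens and one very unlikely one (e.g. a stray symbol).
    print(min_p_filter([0.0, 0.0, -10.0], min_p=0.05))
```

At temperature 1 with min_p 0.05, a stray token would need at least 5% of the top token's probability to survive the filter, so if the symbols still appear, the model is assigning them non-trivial probability in the first place.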

Hardware parameters you guys could work with:

  1. 5060 Ti (PCIe 4.0 x8) 16GB
  2. 3060 (PCIe 3.0 x4) 12GB
  3. CPU: R5 5600X
  4. RAM: 32GB DDR4

What do you think the issue is? Is it quantization? Or is it user error due to samplers, or simple hardware bugs? Let me know, guys! If you think my params need changing, I'm happy to take suggestions.
