https://huggingface.co/huihui-ai/Huihui-Ministral-3-3B-Instruct-2512-abliterated

#1580
by Cardaun - opened

Here is the exact error, and I think I know why this tokenizer error is happening. They abliterated the quantized model instead of the BF16 variant, which was never meant to be used in llama.cpp. Some tensors are stored in F8_E4M3, so I don't think there is an easy way for us to quantize this, even after fixing the tokenizer issue, until llama.cpp implements support for converting quantized Ministral-3 models.

INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
Traceback (most recent call last):
  File "/llmjob/llama.cpp-nico/convert_hf_to_gguf.py", line 10530, in <module>
    main()
  File "/llmjob/llama.cpp-nico/convert_hf_to_gguf.py", line 10507, in main
    model_instance = model_class(dir_model, output_type, fname_out,
  File "/llmjob/llama.cpp-nico/convert_hf_to_gguf.py", line 2653, in __init__
    self.img_break_tok_id = self.get_token_id("[IMG_BREAK]")
  File "/llmjob/llama.cpp-nico/convert_hf_to_gguf.py", line 2667, in get_token_id
    added_tokens_decoder = json.load(f)['added_tokens_decoder']
KeyError: 'added_tokens_decoder'
job finished, status 1
job-done<0 Huihui-Ministral-3-3B-Instruct-2512-abliterated noquant 1>

error/1 KeyError: 'added_tokens_decoder'
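The traceback shows that convert_hf_to_gguf.py does an unguarded `json.load(f)['added_tokens_decoder']` lookup on the model's tokenizer_config.json, so any model whose config lacks that key dies with this KeyError. A minimal sketch (the function name and CLI handling are my own, not part of llama.cpp) for pre-checking a model before queuing a conversion job:

```python
import json
import sys

def has_added_tokens_decoder(path: str) -> bool:
    """Return True if the tokenizer_config.json at `path` contains the
    'added_tokens_decoder' key that convert_hf_to_gguf.py looks up."""
    with open(path, encoding="utf-8") as f:
        config = json.load(f)
    return "added_tokens_decoder" in config

if __name__ == "__main__":
    # Assumption: first argument is a path to a tokenizer_config.json;
    # defaults to one in the current directory.
    path = sys.argv[1] if len(sys.argv) > 1 else "tokenizer_config.json"
    if has_added_tokens_decoder(path):
        print("tokenizer_config.json has 'added_tokens_decoder'")
    else:
        print("missing 'added_tokens_decoder' -- conversion would fail",
              file=sys.stderr)
        sys.exit(1)
```

Note this only catches the tokenizer problem; the F8_E4M3 tensors would still block quantization even if the key were present.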