Quantization request: RyanMercier/deepseek-roleplay-dippy-7b

#1400
by fengpeisheng1 - opened

It's queued! :D

You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#deepseek-roleplay-dippy-7b-GGUF for quants to appear.

This model unfortunately failed with the following error during SafeTensors to GGUF conversion:

WARNING:hf-to-gguf:Cannot find destination type matching torch.uint8: Using F16
INFO:hf-to-gguf:blk.0.ffn_down.weight,     torch.uint8 --> F16, shape = {1, 33947648}
Traceback (most recent call last):
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 9178, in <module>
    main()
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 9172, in main
    model_instance.write()
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 439, in write
    self.prepare_tensors()
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 300, in prepare_tensors
    for new_name, data_torch in (self.modify_tensors(data_torch, name, bid)):
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 3061, in modify_tensors
    yield from super().modify_tensors(data_torch, name, bid)
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 268, in modify_tensors
    return [(self.map_tensor_name(name), data_torch)]
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 259, in map_tensor_name
    raise ValueError(f"Can not map tensor {name!r}")
ValueError: Can not map tensor 'model.layers.0.mlp.down_proj.weight.absmax'
job finished, status 1
job-done<0 deepseek-roleplay-dippy-7b noquant 1>

error/1 ValueError Can not map tensor '
https://huggingface.co/RyanMercier/deepseek-roleplay-dippy-7b

Thank you. I ran into this problem as well; I didn't expect that you couldn't solve it either.

The error we got means the model you wanted us to quantize is already quantized: a tensor like model.layers.0.mlp.down_proj.weight.absmax should only exist in already-quantized models. It is not possible to GGUF-quantize an already-quantized model. You could ask the model author to upload an unquantized version of this model.
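For anyone who wants to check their own model before submitting it: the tell-tale sign is the presence of auxiliary quantization tensors alongside the regular weights. Here is a minimal sketch (the helper name and suffix list are my own, not part of convert_hf_to_gguf.py) that scans the weight map from model.safetensors.index.json for bitsandbytes-style companion tensors:

```python
# Suffixes that bitsandbytes-style checkpoints attach to each quantized
# weight; a full-precision checkpoint has none of these. The exact list
# is an assumption based on common bitsandbytes serialization.
QUANT_SUFFIXES = (".absmax", ".quant_map")

def find_prequant_tensors(weight_map: dict) -> list:
    """Return tensor names in a safetensors weight map that indicate an
    already-quantized checkpoint (and thus a doomed GGUF conversion)."""
    return [name for name in weight_map
            if name.endswith(QUANT_SUFFIXES) or ".quant_state." in name]

# Example weight map resembling the failing model's index:
index = {
    "model.layers.0.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
    "model.layers.0.mlp.down_proj.weight.absmax": "model-00001-of-00002.safetensors",
}
print(find_prequant_tensors(index))  # ['model.layers.0.mlp.down_proj.weight.absmax']
```

If this returns anything, the repo contains quantized weights and the SafeTensors-to-GGUF conversion will fail the way it did above.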

I just looked at the model and it seems to be quantized similarly to the huge DeepSeek models. I will try to convert it manually using the compilade/convert-prequant branch, as I did today for the huge DeepSeek-V3.1-Terminus model.

I just tried, and no, unfortunately the model author did not use the same quants as big DeepSeek, or any other llama.cpp-compatible quants:

root@AI:/apool/llama.cpp# venv/bin/python convert_hf_to_gguf.py /bpool/deepseek-roleplay-dippy-7b --outtype=f16 --outfile=/mradermacher/tmp/quant/deepseek-roleplay-dippy-7b.gguf
INFO:hf-to-gguf:Loading model: deepseek-roleplay-dippy-7b
INFO:hf-to-gguf:Model architecture: Qwen2ForCausalLM
INFO:hf-to-gguf:gguf: loading model weight map from 'model.safetensors.index.json'
INFO:hf-to-gguf:gguf: indexing model part 'model-00001-of-00002.safetensors'
INFO:hf-to-gguf:gguf: indexing model part 'model-00002-of-00002.safetensors'
Traceback (most recent call last):
  File "/apool/llama.cpp/convert_hf_to_gguf.py", line 9115, in <module>
    main()
  File "/apool/llama.cpp/convert_hf_to_gguf.py", line 9093, in main
    model_instance = model_class(dir_model, output_type, fname_out,
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/apool/llama.cpp/convert_hf_to_gguf.py", line 633, in __init__
    super().__init__(*args, **kwargs)
  File "/apool/llama.cpp/convert_hf_to_gguf.py", line 132, in __init__
    self.dequant_model()
  File "/apool/llama.cpp/convert_hf_to_gguf.py", line 352, in dequant_model
    raise NotImplementedError(f"Quant method is not yet supported: {quant_method!r}")
NotImplementedError: Quant method is not yet supported: 'bitsandbytes'
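The quant method the converter rejects here is declared in the model's config.json, so you can also spot the problem without running the converter at all. A minimal sketch, assuming the standard transformers config.json layout (the example config dict is illustrative, not copied from this model):

```python
def quant_method(config: dict):
    """Return quantization_config.quant_method from a parsed config.json,
    or None if the checkpoint is unquantized full precision."""
    return config.get("quantization_config", {}).get("quant_method")

# Illustrative config resembling a bitsandbytes-quantized Qwen2 checkpoint:
cfg = {
    "architectures": ["Qwen2ForCausalLM"],
    "quantization_config": {"quant_method": "bitsandbytes", "load_in_8bit": True},
}
print(quant_method(cfg))  # bitsandbytes
```

Anything other than None means convert_hf_to_gguf.py has to dequantize first, and 'bitsandbytes' is a method its dequant_model step does not support.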

fengpeisheng1 changed discussion status to closed
