Quantization request: RyanMercier/deepseek-roleplay-dippy-7b
It's queued! :D
You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#deepseek-roleplay-dippy-7b-GGUF for quants to appear.
This model unfortunately failed with the following error during SafeTensors to GGUF conversion:
WARNING:hf-to-gguf:Cannot find destination type matching torch.uint8: Using F16
INFO:hf-to-gguf:blk.0.ffn_down.weight, torch.uint8 --> F16, shape = {1, 33947648}
Traceback (most recent call last):
File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 9178, in <module>
main()
File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 9172, in main
model_instance.write()
File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 439, in write
self.prepare_tensors()
File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 300, in prepare_tensors
for new_name, data_torch in (self.modify_tensors(data_torch, name, bid)):
File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 3061, in modify_tensors
yield from super().modify_tensors(data_torch, name, bid)
File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 268, in modify_tensors
return [(self.map_tensor_name(name), data_torch)]
File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 259, in map_tensor_name
raise ValueError(f"Can not map tensor {name!r}")
ValueError: Can not map tensor 'model.layers.0.mlp.down_proj.weight.absmax'
job finished, status 1
job-done<0 deepseek-roleplay-dippy-7b noquant 1>
error/1 ValueError Can not map tensor '
https://huggingface.co/RyanMercier/deepseek-roleplay-dippy-7b
Thank you. I also ran into this problem. I didn't expect that you couldn't solve it either.
The error we got means that the model you wanted us to quantize is already quantized: model.layers.0.mlp.down_proj.weight.absmax should only exist in already quantized models, and it is not possible to GGUF-quantize a model that is already quantized. You could ask the model author to upload an unquantized version of this model.
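If you want to check a model for this yourself before requesting quants, the bitsandbytes bookkeeping tensors are easy to spot in the safetensors index. A minimal sketch (the path and the exact marker list are assumptions based on how bitsandbytes serializes its quantization state, not part of our actual queue tooling):

```python
# Sketch: flag a checkpoint as pre-quantized if its safetensors index
# contains bitsandbytes bookkeeping tensors such as *.absmax.
import json
from pathlib import Path

def looks_prequantized(model_dir: str) -> bool:
    index_path = Path(model_dir) / "model.safetensors.index.json"
    weight_map = json.loads(index_path.read_text())["weight_map"]
    # Marker list is an assumption based on bitsandbytes' serialization
    # of quantization state alongside the weights.
    suffixes = (".absmax", ".quant_map", ".nested_absmax", ".nested_quant_map")
    return any(
        name.endswith(suffixes) or ".quant_state." in name
        for name in weight_map
    )

print(looks_prequantized("/bpool/deepseek-roleplay-dippy-7b"))
```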
I just looked at the model and it seems to be quantized similarly to the huge DeepSeek models. I will try to manually convert it using the compilade/convert-prequant branch, as I did today for the huge DeepSeek-V3.1-Terminus model.
I just tried, and no, unfortunately the model author did not use the same quants as big DeepSeek or any other llama.cpp-compatible quants:
root@AI:/apool/llama.cpp# venv/bin/python convert_hf_to_gguf.py /bpool/deepseek-roleplay-dippy-7b --outtype=f16 --outfile=/mradermacher/tmp/quant/deepseek-roleplay-dippy-7b.gguf
INFO:hf-to-gguf:Loading model: deepseek-roleplay-dippy-7b
INFO:hf-to-gguf:Model architecture: Qwen2ForCausalLM
INFO:hf-to-gguf:gguf: loading model weight map from 'model.safetensors.index.json'
INFO:hf-to-gguf:gguf: indexing model part 'model-00001-of-00002.safetensors'
INFO:hf-to-gguf:gguf: indexing model part 'model-00002-of-00002.safetensors'
Traceback (most recent call last):
File "/apool/llama.cpp/convert_hf_to_gguf.py", line 9115, in <module>
main()
File "/apool/llama.cpp/convert_hf_to_gguf.py", line 9093, in main
model_instance = model_class(dir_model, output_type, fname_out,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/apool/llama.cpp/convert_hf_to_gguf.py", line 633, in __init__
super().__init__(*args, **kwargs)
File "/apool/llama.cpp/convert_hf_to_gguf.py", line 132, in __init__
self.dequant_model()
File "/apool/llama.cpp/convert_hf_to_gguf.py", line 352, in dequant_model
raise NotImplementedError(f"Quant method is not yet supported: {quant_method!r}")
NotImplementedError: Quant method is not yet supported: 'bitsandbytes'
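If someone really wants GGUFs of this model, a possible workaround is to dequantize the checkpoint back to fp16 with transformers and convert the result. A rough sketch, not tested on this model (model.dequantize() only exists in recent transformers releases and needs a working bitsandbytes install; the output path is a placeholder):

```python
# Rough sketch: load the bitsandbytes-quantized checkpoint, undo the
# quantization, and save a plain fp16 copy for convert_hf_to_gguf.py.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

src = "RyanMercier/deepseek-roleplay-dippy-7b"
dst = "/bpool/deepseek-roleplay-dippy-7b-f16"  # placeholder output dir

model = AutoModelForCausalLM.from_pretrained(src, torch_dtype=torch.float16)
model = model.dequantize()  # undoes the bitsandbytes 4/8-bit quantization
model.save_pretrained(dst, safe_serialization=True)
AutoTokenizer.from_pretrained(src).save_pretrained(dst)
```

The resulting directory should then pass convert_hf_to_gguf.py as a normal unquantized Qwen2ForCausalLM model, though quality will be limited by the precision the author already threw away.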