Model Request
It's queued! :D
You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#gemma-3n-E4B-persian-GGUF for quants to appear.
This model is already quantized (bitsandbytes, with weights stored as uint8), so it can't be converted into a GGUF:
WARNING:hf-to-gguf:ignore token 262399: id is out of range, max=262143
INFO:hf-to-gguf:token_embd.weight, torch.float16 --> F16, shape = {2048, 262144}
INFO:hf-to-gguf:blk.0.altup_correct_scale.weight, torch.float16 --> F32, shape = {2048}
INFO:hf-to-gguf:blk.0.altup_correct_coef.weight, torch.float16 --> F32, shape = {4, 4}
INFO:hf-to-gguf:blk.0.altup_router.weight, torch.float16 --> F16, shape = {2048, 4}
INFO:hf-to-gguf:blk.0.altup_predict_coef.weight, torch.float16 --> F32, shape = {4, 16}
INFO:hf-to-gguf:blk.0.altup_router_norm.weight, torch.float32 --> F32, shape = {2048}
INFO:hf-to-gguf:blk.0.attn_norm.weight, torch.float32 --> F32, shape = {2048}
WARNING:hf-to-gguf:Cannot find destination type matching torch.uint8: Using F16
INFO:hf-to-gguf:blk.0.laurel_l.weight, torch.uint8 --> F16, shape = {1, 65536}
Traceback (most recent call last):
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 8833, in <module>
    main()
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 8827, in main
    model_instance.write()
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 441, in write
    self.prepare_tensors()
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 298, in prepare_tensors
    for new_name, data_torch in (self.modify_tensors(data_torch, name, bid)):
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 5210, in modify_tensors
    return super().modify_tensors(data_torch, name, bid)
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 5064, in modify_tensors
    return [(self.map_tensor_name(name), data_torch)]
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 257, in map_tensor_name
    raise ValueError(f"Can not map tensor {name!r}")
ValueError: Can not map tensor 'model.layers.0.laurel.linear_left.weight.absmax'
job finished, status 1
job-done<0 gemma-3n-E4B-persian noquant 1>
error/1 ValueError Can not map tensor '
https://huggingface.co/mshojaei77/gemma-3n-E4B-persian
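For context on the ValueError: bitsandbytes nf4 serializes each quantized weight as a packed uint8 tensor plus separate quant-state tensors (absmax, quant_map, and so on), and the converter's tensor-name map has no entry for names like `model.layers.0.laurel.linear_left.weight.absmax`, so it bails out. A quick way to see those extra tensors in a checkpoint is below; a minimal sketch, assuming safetensors is installed and a single-file model.safetensors (sharded repos need a loop over the shards):

```python
# Minimal sketch: list the bitsandbytes quant-state tensors that
# convert_hf_to_gguf.py cannot map. Assumes a single-file checkpoint
# named model.safetensors; sharded repos need a loop over the shards.
from safetensors import safe_open

with safe_open("model.safetensors", framework="pt") as f:
    for name in f.keys():
        # bnb nf4 stores packed uint8 weights plus per-block quant state,
        # e.g. ...weight.absmax and ...weight.quant_map
        if ".absmax" in name or ".quant_map" in name or "quant_state" in name:
            print(name)
```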
I just checked the model's config.json, and it is indeed already bitsandbytes-quantized:
"quantization_config": {
"bnb_4bit_compute_dtype": "float16",
"bnb_4bit_quant_type": "nf4",
"bnb_4bit_use_double_quant": true,
"llm_int8_enable_fp32_cpu_offload": false,
"llm_int8_has_fp16_weight": false,
"llm_int8_skip_modules": null,
"llm_int8_threshold": 6.0,
"load_in_4bit": true,
"load_in_8bit": false,
"quant_method": "bitsandbytes"
},
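For anyone who wants to run the same check before requesting quants, here is a minimal sketch that fetches a repo's config.json via huggingface_hub and looks for a quantization_config block:

```python
# Minimal sketch of the check above: download the repo's config.json and
# look for a quantization_config block before attempting GGUF conversion.
import json
from huggingface_hub import hf_hub_download

path = hf_hub_download("mshojaei77/gemma-3n-E4B-persian", "config.json")
with open(path) as f:
    cfg = json.load(f)

qc = cfg.get("quantization_config")
if qc is not None:
    print(f"already quantized: method={qc.get('quant_method')}, "
          f"4-bit={qc.get('load_in_4bit')}, type={qc.get('bnb_4bit_quant_type')}")
```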
I noticed that you are the author of this model, so maybe it would be possible to upload an unquantized version so that it can be converted into GGUF?
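In case it helps, the sketch below shows roughly what that re-upload could look like. It is a hedged sketch rather than a tested recipe: it assumes a recent transformers (which has PreTrainedModel.dequantize()) plus accelerate, bitsandbytes and a GPU to load the nf4 checkpoint, and that the model loads via AutoModelForCausalLM (a multimodal Gemma 3n fine-tune may need a different Auto class); the "-fp16" target repo name is hypothetical.

```python
# Hedged sketch of producing the unquantized upload; not tested against this
# exact repo. Assumes recent transformers (PreTrainedModel.dequantize()),
# accelerate, bitsandbytes plus a GPU for loading the nf4 checkpoint, and
# that the model loads via AutoModelForCausalLM (a multimodal Gemma 3n
# fine-tune may need a different Auto class).
from transformers import AutoModelForCausalLM, AutoTokenizer

src = "mshojaei77/gemma-3n-E4B-persian"
dst = "mshojaei77/gemma-3n-E4B-persian-fp16"  # hypothetical target repo

model = AutoModelForCausalLM.from_pretrained(src, device_map="auto")
model = model.dequantize()  # undo the bitsandbytes nf4 quantization
model.push_to_hub(dst)      # plain weights, convertible to GGUF
AutoTokenizer.from_pretrained(src).push_to_hub(dst)
```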
Thank you for checking it out, sure, I will do that.
Great. Please let me know once you have done so and we will quantize your model.