aifeifei798/Gemma-4-31B-Cognitive-Unshackled

#2129
by aifeifei798 - opened

It's queued!

You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#Gemma-4-31B-Cognitive-Unshackled-GGUF for quants to appear.

My local GGUF conversion test:

-2000 63 si Gemma-4-31B-Cognitive-Unshackled error/1 AttributeError 'list' object has no
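For context, that truncated "'list' object has no attribute ..." failure class typically means a field in the model's config.json held a list where the conversion script expected a dict-like object. A minimal sketch of the pattern (the field name below is purely illustrative, not the actual culprit here):

```python
# Illustrative only: a config value that is a list where dict-style access
# was expected. "rope_scaling" is a hypothetical stand-in field name.
config = {"rope_scaling": [{"type": "linear"}]}  # list where a dict was expected

try:
    config["rope_scaling"].get("type")  # dict-style access on a list
except AttributeError as err:
    print(err)  # 'list' object has no attribute 'get'
```

Fixes for this class of error usually land in convert_hf_to_gguf.py upstream, which is why retrying on a newer llama.cpp commit often resolves it.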


python ../llama.cpp/convert_hf_to_gguf.py Gemma-4-31B-Cognitive-Unshackled/
INFO:hf-to-gguf:Loading model: Gemma-4-31B-Cognitive-Unshackled
INFO:hf-to-gguf:Model architecture: Gemma4ForConditionalGeneration
INFO:hf-to-gguf:gguf: loading model weight map from 'model.safetensors.index.json'
INFO:hf-to-gguf:gguf: indexing model part 'model-00001-of-00065.safetensors'
INFO:hf-to-gguf:gguf: indexing model part 'model-00002-of-00065.safetensors'
INFO:hf-to-gguf:gguf: indexing model part 'model-00003-of-00065.safetensors'
...

INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:Gemma-4-31B-Cognitive-Unshackled/Gemma-4-31B-Cognitive-Unshackled-BF16.gguf: n_tensors = 833, total_size = 61.4G
Writing: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 61.4G/61.4G [02:48<00:00, 365Mbyte/s]
INFO:hf-to-gguf:Model successfully exported to Gemma-4-31B-Cognitive-Unshackled/Gemma-4-31B-Cognitive-Unshackled-BF16.gguf
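A quick way to sanity-check the exported file without loading it: GGUF containers begin with the 4-byte magic b"GGUF" followed by a little-endian uint32 format version. A minimal sketch using only the stdlib (the path below is a tiny synthetic stand-in, not the real 61.4G output):

```python
import struct

def looks_like_gguf(path):
    # GGUF header: 4-byte magic "GGUF", then little-endian uint32 version.
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            return False
        (version,) = struct.unpack("<I", f.read(4))
        return version >= 1

# Demonstrate against a tiny synthetic header (version 3):
with open("fake.gguf", "wb") as f:
    f.write(b"GGUF" + struct.pack("<I", 3))

print(looks_like_gguf("fake.gguf"))  # True
```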


python ../llama.cpp/convert_hf_to_gguf.py Gemma-4-31B-Cognitive-Unshackled/ --mmproj
INFO:hf-to-gguf:Loading model: Gemma-4-31B-Cognitive-Unshackled
INFO:hf-to-gguf:Model architecture: Gemma4ForConditionalGeneration
INFO:hf-to-gguf:gguf: loading model weight map from 'model.safetensors.index.json'
INFO:hf-to-gguf:gguf: indexing model part 'model-00001-of-00065.safetensors'
INFO:hf-to-gguf:gguf: indexing model part 'model-00002-of-00065.safetensors'
...
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:Set model quantization version
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:Gemma-4-31B-Cognitive-Unshackled/mmproj-Gemma-4-31b-Cognitive-Unshackled-BF16.gguf: n_tensors = 356, total_size = 1.2G
Writing: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.20G/1.20G [00:02<00:00, 460Mbyte/s]
INFO:hf-to-gguf:Model successfully exported to Gemma-4-31B-Cognitive-Unshackled/mmproj-Gemma-4-31b-Cognitive-Unshackled-BF16.gguf

llama.cpp$ git log -1
commit d006858316d4650bb4da0c6923294ccd741caefd (HEAD -> master, tag: b8660, origin/master, origin/HEAD)
Author: Reese Levine <reeselevine1@gmail.com>
Date: Fri Apr 3 11:40:14 2026 -0700

ggml-webgpu: move from parameter buffer pool to single buffer with offsets (#21278)

* Work towards removing bitcast

* Move rest of existing types over

* Add timeout back to wait and remove synchronous set_tensor/memset_tensor

* move to unpackf16 for wider compatibility

* cleanup

* Remove deadlock condition in free_bufs

* Start work on removing parameter buffer pools

* Simplify and optimize further

* simplify profile futures

* Fix stride

* Try using a single command buffer per batch

* formatting

transformers==5.5.0

Yeah, sometimes issues happen because we are not always on the latest version; I will retry the models soon with the latest llama.cpp.
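Version drift like this can be flagged early with a small stdlib check. A hedged sketch (check_pin is a hypothetical helper; the pin matches the transformers==5.5.0 noted above):

```python
from importlib import metadata

def check_pin(package, expected):
    # Compare the locally installed version against the pin a conversion
    # was validated with; always returns a human-readable string.
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package} not installed"
    if installed == expected:
        return f"{package}=={installed} matches the validated pin"
    return f"{package}=={installed} differs from validated {expected}"

print(check_pin("transformers", "5.5.0"))
```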
