aifeifei798/Gemma-4-31B-Cognitive-Unshackled
It's queued!
You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#Gemma-4-31B-Cognitive-Unshackled-GGUF for quants to appear.
My local GGUF conversion test:
Queue status (the AttributeError message is truncated in the status display):
-2000 63 si Gemma-4-31B-Cognitive-Unshackled error/1 AttributeError 'list' object has no
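The truncated AttributeError above usually means some converter code did attribute or method access on a config field that is a list in this model's config, where older code expected a scalar. A hypothetical minimal illustration (this is not the actual failing code; the field name and values are made up for the demo):

```python
# Hypothetical illustration of the "'list' object has no ..." class of error:
# newer HF configs sometimes store a list (e.g. several end-of-sequence ids)
# where older converter code expects a single scalar value.
config = {"eos_token_id": [1, 106]}  # list-valued field (assumption for the demo)

def read_eos_naive(cfg):
    # Older code assuming a scalar int would call int methods directly;
    # on a list this raises AttributeError.
    return cfg["eos_token_id"].bit_length()

def read_eos_robust(cfg):
    # Defensive handling: accept either a scalar or a list of ids.
    value = cfg["eos_token_id"]
    return value if isinstance(value, list) else [value]

try:
    read_eos_naive(config)
except AttributeError as e:
    print(type(e).__name__)  # AttributeError

print(read_eos_robust(config))  # [1, 106]
```

When this happens in convert_hf_to_gguf.py, the usual fix is an updated converter that handles the list-valued field, which is why retrying on the latest llama.cpp often resolves it.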
python ../llama.cpp/convert_hf_to_gguf.py Gemma-4-31B-Cognitive-Unshackled/
INFO:hf-to-gguf:Loading model: Gemma-4-31B-Cognitive-Unshackled
INFO:hf-to-gguf:Model architecture: Gemma4ForConditionalGeneration
INFO:hf-to-gguf:gguf: loading model weight map from 'model.safetensors.index.json'
INFO:hf-to-gguf:gguf: indexing model part 'model-00001-of-00065.safetensors'
INFO:hf-to-gguf:gguf: indexing model part 'model-00002-of-00065.safetensors'
INFO:hf-to-gguf:gguf: indexing model part 'model-00003-of-00065.safetensors'
...
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:Gemma-4-31B-Cognitive-Unshackled/Gemma-4-31B-Cognitive-Unshackled-BF16.gguf: n_tensors = 833, total_size = 61.4G
Writing: 100%|████████████████████████████████████████████████████████████████████████| 61.4G/61.4G [02:48<00:00, 365Mbyte/s]
INFO:hf-to-gguf:Model successfully exported to Gemma-4-31B-Cognitive-Unshackled/Gemma-4-31B-Cognitive-Unshackled-BF16.gguf
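As a quick sanity check on the exported file, the GGUF header can be read with only the stdlib. This sketch assumes the documented GGUF v3 layout (4-byte magic "GGUF", then little-endian uint32 version, uint64 tensor count, uint64 metadata-KV count); the demo writes a synthetic header with the tensor count from the log above, and the KV count is made up:

```python
# Minimal stdlib sanity check of a GGUF file's header.
import struct

def read_gguf_header(path):
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file: magic={magic!r}")
        # little-endian: uint32 version, uint64 tensor count, uint64 metadata-KV count
        version, n_tensors, n_kv = struct.unpack("<IQQ", f.read(20))
        return {"version": version, "n_tensors": n_tensors, "n_kv": n_kv}

# Demo on a synthetic header (833 tensors, matching the log above; 40 KVs is made up):
with open("demo.gguf", "wb") as f:
    f.write(b"GGUF" + struct.pack("<IQQ", 3, 833, 40))

print(read_gguf_header("demo.gguf"))
```

On the real export you would point `read_gguf_header` at the BF16 file and expect `n_tensors = 833`, matching the writer log.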
python ../llama.cpp/convert_hf_to_gguf.py Gemma-4-31B-Cognitive-Unshackled/ --mmproj
INFO:hf-to-gguf:Loading model: Gemma-4-31B-Cognitive-Unshackled
INFO:hf-to-gguf:Model architecture: Gemma4ForConditionalGeneration
INFO:hf-to-gguf:gguf: loading model weight map from 'model.safetensors.index.json'
INFO:hf-to-gguf:gguf: indexing model part 'model-00001-of-00065.safetensors'
INFO:hf-to-gguf:gguf: indexing model part 'model-00002-of-00065.safetensors'
...
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:Set model quantization version
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:Gemma-4-31B-Cognitive-Unshackled/mmproj-Gemma-4-31b-Cognitive-Unshackled-BF16.gguf: n_tensors = 356, total_size = 1.2G
Writing: 100%|████████████████████████████████████████████████████████████████████████| 1.20G/1.20G [00:02<00:00, 460Mbyte/s]
INFO:hf-to-gguf:Model successfully exported to Gemma-4-31B-Cognitive-Unshackled/mmproj-Gemma-4-31b-Cognitive-Unshackled-BF16.gguf
llama.cpp$ git log -1
commit d006858316d4650bb4da0c6923294ccd741caefd (HEAD -> master, tag: b8660, origin/master, origin/HEAD)
Author: Reese Levine <reeselevine1@gmail.com>
Date: Fri Apr 3 11:40:14 2026 -0700
ggml-webgpu: move from parameter buffer pool to single buffer with offsets (#21278)
* Work towards removing bitcast
* Move rest of existing types over
* Add timeout back to wait and remove synchronous set_tensor/memset_tensor
* move to unpackf16 for wider compatibility
* cleanup
* Remove deadlock condition in free_bufs
* Start work on removing parameter buffer pools
* Simplify and optimize further
* simplify profile futures
* Fix stride
* Try using a single command buffer per batch
* formatting
transformers==5.5.0
Yeah, sometimes issues happen because we are not always on the latest version; I will retry the models soon with the latest llama.cpp.