AssertionError: assert module.weight.shape[1] == 1 in fix_4bit_weight_quant_state_from_module during first inference step
#2
by KeizoMiyazawa - opened
Hi,
I'm getting a consistent error during the first inference step with the NF4 Instruct-Distil v2 model. The model loads successfully (70 to 80 s, 46.7 GB on GPU), but generation fails at step 0.
Environment:
- GPU: NVIDIA RTX PRO 5000 Blackwell, 48GB VRAM
- ComfyUI: 0.15.1
- PyTorch: 2.10.0+cu130
- Python: 3.13.11
- bitsandbytes: 0.48.2
- transformers: 5.2.0
- accelerate: 1.12.0
- Model: `HunyuanImage-3.0-Instruct-Distil-NF4-v2`
- Node: `Comfy_HunyuanImage3` (latest, `git pull` confirmed up to date)
Error:
UserWarning: FP4 quantization state not initialized. Please call .cuda() or .to(device) on the LinearFP4 layer first.
AssertionError: assert module.weight.shape[1] == 1
File "bitsandbytes/nn/modules.py", line 407, in fix_4bit_weight_quant_state_from_module
Stack trace path: generate_image → pipeline → model forward → decoder_layer → mlp → shared_mlp → gate_and_up_proj → fix_4bit_weight_quant_state_from_module
What I've tried:
- Isolated to a single GPU with `CUDA_VISIBLE_DEVICES=0`
- Confirmed `bnb_4bit_quant_type: nf4` and `bnb_4bit_quant_storage: uint8` in config.json
- Confirmed `shared_mlp` is in `llm_int8_skip_modules`
- Rolled back bitsandbytes to 0.48.2 (was 0.49.2)
- Cleared HuggingFace modules cache
- Confirmed no conflicting custom nodes
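For reference, the config checks above can be reproduced with a short script. This is only a sketch: it assumes the quantization settings sit under a `quantization_config` key in config.json (the usual layout for a saved bitsandbytes checkpoint), and `summarize_quant_config` is a hypothetical helper name, not part of any library.

```python
import json
from pathlib import Path


def summarize_quant_config(cfg, module_name="shared_mlp"):
    """Summarize the bitsandbytes settings found in a parsed config.json.

    Assumption: the settings live under the "quantization_config" key.
    """
    quant = cfg.get("quantization_config") or {}
    skip_list = quant.get("llm_int8_skip_modules") or []
    return {
        "quant_type": quant.get("bnb_4bit_quant_type"),
        "quant_storage": quant.get("bnb_4bit_quant_storage"),
        # True if any skip entry mentions the module name (substring match).
        "skips_module": any(module_name in entry for entry in skip_list),
    }


def load_config(path):
    """Read and parse a config.json file from disk."""
    return json.loads(Path(path).read_text())
```

Calling `summarize_quant_config(load_config("config.json"))` from the model directory should report `nf4`, `uint8`, and `True` if the settings match what is listed above.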
Any idea what's causing shared_mlp.gate_and_up_proj to have an uninitialized quant state?
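In case it helps narrow this down, here is a rough diagnostic sketch for listing which 4-bit layers are missing their quant state after loading. It is duck-typed and makes some assumptions: it identifies bitsandbytes layers by class name (`Linear4bit`, `LinearNF4`, `LinearFP4`) and looks for a `quant_state` attribute on either the layer or its weight. `find_uninitialized_4bit` is a hypothetical helper, not a bitsandbytes function.

```python
# Class names used to spot 4-bit layers without importing bitsandbytes.
FOUR_BIT_CLASS_NAMES = {"Linear4bit", "LinearNF4", "LinearFP4"}


def find_uninitialized_4bit(named_modules):
    """Return (name, weight_shape) for 4-bit layers with no quant state.

    `named_modules` is any iterable of (name, module) pairs, for example
    the result of `model.named_modules()` on the loaded model.
    """
    suspects = []
    for name, module in named_modules:
        if type(module).__name__ not in FOUR_BIT_CLASS_NAMES:
            continue  # not a 4-bit layer, ignore
        weight = getattr(module, "weight", None)
        if weight is None:
            continue
        weight_state = getattr(weight, "quant_state", None)
        module_state = getattr(module, "quant_state", None)
        if weight_state is None and module_state is None:
            # Record the shape too: the failing assertion expects shape[1] == 1.
            suspects.append((name, tuple(weight.shape)))
    return suspects
```

Running `find_uninitialized_4bit(model.named_modules())` right after loading (where `model` is the loaded model object) would show whether `shared_mlp.gate_and_up_proj` is the only affected layer, and whether its weight has the packed shape that the assertion expects.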
Thanks!