---
license: mit
base_model:
- zai-org/GLM-Image
base_model_relation: quantized
library_name: diffusers
tags:
- sdnq
- 4-bit
- glm-image
---
Dynamic 4-bit quantization of [zai-org/GLM-Image](https://huggingface.co/zai-org/GLM-Image) using [SDNQ](https://github.com/vladmandic/sdnext/wiki/SDNQ-Quantization).

This model uses fine-grained, per-layer quantization.
The dtype for each layer is selected dynamically by trial and error until the std-normalized MSE loss is lower than the selected threshold.

The minimum allowed dtype is set to uint4 and the std-normalized MSE loss threshold is set to 1e-2.
This produced a mixed-precision model with uint4, int5 and float5_e3m1fn dtypes.
SVD quantization is disabled.
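
The per-layer selection described above can be sketched roughly as follows. This is an illustrative, pure-Python sketch and not SDNQ's actual implementation: `fake_quantize`, `std_normalized_mse`, `pick_bits` and the candidate list are hypothetical names, the loss is assumed to be the MSE after dividing both tensors by the standard deviation of the original weights, and plain integer grids stand in for the real uint4, int5 and float5_e3m1fn dtypes.

```python
import random
import statistics


def fake_quantize(weights, bits):
    """Round-trip weights through an asymmetric uniform grid with 2**bits levels."""
    lo, hi = min(weights), max(weights)
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    return [round((w - lo) / scale) * scale + lo for w in weights]


def std_normalized_mse(original, restored):
    """MSE after dividing both tensors by the std of the original (assumed definition)."""
    std = statistics.pstdev(original) or 1.0
    return statistics.fmean(((o - r) / std) ** 2 for o, r in zip(original, restored))


def pick_bits(weights, threshold=1e-2, candidates=(4, 5, 6, 7, 8)):
    """Try candidates from narrowest to widest; return the first one under the threshold."""
    for bits in candidates:
        loss = std_normalized_mse(weights, fake_quantize(weights, bits))
        if loss < threshold:
            return bits, loss
    return None, None  # no candidate was accurate enough; the caller decides the fallback


random.seed(0)
layers = {
    "smooth": [random.gauss(0.0, 1.0) for _ in range(4096)],
    # A single large outlier stretches the grid and hurts low bit widths.
    "outliers": [random.gauss(0.0, 1.0) for _ in range(4095)] + [40.0],
}
for name, weights in layers.items():
    print(name, pick_bits(weights))
```

Starting from the minimum allowed width and widening only when the loss is too high keeps most layers at the smallest dtype and spends extra bits only on the layers that need them.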
Usage:

```
pip install sdnq
```

```py
import torch
import diffusers
from sdnq import SDNQConfig # import sdnq to register it into diffusers and transformers
from sdnq.common import use_torch_compile as triton_is_available
from sdnq.loader import apply_sdnq_options_to_model

pipe = diffusers.GlmImagePipeline.from_pretrained("Disty0/GLM-Image-SDNQ-4bit-dynamic", torch_dtype=torch.bfloat16)

# Enable INT8 MatMul for AMD, Intel ARC and Nvidia GPUs:
if triton_is_available and (torch.cuda.is_available() or torch.xpu.is_available()):
    pipe.transformer = apply_sdnq_options_to_model(pipe.transformer, use_quantized_matmul=True)
    # pipe.transformer = torch.compile(pipe.transformer) # optional for faster speeds

pipe.enable_model_cpu_offload()

prompt = "A beautifully designed modern food magazine style dessert recipe illustration, themed around a raspberry mousse cake. The overall layout is clean and bright, divided into four main areas: the top left features a bold black title 'Raspberry Mousse Cake Recipe Guide', with a soft-lit close-up photo of the finished cake on the right, showcasing a light pink cake adorned with fresh raspberries and mint leaves; the bottom left contains an ingredient list section, titled 'Ingredients' in a simple font, listing 'Flour 150g', 'Eggs 3', 'Sugar 120g', 'Raspberry puree 200g', 'Gelatin sheets 10g', 'Whipping cream 300ml', and 'Fresh raspberries', each accompanied by minimalist line icons (like a flour bag, eggs, sugar jar, etc.); the bottom right displays four equally sized step boxes, each containing high-definition macro photos and corresponding instructions, arranged from top to bottom as follows: Step 1 shows a whisk whipping white foam (with the instruction 'Whip egg whites to stiff peaks'), Step 2 shows a red-and-white mixture being folded with a spatula (with the instruction 'Gently fold in the puree and batter'), Step 3 shows pink liquid being poured into a round mold (with the instruction 'Pour into mold and chill for 4 hours'), Step 4 shows the finished cake decorated with raspberries and mint leaves (with the instruction 'Decorate with raspberries and mint'); a light brown information bar runs along the bottom edge, with icons on the left representing 'Preparation time: 30 minutes', 'Cooking time: 20 minutes', and 'Servings: 8'. The overall color scheme is dominated by creamy white and light pink, with a subtle paper texture in the background, featuring compact and orderly text and image layout with clear information hierarchy."
image = pipe(
    prompt=prompt,
    height=32 * 32,
    width=36 * 32,
    num_inference_steps=50,
    guidance_scale=1.5,
    generator=torch.manual_seed(42),
).images[0]
image.save("output_t2i-sdnq.png")
```