Disty0/GLM-Image-SDNQ-4bit-dynamic

+---
+license: mit
+base_model:
+- zai-org/GLM-Image
+base_model_relation: quantized
+library_name: diffusers
+tags:
+- sdnq
+- 4-bit
+- glm-image
+---
+Dynamic 4-bit quantization of [zai-org/GLM-Image](https://huggingface.co/zai-org/GLM-Image) using [SDNQ](https://github.com/vladmandic/sdnext/wiki/SDNQ-Quantization).
+This model uses per-layer, fine-grained quantization.
+The dtype for each layer is selected dynamically by trial and error: dtypes are tried until the std-normalized MSE loss falls below the selected threshold.
+The minimum allowed dtype is set to uint4 and the std-normalized MSE loss threshold is set to 1e-2.
+This produced a mixed-precision model with uint4, int5 and float5_e3m1fn dtypes.
+SVD quantization is disabled.
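
The per-layer selection rule described above can be sketched roughly as follows. This is a minimal illustration, not SDNQ's implementation: `quantize_dequantize`, `std_normalized_mse` and `pick_bits` are hypothetical helpers, the candidates are plain unsigned integer widths rather than SDNQ's actual dtype list, and computing the loss as the MSE of std-scaled weights is an assumption about what "std normalized" means.

```python
import random
import statistics


def quantize_dequantize(weights, bits):
    """Round-trip weights through asymmetric uniform quantization (assumed scheme)."""
    lo, hi = min(weights), max(weights)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels or 1.0  # constant layers would otherwise give scale 0
    return [round((w - lo) / scale) * scale + lo for w in weights]


def std_normalized_mse(weights, restored):
    """MSE of the round-trip error after dividing it by the std of the original weights."""
    std = statistics.pstdev(weights) or 1.0
    return sum(((w - r) / std) ** 2 for w, r in zip(weights, restored)) / len(weights)


def pick_bits(weights, candidates=(4, 5, 6, 7, 8), threshold=1e-2):
    """Try bit widths from smallest to largest; stop at the first one below the threshold."""
    for bits in candidates:
        loss = std_normalized_mse(weights, quantize_dequantize(weights, bits))
        if loss < threshold:
            break
    # If no candidate met the threshold, this falls back to the widest one (a choice of this sketch).
    return bits, loss


random.seed(0)
layers = {
    "uniform": [random.uniform(-1.0, 1.0) for _ in range(4096)],
    "gaussian": [random.gauss(0.0, 1.0) for _ in range(4096)],
}
for name, weights in layers.items():
    bits, loss = pick_bits(weights)
    print(f"{name}: {bits} bits, loss={loss:.4f}")
```

In this toy setup the uniformly distributed layer stays at the 4-bit minimum, while the heavier-tailed Gaussian layer needs a wider width to get under the 1e-2 threshold, which is how a single threshold can lead to a mixed-precision model.
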
+Usage:
+```sh
+pip install sdnq
+```
+```py
+import torch
+import diffusers
+from sdnq import SDNQConfig # import sdnq to register it into diffusers and transformers
+from sdnq.common import use_torch_compile as triton_is_available
+from sdnq.loader import apply_sdnq_options_to_model
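+# Load the quantized checkpoint from this repository rather than the unquantized base model
+pipe = diffusers.GlmImagePipeline.from_pretrained("Disty0/GLM-Image-SDNQ-4bit-dynamic", torch_dtype=torch.bfloat16, device_map="cuda")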
+# Enable INT8 MatMul for AMD, Intel ARC and Nvidia GPUs:
+if triton_is_available and (torch.cuda.is_available() or torch.xpu.is_available()):
+    pipe.transformer = apply_sdnq_options_to_model(pipe.transformer, use_quantized_matmul=True)
+    # pipe.transformer = torch.compile(pipe.transformer) # optional for faster speeds
+prompt = "A beautifully designed modern food magazine style dessert recipe illustration, themed around a raspberry mousse cake. The overall layout is clean and bright, divided into four main areas: the top left features a bold black title 'Raspberry Mousse Cake Recipe Guide', with a soft-lit close-up photo of the finished cake on the right, showcasing a light pink cake adorned with fresh raspberries and mint leaves; the bottom left contains an ingredient list section, titled 'Ingredients' in a simple font, listing 'Flour 150g', 'Eggs 3', 'Sugar 120g', 'Raspberry puree 200g', 'Gelatin sheets 10g', 'Whipping cream 300ml', and 'Fresh raspberries', each accompanied by minimalist line icons (like a flour bag, eggs, sugar jar, etc.); the bottom right displays four equally sized step boxes, each containing high-definition macro photos and corresponding instructions, arranged from top to bottom as follows: Step 1 shows a whisk whipping white foam (with the instruction 'Whip egg whites to stiff peaks'), Step 2 shows a red-and-white mixture being folded with a spatula (with the instruction 'Gently fold in the puree and batter'), Step 3 shows pink liquid being poured into a round mold (with the instruction 'Pour into mold and chill for 4 hours'), Step 4 shows the finished cake decorated with raspberries and mint leaves (with the instruction 'Decorate with raspberries and mint'); a light brown information bar runs along the bottom edge, with icons on the left representing 'Preparation time: 30 minutes', 'Cooking time: 20 minutes', and 'Servings: 8'. The overall color scheme is dominated by creamy white and light pink, with a subtle paper texture in the background, featuring compact and orderly text and image layout with clear information hierarchy."
+image = pipe(
+    prompt=prompt,
+    height=32 * 32,
+    width=36 * 32,
+    num_inference_steps=50,
+    guidance_scale=1.5,
+    generator=torch.manual_seed(42),
+).images[0]
+image.save("output_t2i-sdnq.png")
+```