Disty0/GLM-Image-SDNQ-4bit-dynamic

	---
	license: mit
	base_model:
	- zai-org/GLM-Image
	base_model_relation: quantized
	library_name: diffusers
	tags:
	- sdnq
	- 4-bit
	- glm-image
	---
	Dynamic 4-bit quantization of [zai-org/GLM-Image](https://huggingface.co/zai-org/GLM-Image) using [SDNQ](https://github.com/vladmandic/sdnext/wiki/SDNQ-Quantization).

	This model uses per-layer, fine-grained quantization.
	The dtype for each layer is selected dynamically by trial and error: candidate dtypes are tried until the std-normalized MSE loss falls below the selected threshold.

	The minimum allowed dtype is set to uint4 and the std-normalized MSE loss threshold is set to 1e-2.
	This produced a mixed-precision model with uint4, int5 and float5_e3m1fn dtypes.
	SVD quantization is disabled.
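
The selection rule described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not SDNQ's implementation. The helper names (`quantize_dequantize`, `std_normalized_mse`, `pick_bits`) are hypothetical. The candidates are plain unsigned integer widths with a single min/max scale per tensor, so the float5_e3m1fn format and any finer-grained grouping are not modeled. "Std normalized MSE" is assumed here to mean the mean squared error between the original and dequantized weights after dividing both by the standard deviation of the original weights.

```python
import statistics


def quantize_dequantize(weights, bits):
    """Asymmetric uniform quantization to `bits` bits, then back to floats."""
    lo, hi = min(weights), max(weights)
    levels = (1 << bits) - 1
    # Fall back to a scale of 1.0 for constant tensors to avoid dividing by zero.
    scale = (hi - lo) / levels or 1.0
    return [round((w - lo) / scale) * scale + lo for w in weights]


def std_normalized_mse(weights, restored):
    """MSE between original and restored weights, both divided by std(weights).

    Assumption: this is one plausible reading of "std normalized mse loss".
    """
    std = statistics.pstdev(weights) or 1.0
    return statistics.fmean(((w - r) / std) ** 2 for w, r in zip(weights, restored))


def pick_bits(weights, candidates=(4, 5, 6, 7, 8), threshold=1e-2):
    """Try bit widths from smallest to largest; keep the first one under the threshold."""
    for bits in candidates:
        loss = std_normalized_mse(weights, quantize_dequantize(weights, bits))
        if loss < threshold:
            return bits, loss
    # No candidate met the threshold: fall back to the widest one tried.
    return candidates[-1], loss


# Toy "layer": 256 evenly spaced weights between 0 and 1.
ramp = [i / 255 for i in range(256)]
bits, loss = pick_bits(ramp)
print(bits, loss)
```

Starting from the smallest width mirrors the "minimum allowed dtype" setting: a layer only moves to a wider dtype when the narrower one misses the threshold, which is how a mixed-precision result can arise.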

	Usage:
	```sh
	pip install sdnq
	```

	```py
	import torch
	import diffusers
	from sdnq import SDNQConfig # import sdnq to register it into diffusers and transformers
	from sdnq.common import use_torch_compile as triton_is_available
	from sdnq.loader import apply_sdnq_options_to_model

	pipe = diffusers.GlmImagePipeline.from_pretrained("Disty0/GLM-Image-SDNQ-4bit-dynamic", torch_dtype=torch.bfloat16)

	# Enable INT8 MatMul for AMD, Intel ARC and Nvidia GPUs:
	if triton_is_available and (torch.cuda.is_available() or torch.xpu.is_available()):
	    pipe.transformer = apply_sdnq_options_to_model(pipe.transformer, use_quantized_matmul=True)
	    # pipe.transformer = torch.compile(pipe.transformer) # optional for faster speeds

	pipe.enable_model_cpu_offload()

	prompt = "A beautifully designed modern food magazine style dessert recipe illustration, themed around a raspberry mousse cake. The overall layout is clean and bright, divided into four main areas: the top left features a bold black title 'Raspberry Mousse Cake Recipe Guide', with a soft-lit close-up photo of the finished cake on the right, showcasing a light pink cake adorned with fresh raspberries and mint leaves; the bottom left contains an ingredient list section, titled 'Ingredients' in a simple font, listing 'Flour 150g', 'Eggs 3', 'Sugar 120g', 'Raspberry puree 200g', 'Gelatin sheets 10g', 'Whipping cream 300ml', and 'Fresh raspberries', each accompanied by minimalist line icons (like a flour bag, eggs, sugar jar, etc.); the bottom right displays four equally sized step boxes, each containing high-definition macro photos and corresponding instructions, arranged from top to bottom as follows: Step 1 shows a whisk whipping white foam (with the instruction 'Whip egg whites to stiff peaks'), Step 2 shows a red-and-white mixture being folded with a spatula (with the instruction 'Gently fold in the puree and batter'), Step 3 shows pink liquid being poured into a round mold (with the instruction 'Pour into mold and chill for 4 hours'), Step 4 shows the finished cake decorated with raspberries and mint leaves (with the instruction 'Decorate with raspberries and mint'); a light brown information bar runs along the bottom edge, with icons on the left representing 'Preparation time: 30 minutes', 'Cooking time: 20 minutes', and 'Servings: 8'. The overall color scheme is dominated by creamy white and light pink, with a subtle paper texture in the background, featuring compact and orderly text and image layout with clear information hierarchy."
	image = pipe(
	    prompt=prompt,
	    height=32 * 32,
	    width=36 * 32,
	    num_inference_steps=50,
	    guidance_scale=1.5,
	    generator=torch.manual_seed(42),
	).images[0]
	image.save("output_t2i-sdnq.png")
	```