---
license: apache-2.0
base_model:
- Qwen/Qwen-Image-2512
- Qwen/Qwen-Image-Edit-2511
- Qwen/Qwen-Image
tags:
- image-generation
- qwen
- mmdit
- abliterated
- quantized
- rocm
language:
- en
library_name: diffusers
pipeline_tag: text-to-image
---
# Qwen-Image-1.9
A merged, abliterated, and quantized derivative of the Qwen-Image 20B MMDiT family.
> **Run ID:** `prod-20260407`
> **Created:** 2026-04-07T18:59:37+00:00
## Architecture
| Property | Value |
| --- | --- |
| Base family | Qwen-Image (MMDiT 20B) |
| Text encoder | Qwen2.5-VL |
| VAE | RGB-VAE |
| RoPE | 2D |
| Backbone parameters | ~20B |
| License | Apache-2.0 |
## Source Models
| Alias | Model | Role | License |
| --- | --- | --- | --- |
| `qwen-image-2512` | [Qwen/Qwen-Image-2512](https://huggingface.co/Qwen/Qwen-Image-2512) | foundation | Apache-2.0 |
| `qwen-image-base` | [Qwen/Qwen-Image](https://huggingface.co/Qwen/Qwen-Image) | ancestry-base | Apache-2.0 |
| `qwen-image-edit-2511` | [Qwen/Qwen-Image-Edit-2511](https://huggingface.co/Qwen/Qwen-Image-Edit-2511) | edit-donor | Apache-2.0 |
| `qwen-image-layered` | [Qwen/Qwen-Image-Layered](https://huggingface.co/Qwen/Qwen-Image-Layered) | layer-logic-donor | Apache-2.0 |
## Research Method
### 1. Delta-Edit Merge
The edit capability is transferred to the foundation model via a controlled
delta injection:
```
edit_delta = Qwen-Image-Edit-2511 − Qwen-Image (delta base)
merged = Qwen-Image-2512 + 0.35 × edit_delta
```
Only MMDiT backbone tensors are blended. Text encoder, VAE, and RoPE
components are passed through from the foundation checkpoint unchanged.
- **Strategy:** `slerp`
- **Blend coefficient:** `0.35`
- **Foundation:** `Qwen/Qwen-Image-2512`
- **Excluded subsystems:** text_encoder, vae, rope
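The delta injection above can be sketched over plain state dicts. This is an illustrative sketch, not the release recipe: the helper name `delta_edit_merge`, the tensor-name prefixes used for exclusion, and the simple linear form of the injection (matching the formula above) are all assumptions.

```python
import torch

def delta_edit_merge(foundation, edit, delta_base, alpha=0.35,
                     exclude=("text_encoder.", "vae.", "rope.")):
    """Inject the edit-capability delta into the foundation state dict.

    merged = foundation + alpha * (edit - delta_base), applied only to
    backbone tensors present in all three checkpoints; excluded
    subsystems pass through from the foundation unchanged.
    """
    merged = {}
    for name, w in foundation.items():
        blend = (not name.startswith(exclude)
                 and name in edit and name in delta_base)
        if blend:
            merged[name] = w + alpha * (edit[name] - delta_base[name])
        else:
            merged[name] = w.clone()
    return merged
```

In practice the blend would be applied per-tensor while streaming shards from disk rather than holding three 20B checkpoints in memory at once.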
### 2. Abliteration (Refusal-Direction Removal)
Refusal-direction vectors are identified in the residual stream and
projected out of target weight matrices using a norm-preserving
orthogonal projection:
```
W′ = W − scale × (W @ r̂) ⊗ r̂ (norm-preserving variant)
```
- **Target layers:** 18+ (attention o_proj + MLP down_proj)
- **Scale:** 1.0
- **Mode:** norm-preserving (preserves weight magnitude distribution)
- Recipe: `stage-3-abliteration.yaml`
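The projection above can be sketched as follows. The function name, the row-wise norm restoration, and the assumption that `r_hat` is a unit vector in the residual-stream basis are illustrative; the actual recipe lives in `stage-3-abliteration.yaml`.

```python
import torch

def remove_refusal_direction(W, r_hat, scale=1.0):
    """W' = W - scale * (W @ r_hat) outer r_hat, then restore row norms.

    Assumes r_hat is a unit refusal-direction vector and W writes into
    the residual stream along its rows (e.g. o_proj / down_proj weights).
    """
    proj = torch.outer(W @ r_hat, r_hat)   # component of each row along r_hat
    W_out = W - scale * proj               # orthogonal projection removes it
    # Norm-preserving variant: rescale each row back to its original L2 norm
    # so the weight magnitude distribution is unchanged.
    orig_norms = W.norm(dim=1, keepdim=True)
    new_norms = W_out.norm(dim=1, keepdim=True).clamp_min(1e-8)
    return W_out * (orig_norms / new_norms)
```

With `scale=1.0` the output rows are exactly orthogonal to `r_hat` (row rescaling preserves that orthogonality), while every row keeps its original magnitude.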
### 3. Quantization
| Kind | Path |
| --- | --- |
| `quant_config` | `quant-config.json` |
- **GGUF targets:** Q4_K_M, IQ4_XS (with importance-matrix)
- **EXL2 target:** 4.0 bpw
- **Runtime:** vLLM-Omni (ROCm), ExLlamaV2
## Hardware
- **GPU:** AMD Instinct MI300X — 192 GB HBM3 VRAM
- **ROCm:** 7.2.0
- **Precision:** bf16 (merge + abliterate), quantized (deployment)
## Usage
```python
from diffusers import DiffusionPipeline
import torch

# Load the merged checkpoint in bf16, the precision used for the
# merge and abliteration stages.
pipe = DiffusionPipeline.from_pretrained(
    "ThirdMiddle/Qwen-Image-1.9",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
# On ROCm builds of PyTorch, the "cuda" device string maps to the
# AMD HIP backend, so this also works on the MI300X setup above.
pipe = pipe.to("cuda")

image = pipe(
    "a photorealistic portrait of an astronaut on Mars at sunrise",
    num_inference_steps=30,
    guidance_scale=4.0,
).images[0]
image.save("output.png")
```
## License
Apache-2.0 — inherited from all source models.