---
license: apache-2.0
base_model:
  - Qwen/Qwen-Image-2512
  - Qwen/Qwen-Image-Edit-2511
  - Qwen/Qwen-Image
tags:
  - image-generation
  - qwen
  - mmdit
  - abliterated
  - quantized
  - rocm
language:
  - en
library_name: diffusers
pipeline_tag: text-to-image
---

Qwen-Image-1.9

A merged, abliterated, and quantized derivative of the Qwen-Image 20B MMDiT family.

Run ID: prod-20260407
Created: 2026-04-07T18:59:37+00:00

Architecture

| Property | Value |
| --- | --- |
| Base family | Qwen-Image (MMDiT 20B) |
| Text encoder | Qwen2.5-VL |
| VAE | RGB-VAE |
| RoPE | 2D |
| Backbone parameters | ~20B |
| License | Apache-2.0 |

Source Models

| Alias | Model | Role | License |
| --- | --- | --- | --- |
| qwen-image-2512 | Qwen/Qwen-Image-2512 | foundation | Apache-2.0 |
| qwen-image-base | Qwen/Qwen-Image | ancestry-base | Apache-2.0 |
| qwen-image-edit-2511 | Qwen/Qwen-Image-Edit-2511 | edit-donor | Apache-2.0 |
| qwen-image-layered | Qwen/Qwen-Image-Layered | layer-logic-donor | Apache-2.0 |

Research Method

1. Delta-Edit Merge

The edit capability is transferred to the foundation model via a controlled delta injection:

```
edit_delta = Qwen-Image-Edit-2511 − Qwen-Image (delta base)
merged     = Qwen-Image-2512 + 0.35 × edit_delta
```

Only MMDiT backbone tensors are blended. Text encoder, VAE, and RoPE components are passed through from the foundation checkpoint unchanged.

  • Strategy: slerp
  • Blend coefficient: 0.35
  • Foundation: Qwen/Qwen-Image-2512
  • Excluded subsystems: text_encoder, vae, rope
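
The delta-injection step can be sketched with plain tensor arithmetic. This is a minimal illustration of the linear formula above, not the actual merge tooling; the key-prefix filter for excluded subsystems is an assumption:

```python
import torch

def delta_edit_merge(foundation: dict, edit: dict, delta_base: dict,
                     alpha: float = 0.35) -> dict:
    """merged = foundation + alpha * (edit - delta_base).

    Applied only to MMDiT backbone tensors; text encoder, VAE, and RoPE
    keys pass through from the foundation checkpoint unchanged.
    """
    excluded = ("text_encoder.", "vae.", "rope.")  # assumed key prefixes
    merged = {}
    for name, w in foundation.items():
        if name.startswith(excluded) or name not in edit or name not in delta_base:
            merged[name] = w.clone()  # pass through unchanged
        else:
            edit_delta = edit[name] - delta_base[name]
            merged[name] = w + alpha * edit_delta
    return merged
```

Note that a true slerp strategy would interpolate along the hypersphere between weight tensors; the sketch above implements only the linear delta addition shown in the formula.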

2. Abliteration (Refusal-Direction Removal)

Refusal-direction vectors are identified in the residual stream and projected out of target weight matrices using a norm-preserving orthogonal projection:

```
W′ = W − scale × (W @ r̂) ⊗ r̂    (norm-preserving variant)
```
  • Target layers: 18+ (attention o_proj + MLP down_proj)
  • Scale: 1.0
  • Mode: norm-preserving (preserves weight magnitude distribution)
  • Recipe: stage-3-abliteration.yaml
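
The projection can be sketched as follows. This is illustrative only: `r_hat` stands for a unit refusal-direction vector in the input space of `W`, and restoring per-row L2 norms is one plausible reading of "norm-preserving":

```python
import torch

def abliterate(W: torch.Tensor, r_hat: torch.Tensor, scale: float = 1.0) -> torch.Tensor:
    """Project the refusal direction r_hat out of weight matrix W.

    W' = W - scale * outer(W @ r_hat, r_hat), then rescale each row so
    its L2 norm matches the original (the "norm-preserving" variant).
    """
    r_hat = r_hat / r_hat.norm()  # ensure unit length
    W_prime = W - scale * torch.outer(W @ r_hat, r_hat)
    # norm-preserving: restore the original per-row magnitudes
    old_norms = W.norm(dim=1, keepdim=True)
    new_norms = W_prime.norm(dim=1, keepdim=True).clamp_min(1e-12)
    return W_prime * (old_norms / new_norms)
```

With `scale = 1.0` the projected weights map the refusal direction to zero (`W' @ r̂ = 0`), and the per-row rescale preserves that property because each row is only multiplied by a scalar.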

3. Quantization

| Kind | Path |
| --- | --- |
| quant_config | quant-config.json |
  • GGUF targets: Q4_K_M, IQ4_XS (with importance-matrix)
  • EXL2 target: 4.0 bpw
  • Runtime: vLLM-Omni (ROCm), ExLlamaV2
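
The idea behind the 4-bit group-quantized targets can be shown in miniature. This is a toy sketch of symmetric per-block quantization with one scale per group, not the actual Q4_K_M or IQ4_XS formats (which use nested scales and, for the IQ variants, an importance matrix):

```python
import torch

def quantize_q4_blocks(x: torch.Tensor, block: int = 32):
    """Symmetric 4-bit quantization: one fp scale per block of `block` values."""
    x = x.reshape(-1, block)
    scale = x.abs().amax(dim=1, keepdim=True) / 7.0  # map to int range [-7, 7]
    scale = scale.clamp_min(1e-12)
    q = torch.clamp(torch.round(x / scale), -8, 7).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Recover an approximation of the original tensor."""
    return (q.to(torch.float32) * scale).reshape(-1)
```

Each block stores 32 four-bit integers plus one scale, and the reconstruction error per element is bounded by half the block's scale.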

Hardware

  • GPU: AMD Instinct MI300X — 192 GB HBM3 VRAM
  • ROCm: 7.2.0
  • Precision: bf16 (merge + abliterate), quantized (deployment)

Usage

```python
from diffusers import DiffusionPipeline
import torch

pipe = DiffusionPipeline.from_pretrained(
    "ThirdMiddle/Qwen-Image-1.9",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
pipe = pipe.to("cuda")  # ROCm builds of PyTorch also expose the "cuda" device

image = pipe(
    "a photorealistic portrait of an astronaut on Mars at sunrise",
    num_inference_steps=30,
    guidance_scale=4.0,
).images[0]
image.save("output.png")
```

License

Apache-2.0 — inherited from all source models.