---
base_model: Tongyi-MAI/Z-Image-Turbo
library_name: diffusers
tags:
  - diffusers
  - text-to-image
  - anime
  - art-style
  - z-image
  - fuliji
  - lora-merged
license: apache-2.0
language:
  - zh
  - en
---

# Z-Image-Turbo × Fuliji — Merged Model

Z-Image Turbo with the Fuliji artist LoRA baked in. The LoRA weights have been permanently merged into the base transformer via `merge_and_unload()`, so no PEFT dependency is needed at inference time.
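For intuition, merging a LoRA just folds the low-rank update into the original matrix, W′ = W + (α/r)·B·A, so the merged model computes exactly what base-plus-adapter would. A minimal pure-PyTorch sketch with toy shapes (not the real transformer):

```python
import torch

torch.manual_seed(0)
d, r, alpha = 8, 4, 4           # toy sizes; the real LoRA uses rank=32, alpha=32
scale = alpha / r

W = torch.randn(d, d)           # base weight, e.g. a to_q projection
A = torch.randn(r, d) * 0.01    # LoRA down-projection
B = torch.randn(d, r) * 0.01    # LoRA up-projection

x = torch.randn(1, d)
y_adapter = x @ W.T + scale * (x @ A.T) @ B.T  # adapter applied at runtime
W_merged = W + scale * (B @ A)                 # what merge_and_unload() bakes in
y_merged = x @ W_merged.T

# The merged weight reproduces the runtime adapter output exactly (up to fp error)
assert torch.allclose(y_adapter, y_merged, atol=1e-5)
```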

Want the standalone LoRA adapter instead? Use DownFlow/Z-Image-Turbo-Fuli-LoRA to apply the adapter on top of any Z-Image-Turbo checkpoint.


## What This Is

This model is Tongyi-MAI/Z-Image-Turbo (an 8-step flow-matching image generation model) fine-tuned with a LoRA trained on art from 8 Chinese anime/illustration artists in the DownFlow/fuliji dataset.
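A flow-matching sampler integrates a learned velocity field over a handful of Euler steps; the 8-step schedule corresponds to 8 such updates. A toy illustration with a dummy velocity function (not the actual model):

```python
def euler_sample(x, velocity, steps=8):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with `steps` Euler steps."""
    dt = 1.0 / steps
    t = 0.0
    for _ in range(steps):
        x = x + dt * velocity(x, t)
        t += dt
    return x

# A constant velocity of 2.0 moves x by exactly 2 over t in [0, 1]:
print(euler_sample(0.0, lambda x, t: 2.0))  # → 2.0
```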

Trigger an artist's style by prepending `by <artist>, ` to your prompt.


## Quick Start (Python)

```bash
pip install diffusers transformers accelerate safetensors
```

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "DownFlow/Z-Image-Turbo-Fuli",
    torch_dtype=torch.bfloat16,
).to("cuda")

image = pipe(
    prompt="by 蠢沫沫, 1girl, solo, smile, soft lighting",
    num_inference_steps=8,
    guidance_scale=0.0,   # Z-Image Turbo uses CFG=0
    height=512,
    width=512,
).images[0]

image.save("output.png")
```

## Serving with vLLM

vLLM (≥ 0.8) can serve this model via an OpenAI-compatible `/v1/images/generations` endpoint.

### 1 — Start the server

```bash
pip install "vllm>=0.8.0"

vllm serve DownFlow/Z-Image-Turbo-Fuli \
    --task generate \
    --dtype bfloat16 \
    --max-model-len 512 \
    --port 8000
```

### 2 — Generate via curl

```bash
curl http://localhost:8000/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "DownFlow/Z-Image-Turbo-Fuli",
    "prompt": "by 蠢沫沫, 1girl, smile, soft watercolour style",
    "n": 1,
    "size": "512x512"
  }'
```

### 3 — Generate via OpenAI Python SDK

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.images.generate(
    model="DownFlow/Z-Image-Turbo-Fuli",
    prompt="by 年年, 1girl, white dress, cherry blossoms",
    n=1,
    size="512x512",
)
print(response.data[0].url)
```
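Depending on server configuration, the response may carry a URL or an inline base64 payload (`b64_json` in the OpenAI response schema). A small helper for the base64 case (the dict fallback below is just for illustration):

```python
import base64

def save_b64_image(data_item, path="output.png"):
    """Write one images.generate result to disk if it carries b64_json."""
    b64 = getattr(data_item, "b64_json", None)
    if b64 is None and isinstance(data_item, dict):
        b64 = data_item.get("b64_json")
    if not b64:
        raise ValueError("no b64_json payload; download data_item.url instead")
    with open(path, "wb") as f:
        f.write(base64.b64decode(b64))
    return path

# Works with a plain dict standing in for the SDK response object:
save_b64_image({"b64_json": base64.b64encode(b"\x89PNG...").decode()}, "demo.png")
```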

## Artist Trigger Tokens

Prepend `by <artist>, ` at the start of your prompt.

| Token | Training images |
|---|---|
| 萌芽儿o0 | 30 |
| 年年 | 26 |
| 封疆疆v | 26 |
| 焖焖碳 | 26 |
| 星之迟迟 | 25 |
| 蠢沫沫 | 23 |
| 雨波HaneAme | 23 |
| 清水由乃 | 21 |
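A tiny helper to build prompts with these triggers (the function name and validation are illustrative, not part of the model):

```python
ARTISTS = ["萌芽儿o0", "年年", "封疆疆v", "焖焖碳",
           "星之迟迟", "蠢沫沫", "雨波HaneAme", "清水由乃"]

def build_prompt(artist: str, tags: str) -> str:
    """Prepend the 'by <artist>, ' trigger this model was trained on."""
    if artist not in ARTISTS:
        raise ValueError(f"unknown artist token: {artist}")
    return f"by {artist}, {tags}"

print(build_prompt("蠢沫沫", "1girl, solo, smile"))  # → by 蠢沫沫, 1girl, solo, smile
```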

## Model Details

| Property | Value |
|---|---|
| Base model | Tongyi-MAI/Z-Image-Turbo |
| Fine-tuning method | LoRA (rank=32, alpha=32), merged into the weights |
| Target modules | `to_q`, `to_k`, `to_v`, `w1`, `w2`, `w3` |
| Training steps | 5,000 (3,000 at lr=1e-4, then 2,000 at lr=5e-5 with EMA decay=0.9999) |
| Training resolution | 512 × 512 |
| Inference steps | 8 |
| CFG scale | 0.0 (CFG-free) |
| Precision | bfloat16 |
| Dataset | DownFlow/fuliji (8 artists, ~200 images) |
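The EMA in the training schedule keeps a shadow copy of the weights, updated each step as θ_ema ← decay·θ_ema + (1 − decay)·θ; with decay=0.9999 the shadow tracks a long-horizon average of the trained weights. A scalar toy sketch (not the actual training code):

```python
def ema_update(ema, weights, decay=0.9999):
    """One EMA step over a flat list of (scalar) parameters."""
    return [decay * e + (1 - decay) * w for e, w in zip(ema, weights)]

# Starting from 0 and repeatedly averaging toward a fixed weight of 1.0,
# the EMA after n steps is 1 - decay**n:
ema = [0.0]
for _ in range(10):
    ema = ema_update(ema, [1.0])
print(ema[0])
```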

## Related