---
base_model: Tongyi-MAI/Z-Image-Turbo
library_name: diffusers
tags:
- diffusers
- text-to-image
- anime
- art-style
- z-image
- fuliji
- lora-merged
license: apache-2.0
language:
- zh
- en
---
# Z-Image-Turbo × Fuliji — Merged Model
**Z-Image Turbo with Fuliji artist LoRA baked in.** The LoRA weights have been permanently merged into the base transformer via `merge_and_unload()`, so no PEFT dependency is needed at inference time.
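
To illustrate what merging means here, the following is a minimal pure-Python sketch of the standard LoRA merge rule, `W' = W + (alpha / rank) * (B @ A)`. It is not the code used to build this checkpoint: the helper names `matmul` and `merge_lora` are hypothetical, and the toy matrices use rank 1 rather than the rank 32 of the real adapter.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, rank, alpha):
    """Return weight + (alpha / rank) * (lora_b @ lora_a) as a new matrix."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # shape: out_features x in_features
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy example (assumed values): 2x2 base weight with a rank-1 adapter.
base = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]      # rank x in_features
lora_b = [[0.5], [-0.5]]   # out_features x rank
merged = merge_lora(base, lora_a, lora_b, rank=1, alpha=1)
print(merged)
```

After the merge, the adapter matrices are no longer needed; the updated weight is stored directly, which is why inference does not need PEFT.
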
> **Want the standalone LoRA adapter instead?**
> Use [DownFlow/Z-Image-Turbo-Fuli-LoRA](https://huggingface.co/DownFlow/Z-Image-Turbo-Fuli-LoRA) to apply the adapter on top of any Z-Image-Turbo checkpoint.
---
## What This Is
This model is [Tongyi-MAI/Z-Image-Turbo](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo) (an 8-step flow-matching image generation model) fine-tuned with a LoRA trained on art from 8 Chinese anime/illustration artists in the [DownFlow/fuliji](https://huggingface.co/datasets/DownFlow/fuliji) dataset.
Trigger an artist's style by prepending `by <artist>,` to your prompt, where `<artist>` is one of the trigger tokens listed under Artist Trigger Tokens.
---
## Quick Start (Python)
```bash
pip install diffusers transformers accelerate safetensors
```
```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "DownFlow/Z-Image-Turbo-Fuli",
    torch_dtype=torch.bfloat16,
).to("cuda")

image = pipe(
    prompt="by 蠢沫沫, 1girl, solo, smile, soft lighting",
    num_inference_steps=8,
    guidance_scale=0.0,  # Z-Image Turbo uses CFG=0
    height=512,
    width=512,
).images[0]
image.save("output.png")
```
---
## Serving with vLLM
vLLM (≥ 0.8) can serve this model via an OpenAI-compatible `/v1/images/generations` endpoint.
### 1 — Start the server
```bash
pip install "vllm>=0.8.0"
vllm serve DownFlow/Z-Image-Turbo-Fuli \
  --task generate \
  --dtype bfloat16 \
  --max-model-len 512 \
  --port 8000
```
### 2 — Generate via curl
```bash
curl http://localhost:8000/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "DownFlow/Z-Image-Turbo-Fuli",
    "prompt": "by 蠢沫沫, 1girl, smile, soft watercolour style",
    "n": 1,
    "size": "512x512"
  }'
```
### 3 — Generate via OpenAI Python SDK
```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.images.generate(
    model="DownFlow/Z-Image-Turbo-Fuli",
    prompt="by 年年, 1girl, white dress, cherry blossoms",
    n=1,
    size="512x512",
)
print(response.data[0].url)
```
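
To write the result to disk, something like the sketch below may help. This card does not say whether the server returns a `url` or inline `b64_json` data, so the hypothetical helper `save_image_item` handles both; it takes one entry of the response's `data` list as a plain dict.

```python
import base64
import urllib.request
from pathlib import Path


def save_image_item(item, path):
    """Write one image entry (a dict with 'b64_json' or 'url') to `path`."""
    if item.get("b64_json"):
        # Inline payload: decode the base64 string into raw image bytes.
        data = base64.b64decode(item["b64_json"])
    elif item.get("url"):
        # Remote payload: download the bytes from the returned URL.
        with urllib.request.urlopen(item["url"]) as resp:
            data = resp.read()
    else:
        raise ValueError("response item has neither 'b64_json' nor 'url'")
    out = Path(path)
    out.write_bytes(data)
    return out
```

With the OpenAI Python SDK, the dict could be obtained as `response.data[0].model_dump()`, assuming a v1-style SDK whose response objects are Pydantic models.
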
---
## Artist Trigger Tokens
Prepend `by <artist>, ` to the start of your prompt, replacing `<artist>` with one of the tokens below.
| Token | Training images |
|---|---|
| `萌芽儿o0` | 30 |
| `年年` | 26 |
| `封疆疆v` | 26 |
| `焖焖碳` | 26 |
| `星之迟迟` | 25 |
| `蠢沫沫` | 23 |
| `雨波HaneAme` | 23 |
| `清水由乃` | 21 |
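
As a convenience, the prefix can be added programmatically. The helper below is a hypothetical sketch (the names `ARTIST_TOKENS` and `build_prompt` are not part of this repository); it validates the artist against the table above and builds the prompt string.

```python
# Trigger tokens copied from the table above.
ARTIST_TOKENS = (
    "萌芽儿o0", "年年", "封疆疆v", "焖焖碳",
    "星之迟迟", "蠢沫沫", "雨波HaneAme", "清水由乃",
)


def build_prompt(artist, description):
    """Return 'by <artist>, <description>' for a known trigger token."""
    if artist not in ARTIST_TOKENS:
        raise ValueError(f"unknown artist token: {artist!r}")
    return f"by {artist}, {description.strip()}"


print(build_prompt("蠢沫沫", "1girl, solo, smile, soft lighting"))
```

The returned string can be passed as `prompt` to the pipeline call shown in Quick Start.
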
---
## Model Details
| Property | Value |
|---|---|
| Base model | `Tongyi-MAI/Z-Image-Turbo` |
| Fine-tuning method | LoRA rank=32, alpha=32 — merged into weights |
| Target modules | `to_q`, `to_k`, `to_v`, `w1`, `w2`, `w3` |
| Training steps | **5 000** (3 000 at lr=1e-4 + 2 000 continued at lr=5e-5, EMA decay=0.9999) |
| Training resolution | 512 × 512 |
| Inference steps | 8 |
| CFG scale | 0.0 (CFG-free) |
| Precision | bfloat16 |
| Dataset | [DownFlow/fuliji](https://huggingface.co/datasets/DownFlow/fuliji) (8 artists, ~200 images) |
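
The two-phase schedule and the EMA decay in the table can be read as the sketch below. It assumes the learning rate is constant within each phase (the card mentions no warmup or decay) and uses the textbook EMA update; the function names are hypothetical and this is not the actual training script.

```python
TOTAL_STEPS = 5000
PHASE_ONE_STEPS = 3000


def learning_rate(step):
    """Assumed piecewise-constant LR: 1e-4 for steps 0..2999, then 5e-5."""
    if not 0 <= step < TOTAL_STEPS:
        raise ValueError(f"step must be in [0, {TOTAL_STEPS}), got {step}")
    return 1e-4 if step < PHASE_ONE_STEPS else 5e-5


def ema_update(ema_value, new_value, decay=0.9999):
    """Standard exponential moving average of a parameter value."""
    return decay * ema_value + (1.0 - decay) * new_value


print(learning_rate(0), learning_rate(3000))
```

With a decay of 0.9999, each step moves the averaged value only 0.01% of the way toward the current parameter, so the average changes slowly relative to the raw weights.
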
---
## Related
- [DownFlow/Z-Image-Turbo-Fuli-LoRA](https://huggingface.co/DownFlow/Z-Image-Turbo-Fuli-LoRA) — standalone LoRA adapter
- [DownFlow/fuliji](https://huggingface.co/datasets/DownFlow/fuliji) — training dataset
- [Tongyi-MAI/Z-Image-Turbo](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo) — base model