# BiliSakura/PixelFlow-diffusers
Self-contained PixelFlow checkpoints for Hugging Face diffusers. Each variant folder ships its own `pipeline.py`, component modules, and weights.
## Available checkpoints
| Subfolder | Pipeline | Task | Resolution | Params |
| --- | --- | --- | ---: | ---: |
| [`PixelFlow-256/`](PixelFlow-256/) | `PixelFlowPipeline` | class-to-image | 256×256 | 677M |
| [`PixelFlow-T2I/`](PixelFlow-T2I/) | `PixelFlowT2IPipeline` | text-to-image | 1024×1024 | 882M |
## Repo layout
```text
BiliSakura/PixelFlow-diffusers/
├── README.md
├── PixelFlow-256/
│   ├── pipeline.py
│   ├── model_index.json
│   ├── scheduler/scheduler_config.json
│   └── transformer/
└── PixelFlow-T2I/
    ├── pipeline.py
    ├── model_index.json
    ├── scheduler/scheduler_config.json
    ├── text_encoder/
    ├── tokenizer/
    └── transformer/
```
Each variant is self-contained. The `scheduler/` folder contains `scheduler_config.json` and `scheduling_pixelflow.py` with [`PixelFlowScheduler`](PixelFlow-256/scheduler/scheduling_pixelflow.py).
No shared helper modules are needed at inference time: only PyPI `diffusers` plus the local variant directory.
## ImageNet class labels
For class-conditional [`PixelFlow-256/`](PixelFlow-256/), `id2label` is embedded in `PixelFlow-256/model_index.json` (DiT-style).
- `pipe.id2label`: maps each class id to its English label
- `pipe.labels`: reverse map (English synonym → id)
- `pipe.get_label_ids("golden retriever")`: resolves a label string to its class id(s)
- `pipe(class_labels="golden retriever", ...)`: string labels are resolved automatically
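
The relationship between `id2label` and the reverse map can be sketched in plain Python. This is a minimal illustration, assuming the lookup mirrors the DiT-style convention of comma-separated synonyms per class id; `build_label_map` and `lookup_label_ids` are hypothetical helper names, and the actual `pipeline.py` may differ in detail.

```python
from typing import Dict, List, Union

# Tiny illustrative excerpt. The real map lives in PixelFlow-256/model_index.json.
ID2LABEL: Dict[int, str] = {
    1: "goldfish, Carassius auratus",
    207: "golden retriever",
}


def build_label_map(id2label: Dict[int, str]) -> Dict[str, int]:
    """Invert id -> 'synonym, synonym' into synonym -> id (hypothetical helper)."""
    labels: Dict[str, int] = {}
    for class_id, names in id2label.items():
        for name in names.split(","):
            labels[name.strip()] = int(class_id)
    return dict(sorted(labels.items()))


def lookup_label_ids(labels: Dict[str, int], label: Union[str, List[str]]) -> List[int]:
    """Resolve one label or a list of labels to class ids (assumed behavior)."""
    if isinstance(label, str):
        label = [label]
    unknown = [name for name in label if name not in labels]
    if unknown:
        raise ValueError(f"Unknown label(s): {unknown}")
    return [labels[name] for name in label]


labels = build_label_map(ID2LABEL)
print(lookup_label_ids(labels, "golden retriever"))
print(lookup_label_ids(labels, ["goldfish", "Carassius auratus"]))
```

Splitting on commas lets any synonym of a class resolve to the same id, which is why the reverse map can hold more keys than `id2label` has entries.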
## Demo
Class-to-image:
```bash
python demo_inference_c2i.py
```
Text-to-image:
```bash
python demo_inference_t2i.py
```
## Load from a local clone
### Class-to-image (`PixelFlow-256`)
```python
from pathlib import Path

import torch
from diffusers import DiffusionPipeline

# Point at the variant subfolder, not the repo root.
model_dir = Path("./PixelFlow-256").resolve()

pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    local_files_only=True,
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

# Inspect the label maps embedded in model_index.json.
print(pipe.id2label[207])
print(pipe.get_label_ids("golden retriever"))

# Seed the generator for reproducible sampling.
generator = torch.Generator(device="cuda").manual_seed(42)
image = pipe(
    class_labels="golden retriever",
    height=256,
    width=256,
    num_inference_steps=[10, 10, 10, 10],
    guidance_scale=4.0,
    generator=generator,
).images[0]
image.save("demo.png")
```
### Text-to-image (`PixelFlow-T2I`)
Uses [`google/flan-t5-xl`](https://huggingface.co/google/flan-t5-xl) when `text_encoder/` is not bundled.
```python
from pathlib import Path

import torch
from diffusers import DiffusionPipeline

# Point at the variant subfolder, not the repo root.
model_dir = Path("./PixelFlow-T2I").resolve()

pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    local_files_only=True,
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

# Seed the generator for reproducible sampling.
generator = torch.Generator(device="cuda").manual_seed(42)
image = pipe(
    prompt="A golden retriever playing in a sunny garden",
    height=1024,
    width=1024,
    num_inference_steps=[10, 10, 10, 10],
    guidance_scale=4.0,
    generator=generator,
).images[0]
image.save("demo.png")
```
Load a **variant subfolder** (e.g. `./PixelFlow-256`), not the repo root.
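
A small pre-flight check can catch the repo-root mistake before `from_pretrained` is called. The sketch below is not part of the repository: `resolve_variant_dir` is a hypothetical helper that assumes a valid variant folder contains both `model_index.json` and `pipeline.py`, as shown in the repo layout.

```python
from pathlib import Path


def resolve_variant_dir(path: str) -> Path:
    """Return the resolved variant folder, or raise if it looks like the repo root.

    Hypothetical helper: a variant folder is assumed to contain both
    model_index.json and pipeline.py.
    """
    model_dir = Path(path).resolve()
    if not model_dir.is_dir():
        raise FileNotFoundError(f"{model_dir} is not a directory")

    if (model_dir / "model_index.json").is_file() and (model_dir / "pipeline.py").is_file():
        return model_dir

    # Suggest subfolders that do look like variants (e.g. when given the repo root).
    candidates = sorted(p.parent.name for p in model_dir.glob("*/model_index.json"))
    hint = f" Did you mean one of: {', '.join(candidates)}?" if candidates else ""
    raise ValueError(
        f"{model_dir} is not a variant folder "
        f"(missing model_index.json or pipeline.py).{hint}"
    )
```

The returned path can be passed to `DiffusionPipeline.from_pretrained(str(model_dir), ...)` exactly as in the examples above.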
## Load from the Hub
```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "BiliSakura/PixelFlow-diffusers",
    subfolder="PixelFlow-256",
    custom_pipeline="pipeline.py",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

image = pipe(
    class_labels="golden retriever",
    num_inference_steps=[10, 10, 10, 10],
).images[0]
```
Swap `subfolder="PixelFlow-T2I"` and call with `prompt=...` for text-to-image.
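
One way to keep the two variants interchangeable is to derive the variant-specific arguments from a single lookup. This is a sketch only: `INPUT_KEY`, `load_kwargs`, `call_kwargs`, and `load_pipeline` are hypothetical names, and the argument values simply repeat the ones used in this README.

```python
from typing import Any, Dict

REPO_ID = "BiliSakura/PixelFlow-diffusers"

# Which keyword carries the conditioning text for each variant (taken from this README).
INPUT_KEY = {"PixelFlow-256": "class_labels", "PixelFlow-T2I": "prompt"}


def load_kwargs(variant: str) -> Dict[str, Any]:
    """Keyword arguments for from_pretrained that depend on the variant."""
    if variant not in INPUT_KEY:
        raise ValueError(f"Unknown variant {variant!r}; expected one of {sorted(INPUT_KEY)}")
    return {
        "subfolder": variant,
        "custom_pipeline": "pipeline.py",
        "trust_remote_code": True,
    }


def call_kwargs(variant: str, text: str) -> Dict[str, Any]:
    """Keyword arguments for the pipeline call; only the conditioning key changes."""
    return {INPUT_KEY[variant]: text, "num_inference_steps": [10, 10, 10, 10]}


def load_pipeline(variant: str, device: str = "cuda"):
    """Download and load one variant from the Hub (needs torch, diffusers, network)."""
    import torch
    from diffusers import DiffusionPipeline

    pipe = DiffusionPipeline.from_pretrained(
        REPO_ID, torch_dtype=torch.bfloat16, **load_kwargs(variant)
    )
    return pipe.to(device)


# Example (not executed here because it downloads the weights):
# pipe = load_pipeline("PixelFlow-T2I")
# image = pipe(**call_kwargs("PixelFlow-T2I", "A golden retriever in a garden")).images[0]
```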
## Conversion
```bash
# Class-to-image checkpoint
python scripts/convert_pixelflow_to_diffusers.py \
  --checkpoint models/raw/PixelFlow/c2i/model.pt \
  --config models/raw/PixelFlow/c2i/config.yaml \
  --output models/BiliSakura/PixelFlow-diffusers/PixelFlow-256

# Text-to-image checkpoint
python scripts/convert_pixelflow_to_diffusers.py \
  --checkpoint models/raw/PixelFlow/t2i/model.pt \
  --config models/raw/PixelFlow/t2i/config.yaml \
  --output models/BiliSakura/PixelFlow-diffusers/PixelFlow-T2I \
  --skip-text-encoder
```
## Citation
```bibtex
@article{chen2025pixelflow,
title={PixelFlow: Pixel-Space Flow Matching for High-Resolution Image Synthesis},
author={Chen, Shoufa and others},
year={2025},
eprint={2504.07963},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```