---
license: mit
library_name: diffusers
pipeline_tag: text-to-image
tags:
- diffusers
- image-generation
- class-conditional
- imagenet
- pixelflow
- flow-matching
widget:
- text: golden retriever
  output:
    url: PixelFlow-256/demo.png
language:
- en
---
# BiliSakura/PixelFlow-diffusers
Self-contained PixelFlow checkpoints for Hugging Face diffusers. Each variant folder ships its own `pipeline.py`, component modules, and weights.
## Available checkpoints
| Subfolder | Pipeline | Task | Resolution | Params |
| --- | --- | --- | ---: | ---: |
| [`PixelFlow-256/`](PixelFlow-256/) | `PixelFlowPipeline` | class-to-image | 256×256 | 677M |
| [`PixelFlow-T2I/`](PixelFlow-T2I/) | `PixelFlowT2IPipeline` | text-to-image | 1024×1024 | 882M |
## Repo layout
```text
BiliSakura/PixelFlow-diffusers/
├── README.md
├── PixelFlow-256/
│   ├── pipeline.py
│   ├── model_index.json
│   ├── scheduler/scheduler_config.json
│   └── transformer/
└── PixelFlow-T2I/
    ├── pipeline.py
    ├── model_index.json
    ├── scheduler/scheduler_config.json
    ├── text_encoder/
    ├── tokenizer/
    └── transformer/
```
Each variant is self-contained. The `scheduler/` folder contains `scheduler_config.json` and `scheduling_pixelflow.py`, which defines [`PixelFlowScheduler`](PixelFlow-256/scheduler/scheduling_pixelflow.py).
Inference needs no shared helper modules: only `diffusers` from PyPI and the local variant directory.
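Because each variant is expected to carry everything it needs, a quick pre-flight check of a local clone can catch an incomplete download before loading. The sketch below is illustrative only: `EXPECTED_ENTRIES` and `missing_entries` are hypothetical names, and the entry list is transcribed from the repo layout above rather than taken from any official manifest.

```python
from pathlib import Path

# Hypothetical manifest, transcribed from the repo layout in this README.
EXPECTED_ENTRIES = {
    "PixelFlow-256": [
        "pipeline.py",
        "model_index.json",
        "scheduler/scheduler_config.json",
        "transformer",
    ],
    "PixelFlow-T2I": [
        "pipeline.py",
        "model_index.json",
        "scheduler/scheduler_config.json",
        "text_encoder",
        "tokenizer",
        "transformer",
    ],
}


def missing_entries(repo_root, variant):
    """Return the expected files/folders that are absent from a variant directory."""
    variant_dir = Path(repo_root) / variant
    return [
        entry
        for entry in EXPECTED_ENTRIES[variant]
        if not (variant_dir / entry).exists()
    ]


if __name__ == "__main__":
    # Run from the root of a local clone; an empty list means nothing is missing.
    for name in EXPECTED_ENTRIES:
        print(name, "missing:", missing_entries(".", name))
```

An empty list for a variant suggests the folder matches the layout shown above; it does not verify that the weights themselves are intact.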
## ImageNet class labels
For class-conditional [`PixelFlow-256/`](PixelFlow-256/), `id2label` is embedded in `PixelFlow-256/model_index.json` (DiT-style).
- `pipe.id2label`: inspect the id → English label correspondence
- `pipe.labels`: reverse map (English synonym → id)
- `pipe.get_label_ids("golden retriever")`
- `pipe(class_labels="golden retriever", ...)`: string labels are resolved automatically
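The helpers listed above amount to a dictionary and its inverse. As a rough illustration of what such a lookup could look like (not the code in `pipeline.py`, which this README does not show), the sketch below builds a reverse map from a two-entry `id2label` excerpt, assuming DiT-style entries in which synonyms are comma-separated. `build_label_map` and `lookup_label_ids` are hypothetical names; the two entries are standard ImageNet-1k labels.

```python
def build_label_map(id2label):
    """Invert id -> "syn1, syn2" into a lowercase synonym -> id map."""
    labels = {}
    for class_id, names in id2label.items():
        for name in names.split(","):
            labels[name.strip().lower()] = int(class_id)
    return labels


def lookup_label_ids(labels, queries):
    """Resolve one label string, or a list of them, to class ids."""
    if isinstance(queries, str):
        queries = [queries]
    return [labels[q.strip().lower()] for q in queries]


# Tiny excerpt for illustration; the real mapping lives in model_index.json.
id2label = {
    1: "goldfish, Carassius auratus",
    207: "golden retriever",
}
labels = build_label_map(id2label)
print(lookup_label_ids(labels, "golden retriever"))  # [207]
```

Lowercasing both sides is a design choice of this sketch; whether the pipeline's own lookup is case-insensitive is not stated here.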
## Demo
![PixelFlow-256 demo](PixelFlow-256/demo.png)
Class 207 (golden retriever), 256×256, 40 steps (`[10, 10, 10, 10]`).
Class-to-image:
```bash
python demo_inference_c2i.py
```
Text-to-image:
```bash
python demo_inference_t2i.py
```
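The `[10, 10, 10, 10]` value in the demo caption is a per-stage step list, and its entries sum to the 40 steps quoted there. The sketch below does that arithmetic and, purely as an assumption that this README does not state, pairs each stage with a resolution that doubles until it reaches the target size. `describe_schedule` is a hypothetical helper, not part of the pipeline.

```python
def describe_schedule(steps_per_stage, target_size):
    """Return (total_steps, [(resolution, steps), ...]) for a staged sampler.

    Assumption: each stage doubles the side length and the last stage
    runs at `target_size`. Check the scheduler config for the real rule.
    """
    num_stages = len(steps_per_stage)
    first_size = target_size // 2 ** (num_stages - 1)
    stages = [
        (first_size * 2 ** i, steps)
        for i, steps in enumerate(steps_per_stage)
    ]
    return sum(steps_per_stage), stages


total, stages = describe_schedule([10, 10, 10, 10], target_size=256)
print(total)   # 40
print(stages)  # [(32, 10), (64, 10), (128, 10), (256, 10)]
```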
## Load from a local clone
### Class-to-image (`PixelFlow-256`)
```python
from pathlib import Path

import torch
from diffusers import DiffusionPipeline

# Load the pipeline code and weights from the local variant folder.
model_dir = Path("./PixelFlow-256").resolve()
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    local_files_only=True,
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

# Inspect the embedded ImageNet label mapping.
print(pipe.id2label[207])
print(pipe.get_label_ids("golden retriever"))

# Fixed seed for reproducible sampling.
generator = torch.Generator(device="cuda").manual_seed(42)
image = pipe(
    class_labels="golden retriever",
    height=256,
    width=256,
    num_inference_steps=[10, 10, 10, 10],  # 40 steps in total
    guidance_scale=4.0,
    generator=generator,
).images[0]
image.save("demo.png")
```
### Text-to-image (`PixelFlow-T2I`)
The pipeline uses [`google/flan-t5-xl`](https://huggingface.co/google/flan-t5-xl) as the text encoder when `text_encoder/` is not bundled with the variant.
```python
from pathlib import Path

import torch
from diffusers import DiffusionPipeline

# Load the pipeline code and weights from the local variant folder.
model_dir = Path("./PixelFlow-T2I").resolve()
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    local_files_only=True,
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

# Fixed seed for reproducible sampling.
generator = torch.Generator(device="cuda").manual_seed(42)
image = pipe(
    prompt="A golden retriever playing in a sunny garden",
    height=1024,
    width=1024,
    num_inference_steps=[10, 10, 10, 10],  # 40 steps in total
    guidance_scale=4.0,
    generator=generator,
).images[0]
image.save("demo.png")
```
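The `google/flan-t5-xl` fallback noted at the top of this subsection can be pictured as an existence check on the variant folder. This is only a sketch of that decision, assuming the check is made on the `text_encoder/` directory; `resolve_text_encoder_source` is a hypothetical helper, and the actual logic lives in the variant's `pipeline.py`.

```python
from pathlib import Path

FALLBACK_TEXT_ENCODER = "google/flan-t5-xl"  # Hub id named in this README


def resolve_text_encoder_source(model_dir):
    """Return the bundled text_encoder/ path if present, else the Hub id."""
    bundled = Path(model_dir) / "text_encoder"
    if bundled.is_dir():
        return str(bundled)
    return FALLBACK_TEXT_ENCODER


print(resolve_text_encoder_source("./PixelFlow-T2I"))
```

With `local_files_only=True`, a Hub fallback would presumably succeed only if that model is already in the local cache, so checking the folder up front can save a confusing load error.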