# BiliSakura/PixelFlow-diffusers
Self-contained PixelFlow checkpoints for Hugging Face diffusers. Each variant folder ships its own `pipeline.py`, component modules, and weights.
## Available checkpoints
| Subfolder | Pipeline | Task | Resolution | Params |
| --- | --- | --- | ---: | ---: |
| [`PixelFlow-256/`](PixelFlow-256/) | `PixelFlowPipeline` | class-to-image | 256×256 | 677M |
| [`PixelFlow-T2I/`](PixelFlow-T2I/) | `PixelFlowT2IPipeline` | text-to-image | 1024×1024 | 882M |
## Repo layout
```text
BiliSakura/PixelFlow-diffusers/
├── README.md
├── PixelFlow-256/
│   ├── pipeline.py
│   ├── model_index.json
│   ├── scheduler/scheduler_config.json
│   └── transformer/
└── PixelFlow-T2I/
    ├── pipeline.py
    ├── model_index.json
    ├── scheduler/scheduler_config.json
    ├── text_encoder/
    ├── tokenizer/
    └── transformer/
```
Each variant is self-contained. The `scheduler/` folder contains `scheduler_config.json` and `scheduling_pixelflow.py`, which defines [`PixelFlowScheduler`](PixelFlow-256/scheduler/scheduling_pixelflow.py).
Inference needs no shared helper modules, only `diffusers` from PyPI plus the local variant directory.
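To check which components a variant bundles without loading any weights, one can read its `model_index.json` directly. The sketch below assumes the usual diffusers layout, where keys starting with `_` are metadata and each component maps to a `[library, class]` pair. The helper name `list_components` is hypothetical and not part of this repo.

```python
import json
from pathlib import Path


def list_components(variant_dir):
    """Return {component_name: (library, class_name)} read from model_index.json."""
    index_path = Path(variant_dir) / "model_index.json"
    index = json.loads(index_path.read_text(encoding="utf-8"))
    components = {}
    for key, value in index.items():
        # Skip metadata keys (e.g. _class_name) and non-component entries
        # such as an embedded id2label dict.
        if key.startswith("_") or not isinstance(value, list) or len(value) != 2:
            continue
        components[key] = (value[0], value[1])
    return components
```

Calling `list_components("./PixelFlow-256")` on a local clone should list entries such as `scheduler` and `transformer`, assuming the file follows that layout.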
## ImageNet class labels
For class-conditional [`PixelFlow-256/`](PixelFlow-256/), `id2label` is embedded in `PixelFlow-256/model_index.json` (DiT-style).
- `pipe.id2label`: inspect the id → English label correspondence
- `pipe.labels`: reverse map (English synonym → id)
- `pipe.get_label_ids("golden retriever")`
- `pipe(class_labels="golden retriever", ...)`: string labels are resolved automatically
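The relationship between the two maps can be sketched in plain Python. This is a minimal illustration modeled on the DiT-style convention of comma-separated synonyms per class id, not the code shipped in `pipeline.py`. The function names and the two-entry `id2label` sample are assumptions made for the example.

```python
def build_label_map(id2label):
    """Map each comma-separated synonym to its integer class id."""
    labels = {}
    for class_id, names in id2label.items():
        for name in names.split(","):
            labels[name.strip()] = int(class_id)
    return dict(sorted(labels.items()))


def get_label_ids(labels, label):
    """Resolve one label string, or a list of them, to class ids."""
    names = [label] if isinstance(label, str) else list(label)
    missing = [name for name in names if name not in labels]
    if missing:
        raise ValueError(f"Unknown labels: {missing}")
    return [labels[name] for name in names]


# Illustrative two-class subset of ImageNet-1k; JSON object keys are strings.
id2label = {"1": "goldfish, Carassius auratus", "207": "golden retriever"}
labels = build_label_map(id2label)
print(get_label_ids(labels, "golden retriever"))  # [207]
```

Splitting on commas lets any listed synonym resolve to the same id, so `"goldfish"` and `"Carassius auratus"` both map to class 1 in this sample.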
## Demo
Class-to-image:
```bash
python demo_inference_c2i.py
```
Text-to-image:
```bash
python demo_inference_t2i.py
```
## Load from a local clone
### Class-to-image (`PixelFlow-256`)
```python
from pathlib import Path

import torch
from diffusers import DiffusionPipeline

# Point at the variant subfolder, not the repo root.
model_dir = Path("./PixelFlow-256").resolve()
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    local_files_only=True,
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

# Inspect the embedded ImageNet label maps.
print(pipe.id2label[207])
print(pipe.get_label_ids("golden retriever"))

# Seed the generator so the sample is reproducible.
generator = torch.Generator(device="cuda").manual_seed(42)
image = pipe(
    class_labels="golden retriever",
    height=256,
    width=256,
    num_inference_steps=[10, 10, 10, 10],
    guidance_scale=4.0,
    generator=generator,
).images[0]
image.save("demo.png")
```
### Text-to-image (`PixelFlow-T2I`)
When `text_encoder/` is not bundled, the pipeline uses [`google/flan-t5-xl`](https://huggingface.co/google/flan-t5-xl) as its text encoder.
```python
from pathlib import Path

import torch
from diffusers import DiffusionPipeline

# Point at the variant subfolder, not the repo root.
model_dir = Path("./PixelFlow-T2I").resolve()
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    local_files_only=True,
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

# Seed the generator so the sample is reproducible.
generator = torch.Generator(device="cuda").manual_seed(42)
image = pipe(
    prompt="A golden retriever playing in a sunny garden",
    height=1024,
    width=1024,
    num_inference_steps=[10, 10, 10, 10],
    guidance_scale=4.0,
    generator=generator,
).images[0]
image.save("demo.png")
```
Load a **variant subfolder** (e.g. `./PixelFlow-256`), not the repo root.
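A small guard can catch the repo-root mistake before `from_pretrained` is called. The sketch below only checks for the two files every variant folder ships according to the repo layout (`model_index.json` and `pipeline.py`). The helper name `resolve_variant_dir` is hypothetical.

```python
from pathlib import Path


def resolve_variant_dir(path):
    """Return the absolute variant path, or raise if it does not look like one."""
    variant = Path(path).resolve()
    required = ("model_index.json", "pipeline.py")
    missing = [name for name in required if not (variant / name).is_file()]
    if missing:
        raise FileNotFoundError(
            f"{variant} is missing {missing}; "
            "pass a variant subfolder such as ./PixelFlow-256, not the repo root"
        )
    return variant
```

The returned path can then be used as `model_dir` in the loading snippets above.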
## Load from the Hub
```python
import torch
from diffusers import DiffusionPipeline

# Download the PixelFlow-256 variant and its custom pipeline from the Hub.
pipe = DiffusionPipeline.from_pretrained(
    "BiliSakura/PixelFlow-diffusers",
    subfolder="PixelFlow-256",
    custom_pipeline="pipeline.py",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

image = pipe(
    class_labels="golden retriever",
    num_inference_steps=[10, 10, 10, 10],
).images[0]
```
For text-to-image, set `subfolder="PixelFlow-T2I"` and call the pipeline with `prompt=...` instead of `class_labels=...`.
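Because the two variants differ only in the conditioning argument and the resolution, the call arguments can be built from a small table. This is a sketch under that assumption, reusing the values from the examples in this README. `VARIANTS` and `call_kwargs` are hypothetical names, not part of the pipelines.

```python
# Per-variant conditioning argument and resolution, taken from the examples above.
VARIANTS = {
    "PixelFlow-256": {"input": "class_labels", "height": 256, "width": 256},
    "PixelFlow-T2I": {"input": "prompt", "height": 1024, "width": 1024},
}


def call_kwargs(variant, text, steps=(10, 10, 10, 10), guidance_scale=4.0):
    """Build keyword arguments for pipe(...) for the given variant."""
    spec = VARIANTS[variant]
    return {
        spec["input"]: text,  # class label string or free-form prompt
        "height": spec["height"],
        "width": spec["width"],
        "num_inference_steps": list(steps),
        "guidance_scale": guidance_scale,
    }
```

With a loaded pipeline, `pipe(**call_kwargs("PixelFlow-T2I", "A golden retriever playing in a sunny garden")).images[0]` would mirror the text-to-image example.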
## Conversion
```bash
# Class-to-image checkpoint
python scripts/convert_pixelflow_to_diffusers.py \
  --checkpoint models/raw/PixelFlow/c2i/model.pt \
  --config models/raw/PixelFlow/c2i/config.yaml \
  --output models/BiliSakura/PixelFlow-diffusers/PixelFlow-256

# Text-to-image checkpoint
python scripts/convert_pixelflow_to_diffusers.py \
  --checkpoint models/raw/PixelFlow/t2i/model.pt \
  --config models/raw/PixelFlow/t2i/config.yaml \
  --output models/BiliSakura/PixelFlow-diffusers/PixelFlow-T2I \
  --skip-text-encoder
```
## Citation
```bibtex
@article{chen2025pixelflow,
title={PixelFlow: Pixel-Space Flow Matching for High-Resolution Image Synthesis},
author={Chen, Shoufa and others},
year={2025},
eprint={2504.07963},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```