# BiliSakura/PixelFlow-diffusers

Self-contained PixelFlow checkpoints for Hugging Face diffusers. Each variant folder ships its own `pipeline.py`, component modules, and weights.

## Available checkpoints

| Subfolder | Pipeline | Task | Resolution | Params |
| --- | --- | --- | ---: | ---: |
| [`PixelFlow-256/`](PixelFlow-256/) | `PixelFlowPipeline` | class-to-image | 256×256 | 677M |
| [`PixelFlow-T2I/`](PixelFlow-T2I/) | `PixelFlowT2IPipeline` | text-to-image | 1024×1024 | 882M |

## Repo layout

```text
BiliSakura/PixelFlow-diffusers/
├── README.md
├── PixelFlow-256/
│   ├── pipeline.py
│   ├── model_index.json
│   ├── scheduler/scheduler_config.json
│   └── transformer/
└── PixelFlow-T2I/
    ├── pipeline.py
    ├── model_index.json
    ├── scheduler/scheduler_config.json
    ├── text_encoder/
    ├── tokenizer/
    └── transformer/
```

Each variant is self-contained. The `scheduler/` folder contains `scheduler_config.json` and `scheduling_pixelflow.py` with [`PixelFlowScheduler`](PixelFlow-256/scheduler/scheduling_pixelflow.py). No shared helper modules at inference time; only PyPI `diffusers` plus the local variant directory.

## ImageNet class labels

For class-conditional [`PixelFlow-256/`](PixelFlow-256/), `id2label` is embedded in `PixelFlow-256/model_index.json` (DiT-style).

- `pipe.id2label`: inspect id → English label correspondence
- `pipe.labels`: reverse map (English synonym → id)
- `pipe.get_label_ids("golden retriever")`
- `pipe(class_labels="golden retriever", ...)`: string labels resolved automatically

## Demo

Class-to-image:

```bash
python demo_inference_c2i.py
```

Text-to-image:

```bash
python demo_inference_t2i.py
```

## Load from a local clone

### Class-to-image (`PixelFlow-256`)

```python
from pathlib import Path

import torch
from diffusers import DiffusionPipeline

model_dir = Path("./PixelFlow-256").resolve()
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    local_files_only=True,
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

print(pipe.id2label[207])
print(pipe.get_label_ids("golden retriever"))

generator = torch.Generator(device="cuda").manual_seed(42)
image = pipe(
    class_labels="golden retriever",
    height=256,
    width=256,
    num_inference_steps=[10, 10, 10, 10],
    guidance_scale=4.0,
    generator=generator,
).images[0]
image.save("demo.png")
```

### Text-to-image (`PixelFlow-T2I`)

Uses [`google/flan-t5-xl`](https://huggingface.co/google/flan-t5-xl) when `text_encoder/` is not bundled.

```python
from pathlib import Path

import torch
from diffusers import DiffusionPipeline

model_dir = Path("./PixelFlow-T2I").resolve()
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    local_files_only=True,
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

generator = torch.Generator(device="cuda").manual_seed(42)
image = pipe(
    prompt="A golden retriever playing in a sunny garden",
    height=1024,
    width=1024,
    num_inference_steps=[10, 10, 10, 10],
    guidance_scale=4.0,
    generator=generator,
).images[0]
image.save("demo.png")
```

Load a **variant subfolder** (e.g. `./PixelFlow-256`), not the repo root.

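
Because loading expects a variant subfolder rather than the repo root, a small pre-flight check can catch a wrong path before `from_pretrained` is called. The sketch below is illustrative only: `REQUIRED_FILES`, `is_variant_dir`, and `find_variant_dirs` are hypothetical names, not part of this repo or of diffusers, and it assumes a variant folder can be recognized by the presence of `pipeline.py` and `model_index.json`, as shown in the repo layout.

```python
from pathlib import Path

# Files that each variant folder is shown to contain in the repo layout.
# Assumption: their presence is enough to tell a variant folder from the repo root.
REQUIRED_FILES = ("pipeline.py", "model_index.json")


def is_variant_dir(path: Path) -> bool:
    """Return True if `path` is a directory containing every required file."""
    return path.is_dir() and all((path / name).is_file() for name in REQUIRED_FILES)


def find_variant_dirs(repo_root: Path) -> list[Path]:
    """List immediate subfolders of `repo_root` that look like variant folders."""
    if not repo_root.is_dir():
        return []
    return sorted(p for p in repo_root.iterdir() if is_variant_dir(p))


if __name__ == "__main__":
    # Print the names of candidate variant folders in the current directory.
    for variant in find_variant_dirs(Path(".")):
        print(variant.name)
```

Any path returned by `find_variant_dirs` could then be used as `model_dir` in the loading snippets for a local clone.
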
## Load from the Hub

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "BiliSakura/PixelFlow-diffusers",
    subfolder="PixelFlow-256",
    custom_pipeline="pipeline.py",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

image = pipe(class_labels="golden retriever", num_inference_steps=[10, 10, 10, 10]).images[0]
```

Swap `subfolder="PixelFlow-T2I"` and call with `prompt=...` for text-to-image.

## Conversion

```bash
python scripts/convert_pixelflow_to_diffusers.py \
  --checkpoint models/raw/PixelFlow/c2i/model.pt \
  --config models/raw/PixelFlow/c2i/config.yaml \
  --output models/BiliSakura/PixelFlow-diffusers/PixelFlow-256

python scripts/convert_pixelflow_to_diffusers.py \
  --checkpoint models/raw/PixelFlow/t2i/model.pt \
  --config models/raw/PixelFlow/t2i/config.yaml \
  --output models/BiliSakura/PixelFlow-diffusers/PixelFlow-T2I \
  --skip-text-encoder
```

## Citation

```bibtex
@article{chen2025pixelflow,
  title={PixelFlow: Pixel-Space Flow Matching for High-Resolution Image Synthesis},
  author={Chen, Shoufa and others},
  year={2025},
  eprint={2504.07963},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```