---
license: mit
library_name: diffusers
pipeline_tag: text-to-image
tags:
- diffusers
- image-generation
- class-conditional
- imagenet
- pixelflow
- flow-matching
widget:
- text: golden retriever
  output:
    url: PixelFlow-256/demo.png
language:
- en
---

# BiliSakura/PixelFlow-diffusers

Self-contained PixelFlow checkpoints for Hugging Face diffusers. Each variant folder ships its own `pipeline.py`, component modules, and weights.

## Available checkpoints

| Subfolder | Pipeline | Task | Resolution | Params |
| --- | --- | --- | ---: | ---: |
| [`PixelFlow-256/`](PixelFlow-256/) | `PixelFlowPipeline` | class-to-image | 256×256 | 677M |
| [`PixelFlow-T2I/`](PixelFlow-T2I/) | `PixelFlowT2IPipeline` | text-to-image | 1024×1024 | 882M |

## Repo layout

```text
BiliSakura/PixelFlow-diffusers/
├── README.md
├── PixelFlow-256/
│   ├── pipeline.py
│   ├── model_index.json
│   ├── scheduler/scheduler_config.json
│   └── transformer/
└── PixelFlow-T2I/
    ├── pipeline.py
    ├── model_index.json
    ├── scheduler/scheduler_config.json
    ├── text_encoder/
    ├── tokenizer/
    └── transformer/
```

Each variant is self-contained. The `scheduler/` folder contains `scheduler_config.json` and `scheduling_pixelflow.py` with [`PixelFlowScheduler`](PixelFlow-256/scheduler/scheduling_pixelflow.py). No shared helper modules at inference time; only PyPI `diffusers` plus the local variant directory.

## ImageNet class labels

For class-conditional [`PixelFlow-256/`](PixelFlow-256/), `id2label` is embedded in `PixelFlow-256/model_index.json` (DiT-style).

- `pipe.id2label`: inspect the id → English label correspondence
- `pipe.labels`: reverse map (English synonym → id)
- `pipe.get_label_ids("golden retriever")`: look up the id(s) for a label
- `pipe(class_labels="golden retriever", ...)`: string labels are resolved automatically

## Demo

![PixelFlow-256 demo](PixelFlow-256/demo.png)

Class 207 (golden retriever), 256×256, 40 steps (`[10, 10, 10, 10]`).

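
The label helpers listed under "ImageNet class labels" can be pictured with a small standalone sketch. It assumes the DiT-style convention in which each `id2label` value is a comma-separated list of English synonyms. The three-entry `ID2LABEL` excerpt and the functions `build_label_map` and `get_label_ids` are hypothetical illustrations, not the code shipped in `PixelFlow-256/pipeline.py`, and the actual return type of `pipe.get_label_ids` is not stated in this README.

```python
from typing import Dict, List, Union

# Illustrative excerpt only: id -> comma-separated English synonyms.
# The real map lives in PixelFlow-256/model_index.json.
ID2LABEL: Dict[int, str] = {
    1: "goldfish, Carassius auratus",
    207: "golden retriever",
    281: "tabby, tabby cat",
}


def build_label_map(id2label: Dict[int, str]) -> Dict[str, int]:
    """Build the reverse map (synonym -> id) by splitting on commas."""
    labels: Dict[str, int] = {}
    for class_id, names in id2label.items():
        for name in names.split(","):
            labels[name.strip()] = int(class_id)
    # Sorting by label makes the map easier to browse.
    return dict(sorted(labels.items()))


def get_label_ids(labels: Dict[str, int], label: Union[str, List[str]]) -> List[int]:
    """Resolve one label or a list of labels to class ids (hypothetical helper)."""
    names = [label] if isinstance(label, str) else list(label)
    missing = [name for name in names if name not in labels]
    if missing:
        raise ValueError(f"unknown label(s): {missing}")
    return [labels[name] for name in names]


LABELS = build_label_map(ID2LABEL)
print(get_label_ids(LABELS, "golden retriever"))        # [207]
print(get_label_ids(LABELS, ["tabby cat", "goldfish"]))  # [281, 1]
```

Splitting on commas lets any listed synonym resolve to the same class id, which is why a plain string such as `"golden retriever"` can stand in for class 207.
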

Class-to-image:

```bash
python demo_inference_c2i.py
```

Text-to-image:

```bash
python demo_inference_t2i.py
```

## Load from a local clone

### Class-to-image (`PixelFlow-256`)

```python
from pathlib import Path

import torch
from diffusers import DiffusionPipeline

# Absolute path to the variant folder inside the local clone.
model_dir = Path("./PixelFlow-256").resolve()

pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    local_files_only=True,  # load from disk only, no Hub download
    custom_pipeline=str(model_dir / "pipeline.py"),  # the variant's own pipeline code
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

# Inspect the label mapping in both directions.
print(pipe.id2label[207])
print(pipe.get_label_ids("golden retriever"))

# Fixed seed for reproducible output.
generator = torch.Generator(device="cuda").manual_seed(42)
image = pipe(
    class_labels="golden retriever",
    height=256,
    width=256,
    num_inference_steps=[10, 10, 10, 10],  # 40 steps in total
    guidance_scale=4.0,
    generator=generator,
).images[0]
image.save("demo.png")
```

### Text-to-image (`PixelFlow-T2I`)

Uses [`google/flan-t5-xl`](https://huggingface.co/google/flan-t5-xl) when `text_encoder/` is not bundled.

```python
from pathlib import Path

import torch
from diffusers import DiffusionPipeline

# Absolute path to the variant folder inside the local clone.
model_dir = Path("./PixelFlow-T2I").resolve()

pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    local_files_only=True,  # load from disk only, no Hub download
    custom_pipeline=str(model_dir / "pipeline.py"),  # the variant's own pipeline code
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

# Fixed seed for reproducible output.
generator = torch.Generator(device="cuda").manual_seed(42)
image = pipe(
    prompt="A golden retriever playing in a sunny garden",
    height=1024,
    width=1024,
    num_inference_steps=[10, 10, 10, 10],  # 40 steps in total
    guidance_scale=4.0,
    generator=generator,
).images[0]
image.save("demo.png")
```
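
Both loading examples pass `num_inference_steps` as a list, and the demo caption counts `[10, 10, 10, 10]` as 40 steps. As a rough sketch of that bookkeeping, the hypothetical helper below (`summarize_step_schedule` is not part of this repo or of diffusers) validates such a list and reports the total. It assumes each entry is the step count of one consecutive stage, which the caption suggests but this README does not spell out.

```python
from itertools import accumulate
from typing import Dict, List, Sequence, Union


def summarize_step_schedule(
    steps_per_stage: Sequence[int],
) -> Dict[str, Union[int, List[int]]]:
    """Hypothetical helper: validate a step list and summarize it."""
    if not steps_per_stage:
        raise ValueError("expected at least one entry")
    if any(not isinstance(n, int) or n <= 0 for n in steps_per_stage):
        raise ValueError("each entry must be a positive integer")
    # Running totals give the global step index at which each entry ends.
    stage_ends = list(accumulate(steps_per_stage))
    return {
        "num_stages": len(steps_per_stage),
        "total_steps": stage_ends[-1],
        "stage_ends": stage_ends,
    }


summary = summarize_step_schedule([10, 10, 10, 10])
print(summary)
# {'num_stages': 4, 'total_steps': 40, 'stage_ends': [10, 20, 30, 40]}
```

A check like this can run before calling the pipeline, so a mistyped list fails early instead of partway through generation.
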