	# BiliSakura/PixelFlow-diffusers

	Self-contained PixelFlow checkpoints for Hugging Face diffusers. Each variant folder ships its own `pipeline.py`, component modules, and weights.

	## Available checkpoints

	| Subfolder | Pipeline | Task | Resolution | Params |
	| --- | --- | --- | ---: | ---: |
	| [`PixelFlow-256/`](PixelFlow-256/) | `PixelFlowPipeline` | class-to-image | 256×256 | 677M |
	| [`PixelFlow-T2I/`](PixelFlow-T2I/) | `PixelFlowT2IPipeline` | text-to-image | 1024×1024 | 882M |

	## Repo layout

	```text
	BiliSakura/PixelFlow-diffusers/
	├── README.md
	├── PixelFlow-256/
	│   ├── pipeline.py
	│   ├── model_index.json
	│   ├── scheduler/scheduler_config.json
	│   └── transformer/
	└── PixelFlow-T2I/
	    ├── pipeline.py
	    ├── model_index.json
	    ├── scheduler/scheduler_config.json
	    ├── text_encoder/
	    ├── tokenizer/
	    └── transformer/
	```

	Each variant is self-contained. The `scheduler/` folder contains `scheduler_config.json` and `scheduling_pixelflow.py` with [`PixelFlowScheduler`](PixelFlow-256/scheduler/scheduling_pixelflow.py).

	Inference needs no shared helper modules: only `diffusers` from PyPI plus the local variant directory.

	## ImageNet class labels

	For class-conditional [`PixelFlow-256/`](PixelFlow-256/), `id2label` is embedded in `PixelFlow-256/model_index.json` (DiT-style).

	- `pipe.id2label`: maps each class id to its English label
	- `pipe.labels`: reverse map (English synonym → id)
	- `pipe.get_label_ids("golden retriever")`: looks up the class id(s) for a label
	- `pipe(class_labels="golden retriever", ...)`: string labels are resolved automatically
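
The lookups above can be illustrated with a minimal sketch of a DiT-style label map. It assumes `id2label` maps each id to a comma-separated string of English synonyms, as in the diffusers `DiTPipeline`. The two-entry dictionary is sample data, and the standalone helpers `build_label_map` and `get_label_ids` are illustrative stand-ins; the bundled `pipeline.py` may differ in detail.

```python
from typing import Dict, List, Union

# Two sample entries in the assumed DiT-style format (id -> comma-separated synonyms).
# The real map in model_index.json is expected to cover every ImageNet class.
id2label: Dict[int, str] = {
    0: "tench, Tinca tinca",
    207: "golden retriever",
}


def build_label_map(id2label: Dict[int, str]) -> Dict[str, int]:
    """Invert id -> 'syn1, syn2' into synonym -> id."""
    labels: Dict[str, int] = {}
    for class_id, names in id2label.items():
        for name in names.split(","):
            labels[name.strip()] = int(class_id)
    return dict(sorted(labels.items()))


def get_label_ids(labels: Dict[str, int], label: Union[str, List[str]]) -> List[int]:
    """Resolve one label, or a list of labels, to class ids."""
    if isinstance(label, str):
        label = [label]
    unknown = [name for name in label if name not in labels]
    if unknown:
        raise ValueError(f"Unknown label(s): {unknown}")
    return [labels[name] for name in label]


labels = build_label_map(id2label)
print(get_label_ids(labels, "golden retriever"))  # [207]
```

Splitting on commas is what lets any synonym, not only the first name of a class, resolve to the same id.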

	## Demo

	Class-to-image:

	```bash
	python demo_inference_c2i.py
	```

	Text-to-image:

	```bash
	python demo_inference_t2i.py
	```

	## Load from a local clone

	### Class-to-image (`PixelFlow-256`)

	```python
	from pathlib import Path
	import torch
	from diffusers import DiffusionPipeline

	# Point at the variant subfolder, not the repo root.
	model_dir = Path("./PixelFlow-256").resolve()
	pipe = DiffusionPipeline.from_pretrained(
	    str(model_dir),
	    local_files_only=True,
	    custom_pipeline=str(model_dir / "pipeline.py"),  # pipeline code shipped with the variant
	    trust_remote_code=True,
	    torch_dtype=torch.bfloat16,
	)
	pipe.to("cuda")

	# Inspect the embedded ImageNet label map.
	print(pipe.id2label[207])
	print(pipe.get_label_ids("golden retriever"))

	# Fixed seed for reproducible sampling.
	generator = torch.Generator(device="cuda").manual_seed(42)
	image = pipe(
	    class_labels="golden retriever",
	    height=256,
	    width=256,
	    num_inference_steps=[10, 10, 10, 10],
	    guidance_scale=4.0,
	    generator=generator,
	).images[0]
	image.save("demo.png")
	```

	### Text-to-image (`PixelFlow-T2I`)

	The pipeline uses [`google/flan-t5-xl`](https://huggingface.co/google/flan-t5-xl) as the text encoder when `text_encoder/` is not bundled in the variant folder.

	```python
	from pathlib import Path
	import torch
	from diffusers import DiffusionPipeline

	# Point at the variant subfolder, not the repo root.
	model_dir = Path("./PixelFlow-T2I").resolve()
	pipe = DiffusionPipeline.from_pretrained(
	    str(model_dir),
	    local_files_only=True,
	    custom_pipeline=str(model_dir / "pipeline.py"),  # pipeline code shipped with the variant
	    trust_remote_code=True,
	    torch_dtype=torch.bfloat16,
	)
	pipe.to("cuda")

	# Fixed seed for reproducible sampling.
	generator = torch.Generator(device="cuda").manual_seed(42)
	image = pipe(
	    prompt="A golden retriever playing in a sunny garden",
	    height=1024,
	    width=1024,
	    num_inference_steps=[10, 10, 10, 10],
	    guidance_scale=4.0,
	    generator=generator,
	).images[0]
	image.save("demo.png")
	```

	Load a variant subfolder (e.g. `./PixelFlow-256`), not the repo root.
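
One way to catch that mistake early is to check the directory before handing it to `from_pretrained`. The sketch below assumes, based on the repo layout, that a variant folder can be recognized by its top-level `pipeline.py` and `model_index.json`. `resolve_variant_dir` is a hypothetical helper, not part of this repo or of diffusers.

```python
from pathlib import Path

# Files that, per the repo layout, sit at the top of every variant folder.
REQUIRED_FILES = ("pipeline.py", "model_index.json")


def resolve_variant_dir(path: str) -> Path:
    """Return the absolute variant path, or raise if it does not look like one."""
    model_dir = Path(path).resolve()
    missing = [name for name in REQUIRED_FILES if not (model_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(
            f"{model_dir} is missing {missing}; pass a variant subfolder "
            "such as ./PixelFlow-256, not the repo root."
        )
    return model_dir
```

With this in place, `model_dir = resolve_variant_dir("./PixelFlow-256")` can replace the `Path(...).resolve()` line in the examples above, and the repo root fails with a readable message instead of a loader error.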

	## Load from the Hub

	```python
	import torch
	from diffusers import DiffusionPipeline

	pipe = DiffusionPipeline.from_pretrained(
	    "BiliSakura/PixelFlow-diffusers",
	    subfolder="PixelFlow-256",
	    custom_pipeline="pipeline.py",
	    trust_remote_code=True,
	    torch_dtype=torch.bfloat16,
	)
	pipe.to("cuda")

	image = pipe(class_labels="golden retriever", num_inference_steps=[10, 10, 10, 10]).images[0]
	```

	Swap `subfolder="PixelFlow-T2I"` and call with `prompt=...` for text-to-image.
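
The swap can be written out as a small lookup table. This sketch only assembles the keyword arguments already used in the examples above; `VARIANTS` and `hub_kwargs` are hypothetical names, and the final `from_pretrained` call is left commented out so the snippet runs without downloading weights.

```python
from typing import Any, Dict, Tuple

REPO_ID = "BiliSakura/PixelFlow-diffusers"

# Variant subfolder -> name of the conditioning argument its pipeline is called with.
VARIANTS: Dict[str, str] = {
    "PixelFlow-256": "class_labels",
    "PixelFlow-T2I": "prompt",
}


def hub_kwargs(subfolder: str, condition: str) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (load kwargs, call kwargs) for one variant."""
    if subfolder not in VARIANTS:
        raise ValueError(f"Unknown variant {subfolder!r}; expected one of {sorted(VARIANTS)}")
    load = {
        "subfolder": subfolder,
        "custom_pipeline": "pipeline.py",
        "trust_remote_code": True,
    }
    call = {VARIANTS[subfolder]: condition, "num_inference_steps": [10, 10, 10, 10]}
    return load, call


load, call = hub_kwargs("PixelFlow-T2I", "A golden retriever playing in a sunny garden")
# pipe = DiffusionPipeline.from_pretrained(REPO_ID, torch_dtype=torch.bfloat16, **load)
# image = pipe(**call).images[0]
```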

	## Conversion

	```bash
	python scripts/convert_pixelflow_to_diffusers.py \
	    --checkpoint models/raw/PixelFlow/c2i/model.pt \
	    --config models/raw/PixelFlow/c2i/config.yaml \
	    --output models/BiliSakura/PixelFlow-diffusers/PixelFlow-256

	python scripts/convert_pixelflow_to_diffusers.py \
	    --checkpoint models/raw/PixelFlow/t2i/model.pt \
	    --config models/raw/PixelFlow/t2i/config.yaml \
	    --output models/BiliSakura/PixelFlow-diffusers/PixelFlow-T2I \
	    --skip-text-encoder
	```

	## Citation

	```bibtex
	@article{chen2025pixelflow,
	  title={PixelFlow: Pixel-Space Flow Matching for High-Resolution Image Synthesis},
	  author={Chen, Shoufa and others},
	  year={2025},
	  eprint={2504.07963},
	  archivePrefix={arXiv},
	  primaryClass={cs.CV}
	}
	```