	---
	license: other
	license_name: nvidia-license
	license_link: LICENSE
	tags:
	- pixel-diffusion
	- decoder
	- super-resolution
	- non-commercial
	- sceneworks
	extra_gated_prompt: >-
	  This model is licensed under the NVIDIA License (NSCLv1). It and any
	  derivative works may be used for NON-COMMERCIAL (research or evaluation)
	  purposes only. The non-commercial restriction flows to decoded output.
	---

	# PiD — sdxl 2kto4k decoder (SceneWorks redistribution)

	This repo is a format-converted redistribution of NVIDIA's PiD (Pixel Diffusion)
	`sdxl` 4-step distillation decoder, packaged for SceneWorks' native (no-PyTorch)
	PiD decoders: `mlx-gen-pid` on macOS/MLX and `candle-gen-pid` on Windows and
	Linux/CUDA.

	PiD is an optional, per-generation replacement for the VAE decoder: it denoises
	directly in pixel space, decoding and 4× super-resolving in a single 4-step pass.
	This `sdxl` student serves the SDXL VAE latent space (4-channel, the largest
	latent→pixel ratio), which covers SDXL base, RealVisXL (incl. RealVisXL
	Lightning), and Kolors, all of which share the SDXL VAE.

	## ⚠️ License — non-commercial (research/evaluation) only

	This checkpoint is under the NVIDIA License (HuggingFace tag `NSCLv1`); see
	[`LICENSE`](LICENSE) and [`NOTICE`](NOTICE). Per §3.3, the Work and any
	derivative works may be used **non-commercially — for research or evaluation
	purposes only** (NVIDIA and its affiliates excepted).

	This restriction flows to the output: images decoded with PiD are for
	research/evaluation use only, which sets them apart from images produced by the
	rest of the SceneWorks pipeline.

	## Contents

	| File | Description |
	|------|-------------|
	| `pid_sdxl_2kto4k.safetensors` | sdxl `2kto4k` 4-step student backbone + sigma-aware LQ adapter (456 tensors, bf16, ~2.7 GB). |
	| `LICENSE` | Full NVIDIA License text (verbatim from the original release). |
	| `NOTICE` | NVIDIA attribution, provenance, and the exact conversion performed. |

	The PiD caption encoder (`gemma-2-2b-it`) and the SDXL VAE are provisioned
	separately by SceneWorks; this repo holds only the PiD student weights.
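
The tensor count and dtype listed in the table above can be checked without loading any weights, because a `.safetensors` file starts with an 8-byte little-endian header length followed by a JSON header describing every tensor. The following is a minimal sketch using only the Python standard library; the helper names `read_safetensors_header` and `summarize` are illustrative and not part of any SceneWorks or NVIDIA tooling.

```python
import json
import struct
from collections import Counter


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))


def summarize(path):
    """Return (tensor_count, {dtype: count}) for a .safetensors file."""
    header = read_safetensors_header(path)
    # "__metadata__" is an optional free-form entry, not a tensor.
    tensors = {k: v for k, v in header.items() if k != "__metadata__"}
    dtypes = Counter(info["dtype"] for info in tensors.values())
    return len(tensors), dict(dtypes)
```

Calling `summarize("pid_sdxl_2kto4k.safetensors")` on a local copy should, if the table is accurate, report 456 tensors, all with the safetensors dtype string `BF16`.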

	## Provenance / conversion

	Converted from NVIDIA's original
	`checkpoints/PiD_res2kto4k_sr4x_official_sdxl_distill_4step/model_ema_bf16.pth`
	(from [`nvidia/PiD`](https://huggingface.co/nvidia/PiD)) via a lossless
	key/format transform: the `net.` prefix is stripped, the training-only
	`net_ema.`/`fake_score.`/`discriminator.*` entries are dropped, and dtype is
	preserved. No re-training was performed.
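
The key transform described above can be sketched as a pure dictionary operation. This is an illustration, not the actual conversion script (see `NOTICE` for the exact conversion performed): the function name `convert_state_dict` is hypothetical, and it assumes the listed names are simple key prefixes and that keys matching none of them are kept unchanged.

```python
# Prefixes named in the conversion note; matching them with str.startswith
# is an assumption about how the checkpoint keys are laid out.
DROP_PREFIXES = ("net_ema.", "fake_score.", "discriminator.")
STRIP_PREFIX = "net."


def convert_state_dict(state_dict):
    """Drop training-only entries and strip the `net.` prefix from the rest.

    Values are passed through untouched, so their dtype is preserved.
    """
    converted = {}
    for key, tensor in state_dict.items():
        if key.startswith(DROP_PREFIXES):
            continue  # training-only weights are not needed for decoding
        if key.startswith(STRIP_PREFIX):
            key = key[len(STRIP_PREFIX):]
        converted[key] = tensor
    return converted
```

Because only keys change and tensors are never cast or modified, a transform of this shape is lossless for the weights that remain.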

	- Project: https://research.nvidia.com/labs/sil/projects/pid/
	- Code: https://github.com/nv-tlabs/pid
	- Paper: arXiv:2605.23902