---
license: other
license_name: nvidia-license
license_link: LICENSE
tags:
- pixel-diffusion
- decoder
- super-resolution
- non-commercial
- sceneworks
extra_gated_prompt: >-
  This model is licensed under the NVIDIA License (NSCLv1). It and any
  derivative works may be used for NON-COMMERCIAL (research or evaluation)
  purposes only. The non-commercial restriction flows to decoded output.
---

# PiD: sdxl 2kto4k decoder (SceneWorks redistribution)

A format-converted redistribution of NVIDIA's **PiD** (Pixel Diffusion) `sdxl`
4-step distilled decoder, packaged for SceneWorks' native (no-PyTorch) PiD
runtime (`mlx-gen-pid` on macOS/MLX, `candle-gen-pid` on Windows and Linux with CUDA).

PiD is an **optional, per-generation replacement for the VAE decoder**: it denoises
directly in pixel space and **decodes and 4× super-resolves in one 4-step pass**. This
`sdxl` student serves the **SDXL VAE latent space** (4-channel, the largest
latent→pixel ratio). That covers SDXL base, RealVisXL (incl. RealVisXL Lightning),
and Kolors, all of which share the SDXL VAE.
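
To make the "decode and 4× super-resolve" claim concrete, here is a minimal sketch of the expected output size for a given latent. It assumes the 4× factor is applied on top of the SDXL VAE's usual 8× latent-to-pixel upsampling; the card does not state this explicitly, and the function name `pid_output_size` is purely illustrative.

```python
# Assumption (not stated in the model card): the 4x super-resolution factor
# multiplies the SDXL VAE's native 8x latent-to-pixel upsampling.
VAE_SCALE = 8  # spatial downsampling factor of the SDXL VAE
SR_SCALE = 4   # PiD super-resolution factor, as given in the card


def pid_output_size(latent_h: int, latent_w: int) -> tuple[int, int]:
    """Return the (height, width) in pixels expected for a latent of this size."""
    if latent_h <= 0 or latent_w <= 0:
        raise ValueError("latent dimensions must be positive")
    scale = VAE_SCALE * SR_SCALE
    return latent_h * scale, latent_w * scale


# A 128x128 latent corresponds to a 1024x1024 image under the plain VAE decoder.
print(pid_output_size(128, 128))  # (4096, 4096)
```

Under that assumption, a latent generated for a 1024-pixel SDXL image would come out at 4096 pixels per side.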

## ⚠️ License: non-commercial (research/evaluation) only

This checkpoint is under the **NVIDIA License** (Hugging Face tag `NSCLv1`); see
[`LICENSE`](LICENSE) and [`NOTICE`](NOTICE). Per **§3.3**, the Work and any
derivative works may be used **non-commercially, for research or evaluation
purposes only** (NVIDIA and its affiliates excepted).

**This restriction flows to the output.** Images decoded with PiD are for
research/evaluation use only, which sets them apart from output of the rest of
the SceneWorks pipeline.

## Contents

| File | Description |
|------|-------------|
| `pid_sdxl_2kto4k.safetensors` | sdxl `2kto4k` 4-step student backbone + sigma-aware LQ adapter (456 tensors, bf16, ~2.7 GB). |
| `LICENSE` | Full NVIDIA License text (verbatim from the original release). |
| `NOTICE` | NVIDIA attribution, provenance, and the exact conversion performed. |

The PiD caption encoder (`gemma-2-2b-it`) and the SDXL VAE are provisioned
separately by SceneWorks; this repo holds only the PiD student weights.
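
The tensor count and dtype listed in the table can be checked without PyTorch, because a safetensors file starts with an 8-byte little-endian header length followed by a JSON header. The sketch below reads only that header using the standard library; the helper names `read_safetensors_header` and `summarize` are illustrative and not part of any SceneWorks tooling.

```python
import json
import struct
from collections import Counter


def read_safetensors_header(path):
    """Parse the JSON header of a safetensors file without loading tensor data."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned little-endian 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return header


def summarize(path):
    """Return (tensor_count, {dtype: count}) for a safetensors file."""
    header = read_safetensors_header(path)
    dtypes = Counter(entry["dtype"] for entry in header.values())
    return len(header), dict(dtypes)


# Usage, with the file from this repo downloaded locally:
# count, dtypes = summarize("pid_sdxl_2kto4k.safetensors")
```

If the file matches the table above, `summarize` should report 456 tensors, all with dtype `BF16`.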

## Provenance / conversion

Converted from NVIDIA's original
`checkpoints/PiD_res2kto4k_sr4x_official_sdxl_distill_4step/model_ema_bf16.pth`
(from [`nvidia/PiD`](https://huggingface.co/nvidia/PiD)) via a lossless
key/format transform: the `net.` prefix is stripped, the training-only
`net_ema.*`, `fake_score.*`, and `discriminator.*` entries are dropped, and the
dtype is preserved. No retraining was performed.
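
The key transform described above can be sketched as a small pure-Python function over a state-dict-like mapping. This is an illustration of the stated rules only, not the actual conversion script (see `NOTICE` for the exact procedure); `convert_keys` and the example key names are hypothetical.

```python
DROP_PREFIXES = ("net_ema.", "fake_score.", "discriminator.")  # training-only entries
STRIP_PREFIX = "net."


def convert_keys(state_dict):
    """Drop training-only entries and strip the leading `net.` from remaining keys."""
    converted = {}
    for key, tensor in state_dict.items():
        if key.startswith(DROP_PREFIXES):
            continue
        new_key = key[len(STRIP_PREFIX):] if key.startswith(STRIP_PREFIX) else key
        if new_key in converted:
            raise KeyError(f"duplicate key after conversion: {new_key}")
        # Values are passed through untouched, so dtype is preserved.
        converted[new_key] = tensor
    return converted


# Hypothetical key names, with strings standing in for tensors.
example = {
    "net.blocks.0.weight": "w0",
    "net_ema.blocks.0.weight": "w0_ema",
    "fake_score.head.bias": "b",
    "discriminator.conv.weight": "d",
}
print(convert_keys(example))  # {'blocks.0.weight': 'w0'}
```

Because `net_ema.` does not begin with `net.` (underscore versus dot), the drop rule and the strip rule never apply to the same key.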

- Project: https://research.nvidia.com/labs/sil/projects/pid/
- Code: https://github.com/nv-tlabs/pid
- Paper: arXiv:2605.23902