Re-host: PiD sdxl 2kto4k decoder (NVIDIA License / non-commercial) — converted from nvidia/PiD (sc-7852)
PiD: Pixel Diffusion Decoder (sdxl 2kto4k student)
==================================================

This repository redistributes a converted copy of a model checkpoint
originally produced and released by NVIDIA Corporation.

Original work
-------------
Name:    PiD (Pixel Diffusion), PixelDiT distillation decoders
Author:  NVIDIA Corporation and its affiliates (NVIDIA Toronto AI Lab)
Source:  https://huggingface.co/nvidia/PiD
Project: https://research.nvidia.com/labs/sil/projects/pid/
Code:    https://github.com/nv-tlabs/pid
Paper:   arXiv:2605.23902
Original checkpoint:
  checkpoints/PiD_res2kto4k_sr4x_official_sdxl_distill_4step/model_ema_bf16.pth

Latent space
------------
This is the `sdxl` PiD student. It operates in the SDXL VAE latent space
(4 channels; affine scale 0.13025, shift 0.0). In SceneWorks it serves every
model that uses that latent space: SDXL base, RealVisXL (including RealVisXL
Lightning), and Kolors (which reuses the SDXL VAE). This is the
variance-preserving (VP-frame) student, but the shipped clean (sigma=0) decode
path is frame-agnostic.
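
The affine constants above can be read as a per-element mapping between raw
VAE latents and the normalized latents a decoder consumes. The sketch below
assumes the common convention `normalized = (raw - shift) * scale`; this notice
does not spell out the convention, and the function names are hypothetical
(they are not part of `mlx-gen-pid`).

```python
# Constants taken from the notice; the mapping convention itself is an assumption.
SDXL_LATENT_SCALE = 0.13025
SDXL_LATENT_SHIFT = 0.0


def normalize_latent(raw, scale=SDXL_LATENT_SCALE, shift=SDXL_LATENT_SHIFT):
    """Map raw VAE latent values into the normalized space (assumed convention)."""
    return [(value - shift) * scale for value in raw]


def denormalize_latent(normalized, scale=SDXL_LATENT_SCALE, shift=SDXL_LATENT_SHIFT):
    """Invert normalize_latent: recover raw VAE latent values."""
    return [value / scale + shift for value in normalized]


# One illustrative 4-channel latent vector (made-up values).
raw = [1.0, -2.0, 0.5, 8.0]
normalized = normalize_latent(raw)
restored = denormalize_latent(normalized)
```

With a shift of 0.0 the mapping reduces to a plain multiplication by the scale,
so the round trip recovers the raw values up to floating-point error.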

What was changed
----------------
The original PyTorch checkpoint (`model_ema_bf16.pth`) was converted to
safetensors for SceneWorks' native MLX/candle PiD decoder (`mlx-gen-pid`). The
conversion is a key/format transform only and is lossless for the tensors it
keeps: training-only tensors (`net_ema.*`, `fake_score.*`, `discriminator.*`)
are dropped and the `net.` prefix is stripped from the remaining keys, per the
reference inference loader
(`pid_distill_model.py::PidDistillModel.load_state_dict`). Tensor values and
dtype (bfloat16) are unchanged. No re-training or fine-tuning was performed.
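
The key transform described above can be sketched as a pure function over a
state-dict-like mapping. This is a minimal illustration, not the actual
conversion script: the function name, the collision check, and the example key
names are assumptions, and values are passed through untouched so that dtype
and contents cannot change.

```python
# Prefixes named in the notice above.
DROPPED_PREFIXES = ("net_ema.", "fake_score.", "discriminator.")
STRIPPED_PREFIX = "net."


def convert_state_dict(state):
    """Drop training-only keys and strip the `net.` prefix; values are untouched."""
    converted = {}
    for key, value in state.items():
        if key.startswith(DROPPED_PREFIXES):
            continue  # training-only tensor, not needed for inference
        if key.startswith(STRIPPED_PREFIX):
            new_key = key[len(STRIPPED_PREFIX):]
        else:
            new_key = key  # assumption: keys without the prefix are kept as-is
        if new_key in converted:
            raise ValueError(f"key collision after stripping prefix: {new_key}")
        converted[new_key] = value
    return converted


# Hypothetical key names; strings stand in for tensors.
example = {
    "net.blocks.0.weight": "w0",
    "net_ema.blocks.0.weight": "w0_ema",
    "fake_score.head.bias": "b",
    "discriminator.conv.weight": "d",
}
converted = convert_state_dict(example)
```

With real tensors, the input mapping would typically come from `torch.load`
with `map_location="cpu"` and the result would be written with
`safetensors.torch.save_file`. Whether the checkpoint nests its weights under
a top-level key is not stated here and would need to be checked first.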

License
-------
This work and the original are licensed under the NVIDIA License (the license
Hugging Face tags as "NSCLv1"). The full license text is in the accompanying
LICENSE file and applies to this redistribution and to any derivative works.

USE LIMITATION (NVIDIA License §3.3): The Work and any derivative works thereof
may only be used, or be intended for use, NON-COMMERCIALLY, i.e. for RESEARCH
OR EVALUATION PURPOSES ONLY. (NVIDIA Corporation and its affiliates may use the
Work commercially.)

This non-commercial restriction FLOWS TO THE OUTPUT: images decoded with this
PiD decoder are for research/evaluation use only and are distinct in that
respect from images produced by the rest of the SceneWorks pipeline.

Per NVIDIA License §3.1, this distribution (a) is under the same license,
(b) includes a complete copy of it, and (c) retains all copyright/patent/
trademark/attribution notices present in the original Work. NVIDIA's name,
logos, and trademarks are not used except as necessary to reproduce these
notices (NVIDIA License §3.5).