PiD: Pixel Diffusion Decoder (sdxl 2kto4k student)
==================================================

This repository redistributes a converted copy of a model checkpoint
originally produced and released by NVIDIA Corporation.

Original work
-------------
Name:    PiD (Pixel Diffusion), PixelDiT distillation decoders
Author:  NVIDIA Corporation and its affiliates (NVIDIA Toronto AI Lab)
Source:  https://huggingface.co/nvidia/PiD
Project: https://research.nvidia.com/labs/sil/projects/pid/
Code:    https://github.com/nv-tlabs/pid
Paper:   arXiv:2605.23902
Original checkpoint:
  checkpoints/PiD_res2kto4k_sr4x_official_sdxl_distill_4step/model_ema_bf16.pth

Latent space
------------
This is the `sdxl` PiD student: the SDXL VAE latent space (4-channel,
affine scale 0.13025 / shift 0.0). In SceneWorks it serves every model in
that latent space: SDXL base, RealVisXL (incl. RealVisXL Lightning), and
Kolors (which reuses the SDXL VAE). This is the variance-preserving
(VP-frame) student; the shipped clean (sigma=0) decode path is
frame-agnostic.

What was changed
----------------
The original PyTorch checkpoint (`model_ema_bf16.pth`) was converted to
safetensors for SceneWorks' native MLX/candle PiD decoder (`mlx-gen-pid`).
The conversion is a lossless key/format transform only: training-only
tensors (`net_ema.*`, `fake_score.*`, `discriminator.*`) are dropped and
the `net.` prefix is stripped, per the reference inference loader
(`pid_distill_model.py::PidDistillModel.load_state_dict`). Tensor values
and dtype (bfloat16) are unchanged. No re-training or fine-tuning was
performed.

License
-------
This work and the original are licensed under the NVIDIA License (the
license HuggingFace tags as "NSCLv1"). The full license text is in the
accompanying LICENSE file and applies to this redistribution and to any
derivative works.

USE LIMITATION (NVIDIA License §3.3): The Work and any derivative works
thereof may only be used, or be intended for use, NON-COMMERCIALLY, i.e.
for RESEARCH OR EVALUATION PURPOSES ONLY. (NVIDIA Corporation and its
affiliates may use the Work commercially.) This non-commercial restriction
FLOWS TO THE OUTPUT: images decoded with this PiD decoder are for
research/evaluation use only and are distinct in that respect from images
produced by the rest of the SceneWorks pipeline.

Per NVIDIA License §3.1, this distribution (a) is under the same license,
(b) includes a complete copy of it, and (c) retains all
copyright/patent/trademark/attribution notices present in the original
Work. NVIDIA's name, logos, and trademarks are not used except as necessary
to reproduce these notices (NVIDIA License §3.5).
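
Conversion sketch
-----------------
The key transform described under "What was changed" can be illustrated
roughly as follows. This is a minimal sketch, not the script that produced
this file: the function name `convert_keys`, the example key names, and the
choice to pass through keys that carry no known prefix are all assumptions
made for illustration. Loading the `.pth` file and writing safetensors are
left out; the function only shows which keys are dropped and how the rest
are renamed, with values passed through untouched.

```python
# Hypothetical illustration of the key/format transform; names are assumed.
TRAINING_ONLY_PREFIXES = ("net_ema.", "fake_score.", "discriminator.")
INFERENCE_PREFIX = "net."


def convert_keys(state_dict):
    """Drop training-only entries and strip the `net.` prefix from the rest.

    Values are never modified, so dtype and contents stay as they were.
    """
    converted = {}
    for key, tensor in state_dict.items():
        # Training-only tensors are not needed for inference.
        if key.startswith(TRAINING_ONLY_PREFIXES):
            continue
        # Strip the leading `net.` so keys match the inference module.
        if key.startswith(INFERENCE_PREFIX):
            key = key[len(INFERENCE_PREFIX):]
        # Assumption: keys without a known prefix are kept unchanged.
        converted[key] = tensor
    return converted


if __name__ == "__main__":
    # Placeholder strings stand in for real bfloat16 tensors.
    example = {
        "net.blocks.0.weight": "tensor-a",
        "net_ema.blocks.0.weight": "tensor-b",
        "fake_score.head.weight": "tensor-c",
        "discriminator.head.weight": "tensor-d",
    }
    print(sorted(convert_keys(example)))
```

Because `net_ema.` does not begin with `net.` (the character after `net`
is `_`, not `.`), checking the drop list first and stripping afterwards
keeps the two rules from interfering with each other.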