---
license: apache-2.0
language:
- en
library_name: diffusers
tags:
- diffusers
- image-generation
- class-conditional
- nit
pipeline_tag: unconditional-image-generation
widget:
- output:
    url: demo_images/demo_sde250_class207_seed42.png
---
# NiT-XL Diffusers (Class-Conditional)

Native-resolution Image Transformer (NiT-XL) checkpoint packaged as a Diffusers-style repository with vendored custom code.

## What is included

- `transformer/`: `NiTTransformer2DModel` weights + config
- `scheduler/`: `NiTFlowMatchScheduler` config
- `vae/`: `AutoencoderDC` weights + config
- `custom_pipeline/`: local, self-contained implementation of:
  - `NiTPipeline`
  - `NiTTransformer2DModel`
  - `NiTFlowMatchScheduler`
- `test_inference.py`: standalone sampling script

This repository does **not** depend on an external `NiT-diffusers` checkout during inference. It includes a root `pipeline.py` custom entrypoint for Diffusers dynamic loading.
## Quickstart

### 1) Environment

Install dependencies (example):

```bash
pip install torch diffusers safetensors
```

If you are using this project's environment:

```bash
conda activate rsgen
```
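
Optionally, run a quick sanity check that the key packages import and a GPU (if any) is visible. This is a minimal sketch and nothing in it is specific to NiT:

```python
# Quick environment check: package versions and device availability.
import torch
import diffusers

print("torch:", torch.__version__)
print("diffusers:", diffusers.__version__)
print("cuda available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("bf16 supported:", torch.cuda.is_bf16_supported())
```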
### 2) Generate a demo image

Run from the repository root:

```bash
python test_inference.py \
  --class-label 207 \
  --height 512 \
  --width 512 \
  --steps 250 \
  --mode sde \
  --guidance-scale 2.05 \
  --guidance-low 0.0 \
  --guidance-high 0.7 \
  --seed 42 \
  --output demo_images/demo_sde250_class207_seed42.png
```
## Python usage

```python
from pathlib import Path

import torch
from diffusers import DiffusionPipeline

model_dir = Path(".").resolve()
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.bfloat16 if device == "cuda" and torch.cuda.is_bf16_supported() else torch.float32

pipe = DiffusionPipeline.from_pretrained(
    model_dir,
    custom_pipeline=str(model_dir / "pipeline.py"),
    local_files_only=True,
).to(device)

if device == "cuda":
    pipe.transformer.to(dtype=dtype)
    pipe.vae.to(dtype=dtype)

gen = torch.Generator(device=device).manual_seed(42)
result = pipe(
    class_labels=[207],
    height=512,
    width=512,
    num_inference_steps=250,
    mode="sde",
    guidance_scale=2.05,
    guidance_interval=(0.0, 0.7),
    generator=gen,
)
result.images[0].save("demo_images/sample.png")
```
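
Because NiT targets native-resolution synthesis, the same call can be pointed at other sizes. A hedged sketch, reusing `pipe` and `device` from the block above; it assumes the checkpoint supports the requested resolution:

```python
# Sketch: non-square sampling with the already-loaded pipeline.
# Assumes the checkpoint supports this resolution; adjust as needed.
gen = torch.Generator(device=device).manual_seed(42)
result = pipe(
    class_labels=[207],
    height=384,
    width=640,
    num_inference_steps=250,
    mode="sde",
    guidance_scale=2.05,
    guidance_interval=(0.0, 0.7),
    generator=gen,
)
result.images[0].save("demo_images/sample_384x640.png")
```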
For remote Hub loading:

```python
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "BiliSakura/NiT-XL-diffusers",
    custom_pipeline="pipeline",
)
```
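
Depending on your Diffusers version, executing custom pipeline code downloaded from the Hub may require an explicit opt-in. A hedged sketch; the `trust_remote_code` flag is an assumption here, so check your installed version's documentation:

```python
import torch
from diffusers import DiffusionPipeline

# Newer Diffusers releases gate remote code execution; trust_remote_code=True
# is an assumption here -- verify against your installed version.
pipe = DiffusionPipeline.from_pretrained(
    "BiliSakura/NiT-XL-diffusers",
    custom_pipeline="pipeline",
    trust_remote_code=True,
).to("cuda" if torch.cuda.is_available() else "cpu")
```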
## Recommended inference settings

- Resolution: `512x512`
- Mode: `sde`
- Steps: `250`
- Guidance scale: `2.05`
- Guidance interval: `(0.0, 0.7)`

Very low step counts (for example `2`) are useful only as a smoke test and will produce low-quality images.
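
If you just want to verify the wiring end to end before a full 250-step run, a low-step pass is enough. A minimal sketch reusing `pipe` and `device` from the Python usage section; expect noise-like output:

```python
# Smoke test only: 2 steps verify the pipeline runs, not image quality.
smoke = pipe(
    class_labels=[207],
    height=512,
    width=512,
    num_inference_steps=2,
    mode="sde",
    generator=torch.Generator(device=device).manual_seed(0),
)
smoke.images[0].save("demo_images/smoke.png")
```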
## Demo

![Demo sample: class 207, SDE, 250 steps](demo_images/demo_sde250_class207_seed42.png)
## Citation

If you use this model or the NiT method in your work, please cite:

```bibtex
@article{wang2025native,
  title={Native-Resolution Image Synthesis},
  author={Wang, Zidong and Bai, Lei and Yue, Xiangyu and Ouyang, Wanli and Zhang, Yiyuan},
  year={2025},
  eprint={2506.03131},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```
## Notes

- This is a class-conditional generator (ImageNet label ids), not a text-to-image model; see the label-lookup sketch below.
- For reproducibility, set `--seed`.
- The vendored custom pipeline keeps inference behavior consistent without external code dependencies.
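
To translate a human-readable class name into a label id, one option is to read the standard ImageNet-1k class names from torchvision's weight metadata. A hedged sketch: it assumes this checkpoint follows the standard ImageNet-1k index order, and torchvision is not otherwise required by this repository:

```python
# Look up ImageNet-1k label ids by name via torchvision weight metadata.
# Assumes the checkpoint uses the standard ImageNet-1k index order.
from torchvision.models import ResNet50_Weights

categories = ResNet50_Weights.IMAGENET1K_V2.meta["categories"]
print(categories[207])  # expected: "golden retriever"
print([i for i, name in enumerate(categories) if "retriever" in name])
```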