---
license: apache-2.0
library_name: diffusers
pipeline_tag: unconditional-image-generation
tags:
- diffusers
- nit
- image-generation
- class-conditional
- imagenet
inference: true
---
# NiT-L
Self-contained Diffusers checkpoint for **NiT-L** (457M parameters), converted from [`GoodEnough/NiT-L-Models`](https://huggingface.co/GoodEnough/NiT-L-Models) (`model_500K.safetensors`, 500K training steps).
Architecture and training settings follow the official [`nit_l_pack_merge_radio_16384.yaml`](https://github.com/WZDTHU/NiT/blob/main/configs/c2i/nit_l_pack_merge_radio_16384.yaml).
## Model config
| Field | Value |
| --- | --- |
| Parameters | 457M |
| Depth | 24 |
| Hidden size | 1024 |
| Attention heads | 16 |
| Encoder depth | 6 |
| Latent channels (`z_dim`) | 1280 |
| Patch size | 1 |
| Input latent channels | 32 |
| Classes | 1000 |
| Class dropout | 0.1 |
| QK norm | true |
| VAE | `mit-han-lab/dc-ae-f32c32-sana-1.1-diffusers` |
| Flow path type | linear |
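
As a rough sanity check on the table, the latent grid and token count for a given image size can be worked out from the VAE compression factor and the patch size. The sketch below assumes that `f32` in the VAE name `dc-ae-f32c32` denotes a 32× spatial downsampling factor (with `c32` matching the 32 input latent channels listed above). The function name `latent_token_count` is illustrative and not part of this checkpoint.

```python
def latent_token_count(height, width, vae_downsample=32, patch_size=1):
    """Return (latent_h, latent_w, tokens) for an image of the given pixel size.

    Assumption: the VAE downsamples each spatial side by `vae_downsample`
    (32 for an f32 autoencoder) and the transformer patchifies the latent
    grid with square patches of side `patch_size` (1 in the table above).
    """
    stride = vae_downsample * patch_size
    if height % stride or width % stride:
        raise ValueError(f"height and width must be multiples of {stride}")
    latent_h = height // vae_downsample
    latent_w = width // vae_downsample
    tokens = (latent_h // patch_size) * (latent_w // patch_size)
    return latent_h, latent_w, tokens


print(latent_token_count(512, 512))  # (16, 16, 256)
```

Under these assumptions, a 512×512 image corresponds to a 16×16 latent grid with 32 channels, which is 256 tokens at patch size 1.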
## Recommended inference (512×512)
Official NiT sampling defaults for **512×512** class-conditional ImageNet generation:
| Setting | Value |
| --- | --- |
| Resolution | 512×512 |
| Solver | SDE (Euler–Maruyama) in the official repo |
| Steps (NFE) | 250 |
| CFG scale | 2.05 |
| CFG interval | (0.0, 0.7) |
Instead of the official SDE solver, this Diffusers port uses [`FlowMatchEulerDiscreteScheduler`](https://huggingface.co/docs/diffusers/main/en/api/schedulers/flow_match_euler_discrete) in deterministic ODE mode (`stochastic_sampling=false`). Keep the same step count, CFG scale, and CFG interval as the official recipe.
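
To illustrate what a CFG interval means, here is a minimal sketch of interval-limited classifier-free guidance. It assumes the interval is expressed in normalized time `t` in [0, 1], that both endpoints are inclusive, and that outside the interval the conditional prediction is used unchanged. The actual convention, including which end of the trajectory `t = 0` refers to, is defined in `pipeline.py` and may differ. The function name `guided_prediction` is hypothetical, and plain Python lists stand in for tensors.

```python
def guided_prediction(cond, uncond, t, scale=2.05, interval=(0.0, 0.7)):
    """Combine conditional and unconditional predictions at normalized time t.

    Assumption: guidance is applied only while low <= t <= high; outside
    the interval the conditional prediction is returned as-is.
    """
    low, high = interval
    if low <= t <= high:
        # Standard CFG: move from the unconditional prediction towards
        # (and past) the conditional one by a factor of `scale`.
        return [u + scale * (c - u) for c, u in zip(cond, uncond)]
    return list(cond)


inside = guided_prediction([1.0], [0.0], t=0.5)   # guidance active
outside = guided_prediction([1.0], [0.0], t=0.9)  # guidance skipped
print(inside, outside)
```

With the recommended settings, guidance at scale 2.05 would be active for the part of the trajectory that falls in (0.0, 0.7) and disabled for the rest.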
## Usage
```python
from pathlib import Path

import torch
from diffusers import DiffusionPipeline

# Directory containing this checkpoint (run from the repository root).
model_dir = Path(".")

# Load the custom NiT pipeline shipped alongside the weights.
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    local_files_only=True,
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to("cuda" if torch.cuda.is_available() else "cpu")

# Fixed seed for reproducible sampling.
generator = torch.Generator(device=pipe.device).manual_seed(42)

# Recommended 512x512 settings from the table above.
image = pipe(
    class_labels="golden retriever",
    height=512,
    width=512,
    num_inference_steps=250,
    guidance_scale=2.05,
    guidance_interval=(0.0, 0.7),
    generator=generator,
).images[0]

image.save("demo_512.png")
```
## Components
- `pipeline.py` — custom `NiTPipeline`
- `model_index.json` — pipeline index + ImageNet `id2label`
- `transformer/config.json`
- `transformer/nit_transformer_2d.py`
- `transformer/diffusion_pytorch_model.safetensors`
- `scheduler/scheduler_config.json`
- `vae/config.json`
- `vae/diffusion_pytorch_model.safetensors`
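
Before loading with `local_files_only=True`, it can help to confirm that a local copy contains every file listed above. The following is a small sketch using only the standard library. The helper name `missing_components` and the `EXPECTED_FILES` constant are illustrative and not part of this repository.

```python
from pathlib import Path

# Relative paths taken from the component list above.
EXPECTED_FILES = [
    "pipeline.py",
    "model_index.json",
    "transformer/config.json",
    "transformer/nit_transformer_2d.py",
    "transformer/diffusion_pytorch_model.safetensors",
    "scheduler/scheduler_config.json",
    "vae/config.json",
    "vae/diffusion_pytorch_model.safetensors",
]


def missing_components(model_dir):
    """Return the expected files that are absent under `model_dir`."""
    root = Path(model_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


# Example: check the current directory; an empty list means nothing is missing.
print(missing_components("."))
```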