---
license: apache-2.0
library_name: diffusers
pipeline_tag: unconditional-image-generation
tags:
  - diffusers
  - nit
  - image-generation
  - class-conditional
  - imagenet
inference: true
---

# NiT-B

Self-contained Diffusers checkpoint for **NiT-B** (131M), converted from [`GoodEnough/NiT-B-Models`](https://huggingface.co/GoodEnough/NiT-B-Models) (`model_500K.safetensors`, 500K training steps).

Architecture and training settings follow the official [`nit_b_pack_merge_radio_65536.yaml`](https://github.com/WZDTHU/NiT/blob/main/configs/c2i/nit_b_pack_merge_radio_65536.yaml).

## Model config

| Field | Value |
| --- | --- |
| Parameters | 131M |
| Depth | 12 |
| Hidden size | 768 |
| Attention heads | 12 |
| Encoder depth | 4 |
| Latent channels (`z_dim`) | 1280 |
| Patch size | 1 |
| Input latent channels | 32 |
| Classes | 1000 |
| Class dropout | 0.1 |
| QK norm | true |
| VAE | `mit-han-lab/dc-ae-f32c32-sana-1.1-diffusers` |
| Flow path type | linear |
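
For orientation, the sequence length the transformer sees at a given resolution can be estimated from the table. This is a minimal sketch, assuming that the `f32` in the VAE name denotes a 32× spatial downsampling factor and that each latent patch becomes one token; `latent_token_count` is an illustrative helper, not part of this repository.

```python
def latent_token_count(height: int, width: int,
                       vae_downsample: int = 32, patch_size: int = 1) -> int:
    """Estimate the number of transformer tokens for an image of the given pixel size.

    Assumes the VAE shrinks each spatial side by `vae_downsample` and that the
    transformer groups the latent grid into `patch_size` x `patch_size` patches.
    """
    if height % vae_downsample or width % vae_downsample:
        raise ValueError("height and width must be multiples of the VAE downsampling factor")
    lat_h, lat_w = height // vae_downsample, width // vae_downsample
    if lat_h % patch_size or lat_w % patch_size:
        raise ValueError("latent grid must be divisible by the patch size")
    return (lat_h // patch_size) * (lat_w // patch_size)


# 256 / 32 = 8 per side, patch size 1, so 8 * 8 tokens.
print(latent_token_count(256, 256))  # 64
```

Under these assumptions a 256×256 image corresponds to an 8×8 latent grid with 32 channels per position.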

## Recommended inference (256×256)

Official NiT sampling defaults for **256×256** class-conditional ImageNet generation:

| Setting | Value |
| --- | --- |
| Resolution | 256×256 |
| Solver | SDE (Euler–Maruyama) in the official repo |
| Steps (NFE) | 250 |
| CFG scale | 2.25 |
| CFG interval | (0.0, 0.7) |
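
The CFG interval restricts guidance to part of the sampling trajectory. The sketch below illustrates the general idea with plain Python lists. It is not taken from `pipeline.py`: the function name, the inclusive bounds, and the convention for which end of the time axis is noise are all assumptions.

```python
def apply_interval_cfg(cond, uncond, t, scale=2.25, interval=(0.0, 0.7)):
    """Combine conditional and unconditional predictions with interval-limited CFG.

    cond, uncond: equally sized lists of floats (stand-ins for model outputs).
    t: current time on the same axis as `interval` (assumed convention).
    """
    lo, hi = interval
    if lo <= t <= hi:
        # Standard classifier-free guidance: extrapolate away from the
        # unconditional prediction toward the conditional one.
        return [u + scale * (c - u) for c, u in zip(cond, uncond)]
    # Outside the interval, fall back to the plain conditional prediction.
    return list(cond)


print(apply_interval_cfg([1.0], [0.0], t=0.5))  # [2.25]
print(apply_interval_cfg([1.0], [0.0], t=0.9))  # [1.0]
```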

This Diffusers port uses [`FlowMatchEulerDiscreteScheduler`](https://huggingface.co/docs/diffusers/main/en/api/schedulers/flow_match_euler_discrete) in deterministic ODE mode (`stochastic_sampling=false`). Keep the same step count, CFG scale, and interval as the official recipe.
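
To make the deterministic ODE mode concrete, here is a minimal sketch of fixed-step Euler integration of a velocity field. It is an illustration only: the real scheduler steps along its own sigma schedule, and the uniform time grid, the direction of integration, and the names `euler_ode_sample` and `velocity_fn` are assumptions.

```python
def euler_ode_sample(velocity_fn, x, num_steps=250, t_start=0.0, t_end=1.0):
    """Integrate dx/dt = velocity_fn(x, t) from t_start to t_end with Euler steps.

    velocity_fn: callable taking (x, t) and returning a list the same length as x.
    x: starting point as a list of floats (a stand-in for the noisy latent).
    """
    dt = (t_end - t_start) / num_steps
    t = t_start
    for _ in range(num_steps):
        v = velocity_fn(x, t)
        # Deterministic update: no noise is injected, unlike an SDE sampler.
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        t += dt
    return x


# With a linear path the ideal velocity is constant (target minus start),
# so Euler integration lands on the target up to floating-point error.
start, target = [0.0, 0.0], [2.0, -1.0]
constant_velocity = [b - a for a, b in zip(start, target)]
result = euler_ode_sample(lambda x, t: constant_velocity, start)
print([round(v, 6) for v in result])  # [2.0, -1.0]
```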

## Usage

```python
from pathlib import Path
import torch
from diffusers import DiffusionPipeline

# Local directory holding this checkpoint; adjust if you run from elsewhere.
model_dir = Path(".")
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    local_files_only=True,
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to("cuda" if torch.cuda.is_available() else "cpu")

# Seeded generator for reproducible sampling.
generator = torch.Generator(device=pipe.device).manual_seed(42)
# Sampling settings follow the recommended 256×256 recipe.
image = pipe(
    class_labels="golden retriever",
    height=256,
    width=256,
    num_inference_steps=250,
    guidance_scale=2.25,
    guidance_interval=(0.0, 0.7),
    generator=generator,
).images[0]
image.save("demo_256.png")
```

## Components

- `pipeline.py` — custom `NiTPipeline`
- `model_index.json` — pipeline index + ImageNet `id2label`
- `transformer/config.json`
- `transformer/nit_transformer_2d.py`
- `transformer/diffusion_pytorch_model.safetensors`
- `scheduler/scheduler_config.json`
- `vae/config.json`
- `vae/diffusion_pytorch_model.safetensors`
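
Before loading, it can help to confirm that a local copy contains every file listed above. The following is a small sketch using only the standard library; `missing_components` is an illustrative helper, and the assumption is that the checkpoint sits in the directory you pass in.

```python
from pathlib import Path

# File list copied from the Components section.
EXPECTED_FILES = [
    "pipeline.py",
    "model_index.json",
    "transformer/config.json",
    "transformer/nit_transformer_2d.py",
    "transformer/diffusion_pytorch_model.safetensors",
    "scheduler/scheduler_config.json",
    "vae/config.json",
    "vae/diffusion_pytorch_model.safetensors",
]


def missing_components(model_dir):
    """Return the expected files that are absent under model_dir."""
    root = Path(model_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


# Check the current directory; an empty list means nothing is missing.
print(missing_components("."))
```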