	---
	license: mit
	library_name: diffusers
	pipeline_tag: unconditional-image-generation
	tags:
	- diffusers
	- jit
	- image-generation
	- class-conditional
	widget:
	- output:
	    url: demo.png
	language:
	- en
	---

	# JiT-diffusers

	Native diffusers implementation of JiT (Just image Transformer). Each variant folder is self-contained:

	- `pipeline.py` — `JiTPipeline`
	- `scheduler/scheduler_config.json` — `FlowMatchHeunDiscreteScheduler` config (default `shift=4.0`)
	- `transformer/jit_transformer_2d.py` — `JiTTransformer2DModel`

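	`JiTPipeline.__call__` supports dynamic inference resolution: when `height` or `width` differs from the checkpoint's native resolution, the positional embeddings are interpolated to the requested size.

	There is no separate `jit_diffusers` package. The only dependency is `diffusers` from PyPI, plus the local custom code in each variant directory.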
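Because each variant folder has to be self-contained, a quick check before loading can catch a wrong path such as the repo root. The sketch below is not part of this repository: `missing_variant_files` is a hypothetical helper, and the file list is taken from the layout above plus `model_index.json`, which the labels section mentions.

```python
from pathlib import Path

# Files a variant folder is expected to contain, per the layout described above.
# model_index.json is included because the labels section refers to it.
EXPECTED_FILES = (
    "model_index.json",
    "pipeline.py",
    "scheduler/scheduler_config.json",
    "transformer/jit_transformer_2d.py",
)


def missing_variant_files(variant_dir):
    """Return the expected files that are absent from variant_dir (hypothetical helper)."""
    root = Path(variant_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty list from `missing_variant_files("./JiT-H-32")` suggests the folder is complete; a non-empty list names what is missing.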

	## Available checkpoints

	| Checkpoint | Path | Resolution | Recommended CFG |
	| --- | --- | --- | --- |
	| JiT-B/16 | `./JiT-B-16` | 256×256 | 3.0 |
	| JiT-L/16 | `./JiT-L-16` | 256×256 | 2.4 |
	| JiT-H/16 | `./JiT-H-16` | 256×256 | 2.2 |
	| JiT-B/32 | `./JiT-B-32` | 512×512 | 3.0 |
	| JiT-L/32 | `./JiT-L-32` | 512×512 | 2.5 |
	| JiT-H/32 | `./JiT-H-32` | 512×512 | 2.3 |
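For scripting across variants, the table can be mirrored as a small lookup. This is only a convenience sketch: `CHECKPOINTS` and `recommended_settings` are hypothetical names, not something the repository provides, and the values are copied from the table above.

```python
# Values copied from the table above: (path, native resolution, recommended CFG).
CHECKPOINTS = {
    "JiT-B/16": ("./JiT-B-16", 256, 3.0),
    "JiT-L/16": ("./JiT-L-16", 256, 2.4),
    "JiT-H/16": ("./JiT-H-16", 256, 2.2),
    "JiT-B/32": ("./JiT-B-32", 512, 3.0),
    "JiT-L/32": ("./JiT-L-32", 512, 2.5),
    "JiT-H/32": ("./JiT-H-32", 512, 2.3),
}


def recommended_settings(name):
    """Return the path and suggested call settings for a checkpoint (hypothetical helper)."""
    path, resolution, cfg = CHECKPOINTS[name]
    return {
        "model_dir": path,
        "height": resolution,
        "width": resolution,
        "guidance_scale": cfg,
    }
```

`recommended_settings("JiT-H/32")` gives the path to load and the `height`, `width`, and `guidance_scale` values to pass to the pipeline call.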

	## ImageNet class labels

	Each variant keeps an English `id2label` map directly in its own `model_index.json` (DiT-style).

	- `pipe.id2label` — inspect id → English label correspondence
	- `pipe.labels` — reverse map (English synonym → id), sorted for browsing
	- `pipe.get_label_ids("golden retriever")`
	- `pipe(class_labels="golden retriever", ...)` — string labels resolved automatically

	Chinese labels are preserved in the main source repo under `src/labels/id2label_cn.json` for reference.
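To illustrate how a string label could be resolved to a class id, here is a minimal sketch of a reverse lookup. It is not the pipeline's actual code: it assumes DiT-style `id2label` values in which synonyms are comma-separated, and the function names and list return type are assumptions made for the example.

```python
def build_label_index(id2label):
    """Map each lowercase synonym to its class id, sorted by synonym.

    Assumes DiT-style values where synonyms are comma-separated,
    e.g. "golden retriever" or "tench, Tinca tinca".
    """
    index = {}
    for class_id, names in id2label.items():
        for name in names.split(","):
            # setdefault keeps the first id if two classes share a synonym.
            index.setdefault(name.strip().lower(), int(class_id))
    return dict(sorted(index.items()))


def get_label_ids(index, labels):
    """Resolve one label or a list of labels to class ids; ints pass through."""
    if isinstance(labels, (str, int)):
        labels = [labels]
    ids = []
    for label in labels:
        if isinstance(label, int):
            ids.append(label)
            continue
        key = label.strip().lower()
        if key not in index:
            raise ValueError(f"unknown label: {label!r}")
        ids.append(index[key])
    return ids


# Toy excerpt of the ImageNet-1k label map, for illustration only.
toy = {0: "tench, Tinca tinca", 207: "golden retriever"}
index = build_label_index(toy)
```

With this toy map, `get_label_ids(index, "golden retriever")` resolves to class 207, and mixed input such as `["Tinca tinca", 207]` is accepted as well.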

	## Inference

	Run the bundled demo script from the repo root:

	```bash
	python demo_inference.py
	```

	This writes `demo.png` using `JiT-H-32` with the settings below.

	```python
	from pathlib import Path
	from diffusers import DiffusionPipeline, FlowMatchHeunDiscreteScheduler
	import torch

	# Load from a variant subfolder, which ships its own pipeline.py
	model_dir = Path("./JiT-H-32")
	pipe = DiffusionPipeline.from_pretrained(
	    str(model_dir),
	    custom_pipeline=str(model_dir / "pipeline.py"),
	    trust_remote_code=True,
	)
	# Rebuild the scheduler from its config; shift=4.0 matches the shipped default
	pipe.scheduler = FlowMatchHeunDiscreteScheduler.from_config(pipe.scheduler.config, shift=4.0)
	pipe.to("cuda")

	# Labels can be looked up by id or by English name
	print(pipe.id2label[207])
	print(pipe.get_label_ids("golden retriever"))

	generator = torch.Generator(device="cuda").manual_seed(42)
	image = pipe(
	    class_labels="golden retriever",
	    num_inference_steps=50,
	    guidance_scale=2.3,  # recommended CFG for JiT-H/32
	    generator=generator,
	).images[0]
	image.save("demo.png")
	```

	`height` and `width` default to the checkpoint's native resolution when omitted.
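When choosing a non-native `height` and `width`, it can help to see how the token grid changes, since that is what the positional interpolation has to cover. The sketch below assumes the `/16` and `/32` suffixes in the checkpoint names are patch sizes (the usual ViT naming convention) and that both dimensions must be multiples of the patch size; neither assumption is confirmed by this README, and `token_grid` is a hypothetical helper.

```python
def token_grid(height, width, patch_size):
    """Return (rows, cols, num_tokens) for an image split into square patches."""
    if height % patch_size or width % patch_size:
        raise ValueError("height and width must be multiples of patch_size")
    rows, cols = height // patch_size, width // patch_size
    return rows, cols, rows * cols


native = token_grid(512, 512, 32)  # JiT-H/32 at its native resolution
larger = token_grid(768, 768, 32)  # a non-native request
```

Under these assumptions, the native grid is 16×16 tokens and a 768×768 request uses a 24×24 grid, so a call such as `pipe(class_labels="golden retriever", height=768, width=768, ...)` would rely on embeddings interpolated between those two grids.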

	Load a variant subfolder (e.g. `./JiT-H-32`), not the repo root.