---
license: mit
library_name: diffusers
pipeline_tag: text-to-image
tags:
- diffusers
- fd-loss
- jit
- imf
- pmf
- image-generation
- class-conditional
- imagenet
inference: true
widget:
- output:
    url: JiT-H-16-SIM/demo.png
language:
- en
---
# FD-Loss-diffusers
Post-training checkpoints with the same JiT / iMF / pMF architectures as the base models, distilled with FD-loss (feature distillation).
- **JiT**: all `/16` variants for 256px generation
- **iMF**: all variants for 256px generation
- **pMF**: both `/16` and `/32`
- **SIM** suffix: SigLIP + Inception + MAE FD-loss
- Default inference: **1 NFE** (`num_inference_steps=1`)

FD-Loss JiT uses the **legacy time convention** and **velocity Euler** sampling via the bundled `FDLossFlowMatchScheduler` (`scheduler/scheduling_flow_match_fd.py`), with timesteps running `t=1→0`.
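
To make "velocity Euler" with timesteps running `t=1→0` concrete, here is a minimal sketch in plain Python. It is not the code of `FDLossFlowMatchScheduler`: the function `euler_sample`, the toy velocity field, and the assumption that the velocity points from data towards noise along a straight path are all illustrative, and the bundled scheduler may differ in sign conventions and time spacing.

```python
def euler_sample(velocity_fn, x_start, num_steps=1):
    """Integrate dx/dt = v(x, t) from t=1 down to t=0 with explicit Euler steps."""
    # Uniform descending time grid with num_steps + 1 points: 1.0, ..., 0.0.
    times = [1.0 - i / num_steps for i in range(num_steps + 1)]
    x = list(x_start)
    for t_cur, t_next in zip(times[:-1], times[1:]):
        v = velocity_fn(x, t_cur)
        dt = t_next - t_cur  # negative, because time runs backwards
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Toy setup (assumption): on the straight path x_t = (1 - t) * data + t * noise
# the velocity is the constant (noise - data), so one Euler step starting from
# the noise sample lands exactly on the data point.
data = [0.5, -1.0]
noise = [2.0, 3.0]


def toy_velocity(x, t):
    return [n - d for n, d in zip(noise, data)]


sample = euler_sample(toy_velocity, noise, num_steps=1)
print(sample)
```

With `num_steps=1` the loop runs once, which corresponds to the 1 NFE default: a single velocity evaluation at `t=1` followed by one update to `t=0`.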
## Demo
![JiT-H-16-SIM demo](JiT-H-16-SIM/demo.png)
Class-conditional sample (ImageNet class **golden retriever**), `JiT-H/16` FD-SIM at 256×256, 1 NFE, CFG 2.2, interval [0.1, 1.0], seed 42.
## Inference (`JiT-B-16-SIM`)
```python
from pathlib import Path

import torch
from diffusers import DiffusionPipeline

model_dir = Path("./JiT-B-16-SIM")
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
)
pipe.to("cuda")

generator = torch.Generator(device="cuda").manual_seed(42)
image = pipe(
    class_labels="golden retriever",
    num_inference_steps=1,  # 1 NFE (FD-Loss default)
    guidance_scale=3.0,
    guidance_interval_min=0.1,
    guidance_interval_max=1.0,
    generator=generator,
).images[0]
```
| Parameter | JiT-B-16-SIM default | Source |
| --- | --- | --- |
| `num_inference_steps` | `1` | `--num_sampling_steps 1` |
| `guidance_scale` | `3.0` | JiT_B eval preset |
| `guidance_interval_min` / `max` | `0.1` / `1.0` | JiT_B eval preset |
| `legacy_time_convention` | `True` (pipeline default) | `--legacy_time_convention` |
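
The `guidance_scale` and `guidance_interval_min` / `max` parameters suggest interval-limited classifier-free guidance. The sketch below shows the usual form of that idea, not the pipeline's actual implementation: the function name `apply_interval_cfg`, the inclusive interval bounds, and the exact mixing formula are assumptions.

```python
def apply_interval_cfg(cond, uncond, t, guidance_scale, interval_min, interval_max):
    """Mix conditional and unconditional predictions only inside a time interval."""
    # Outside the interval, fall back to the plain conditional prediction.
    if not (interval_min <= t <= interval_max):
        return list(cond)
    # Standard CFG extrapolation: uncond + scale * (cond - uncond).
    return [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]


cond = [1.0, 2.0]
uncond = [0.0, 1.0]

# t = 1.0 lies inside [0.1, 1.0], so guidance is applied.
guided = apply_interval_cfg(cond, uncond, 1.0, 3.0, 0.1, 1.0)
# t = 0.05 lies outside the interval, so the conditional prediction is returned.
unguided = apply_interval_cfg(cond, uncond, 0.05, 3.0, 0.1, 1.0)
print(guided, unguided)
```

Under these assumptions, a single step taken at `t=1` with the interval `[0.1, 1.0]` is always guided.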
## Inference (`JiT-L-16-SIM`)
Same loading pattern as JiT-B; use **CFG 2.4** (JiT_L eval preset):
```python
from pathlib import Path

import torch
from diffusers import DiffusionPipeline

model_dir = Path("./JiT-L-16-SIM")
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
).to("cuda")

image = pipe(
    class_labels="golden retriever",
    num_inference_steps=1,
    guidance_scale=2.4,
    generator=torch.Generator(device="cuda").manual_seed(42),
).images[0]
```
## Inference (`JiT-H-16-SIM`)
Same loading pattern as JiT-B/L; use **CFG 2.2** (JiT_H eval preset):
```python
from pathlib import Path

import torch
from diffusers import DiffusionPipeline

model_dir = Path("./JiT-H-16-SIM")
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
).to("cuda")

image = pipe(
    class_labels="golden retriever",
    num_inference_steps=1,
    guidance_scale=2.2,
    guidance_interval_min=0.1,
    guidance_interval_max=1.0,
    generator=torch.Generator(device="cuda").manual_seed(42),
).images[0]
```
## Available variants
| Variant | Path | Architecture | Resolution | CFG (1 NFE) |
| --- | --- | --- | --- | --- |
| JiT-B-16-SIM | `./JiT-B-16-SIM` | JiT-B/16 | 256×256 | 3.0 |
| JiT-L-16-SIM | `./JiT-L-16-SIM` | JiT-L/16 | 256×256 | 2.4 |
| JiT-H-16-SIM | `./JiT-H-16-SIM` | JiT-H/16 | 256×256 | 2.2 |
| iMF-B-SIM | `./iMF-B-SIM` | iMF-B/2 | 256×256 | 8.0 |
| iMF-L-SIM | `./iMF-L-SIM` | iMF-L/2 | 256×256 | 8.0 |
| iMF-XL-SIM | `./iMF-XL-SIM` | iMF-XL/2 | 256×256 | 8.0 |
| pMF-B-16-SIM | `./pMF-B-16-SIM` | pMF-B/16 | 256×256 | 7.5 |
| pMF-B-32-SIM | `./pMF-B-32-SIM` | pMF-B/32 | 512×512 | 6.5 |
| pMF-L-16-SIM | `./pMF-L-16-SIM` | pMF-L/16 | 256×256 | 7.0 |
| pMF-L-32-SIM | `./pMF-L-32-SIM` | pMF-L/32 | 512×512 | 7.5 |
| pMF-H-32-SIM | `./pMF-H-32-SIM` | pMF-H/32 | 512×512 | 5.5 |
## Inference (`iMF-B-SIM`)
Uses production `IMFPipeline` from `iMF-diffusers/iMF-B-2` (native iMF time convention, not legacy JiT time). Use **`torch.float32`** (same as base iMF variants):
```python
from pathlib import Path

import torch
from diffusers import DiffusionPipeline

model_dir = Path("./iMF-B-SIM")
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
    torch_dtype=torch.float32,
).to("cuda")

image = pipe(
    class_labels="golden retriever",
    num_inference_steps=1,
    guidance_scale=8.0,
    guidance_interval_start=0.4,
    guidance_interval_end=0.65,
    generator=torch.Generator(device="cuda").manual_seed(42),
).images[0]
```
## Inference (`iMF-L-SIM`)
Same loading pattern as iMF-B-SIM (production `IMFPipeline` from `iMF-diffusers/iMF-L-2`):
```python
from pathlib import Path

import torch
from diffusers import DiffusionPipeline

model_dir = Path("./iMF-L-SIM")
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
    torch_dtype=torch.float32,
).to("cuda")

image = pipe(
    class_labels="golden retriever",
    num_inference_steps=1,
    guidance_scale=8.0,
    guidance_interval_start=0.4,
    guidance_interval_end=0.65,
    generator=torch.Generator(device="cuda").manual_seed(42),
).images[0]
```
Regenerate from checkpoint:
```bash
python _convert_imf_l_fd_sim.py
```
## Inference (`iMF-XL-SIM`)
Same loading pattern as iMF-B/L-SIM (production `IMFPipeline` from `iMF-diffusers/iMF-XL-2`):
```python
from pathlib import Path

import torch
from diffusers import DiffusionPipeline

model_dir = Path("./iMF-XL-SIM")
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
    torch_dtype=torch.float32,
).to("cuda")

image = pipe(
    class_labels="golden retriever",
    num_inference_steps=1,
    guidance_scale=8.0,
    guidance_interval_start=0.4,
    guidance_interval_end=0.65,
    generator=torch.Generator(device="cuda").manual_seed(42),
).images[0]
```
## Inference (`pMF-B-16-SIM`)
Uses production `PMFPipeline` from `pMF-diffusers/pMF-B-16` (native pMF time convention). Use **`torch.bfloat16`** on CUDA:
```python
from pathlib import Path

import torch
from diffusers import DiffusionPipeline

model_dir = Path("./pMF-B-16-SIM")
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to("cuda")

image = pipe(
    class_labels="golden retriever",
    num_inference_steps=1,
    guidance_scale=7.5,
    guidance_interval_min=0.1,
    guidance_interval_max=0.8,
    noise_scale=1.0,
    generator=torch.Generator(device="cuda").manual_seed(42),
).images[0]
```
## Inference (`pMF-B-32-SIM`)
Same loading pattern as pMF-B-16-SIM (production `PMFPipeline` from `pMF-diffusers/pMF-B-32`):
```python
from pathlib import Path

import torch
from diffusers import DiffusionPipeline

model_dir = Path("./pMF-B-32-SIM")
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to("cuda")

image = pipe(
    class_labels="golden retriever",
    num_inference_steps=1,
    guidance_scale=6.5,
    guidance_interval_min=0.1,
    guidance_interval_max=0.7,
    noise_scale=2.0,
    generator=torch.Generator(device="cuda").manual_seed(42),
).images[0]
```
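
The per-variant snippets above differ only in a handful of call arguments, and the interval keywords are named differently for iMF (`guidance_interval_start` / `end`) than for JiT and pMF (`guidance_interval_min` / `max`). A small lookup table can keep these straight. The sketch below is a convenience only: `CALL_PRESETS` and `call_kwargs` are hypothetical names, not part of the repository, and the values are copied from the snippets in this README. Variants without a snippet are left out rather than guessed.

```python
# Call arguments copied from the per-variant snippets in this README.
# JiT-L-16-SIM's snippet sets no guidance interval, so none is listed here.
CALL_PRESETS = {
    "JiT-B-16-SIM": {"guidance_scale": 3.0, "guidance_interval_min": 0.1, "guidance_interval_max": 1.0},
    "JiT-L-16-SIM": {"guidance_scale": 2.4},
    "JiT-H-16-SIM": {"guidance_scale": 2.2, "guidance_interval_min": 0.1, "guidance_interval_max": 1.0},
    "iMF-B-SIM": {"guidance_scale": 8.0, "guidance_interval_start": 0.4, "guidance_interval_end": 0.65},
    "iMF-L-SIM": {"guidance_scale": 8.0, "guidance_interval_start": 0.4, "guidance_interval_end": 0.65},
    "iMF-XL-SIM": {"guidance_scale": 8.0, "guidance_interval_start": 0.4, "guidance_interval_end": 0.65},
    "pMF-B-16-SIM": {"guidance_scale": 7.5, "guidance_interval_min": 0.1, "guidance_interval_max": 0.8, "noise_scale": 1.0},
    "pMF-B-32-SIM": {"guidance_scale": 6.5, "guidance_interval_min": 0.1, "guidance_interval_max": 0.7, "noise_scale": 2.0},
}


def call_kwargs(variant, num_inference_steps=1):
    """Return the keyword arguments to pass to the pipeline call for a variant."""
    if variant not in CALL_PRESETS:
        raise KeyError(f"no preset recorded for variant {variant!r}")
    kwargs = {"num_inference_steps": num_inference_steps}
    kwargs.update(CALL_PRESETS[variant])
    return kwargs


print(call_kwargs("pMF-B-32-SIM"))
```

With a loaded pipeline, the result could be unpacked into the call, for example `pipe(class_labels="golden retriever", generator=generator, **call_kwargs("pMF-B-32-SIM")).images[0]`.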