---
license: cc-by-nc-4.0
tags:
- diffusion
- ddpm
- pixel-art
- pokemon
- text-to-image
library_name: pytorch
pipeline_tag: unconditional-image-generation
---
# Pokémon Pixel-Art Diffusion — checkpoints
Checkpoints for a **from-scratch DDPM** that generates 96×96 Pokémon pixel art
from a Pokémon name, from a free-text description, or by fusing two Pokémon.

**Code, model definitions & demo notebook:** https://github.com/weiee666/Pokemon_Combination
## Checkpoints
| experiment (`experiments/.../ckpt_ep300.pt`) | conditioning | attention / EMA | classes | params |
|---|---|---|---:|---:|
| `exp01_pokemon1070front_condUNet-CFG` | name embedding | – | 1070 | 4.2M |
| `exp02_pokemon20aug_condUNet-CFG-TB` | name embedding | – | 20 | 3.9M |
| `exp03_pokemon20aug_condUNet-CFG-attn-TB` | name embedding | self-attn | 20 | 5.8M |
| `exp04_pokemon20clean_bigUNet-CFG-attn-EMA` | name embedding | self-attn + EMA | 20 | 30.3M |
| `exp05_pokemon20aug_bigUNet-CFG-attn-EMA` | name embedding | self-attn + EMA | 20 | 30.3M |
| `exp06_stage1_pokemonALL_clipText` | CLIP pooled text | – | 988 | 30.5M |
| `exp07_stage1_pokemonALL_xattn` | CLIP per-token | self + cross-attn @24²/12² | 988 | 33.5M |
| `exp08_stage1_pokemonALL_xattn48` | CLIP per-token | self + cross-attn @48²/24²/12² | 988 | 34.0M |
| `exp09_stage1_pokemonALLcentered_xattn48` | CLIP per-token | self + cross-attn @48²/24²/12² | 988 | 34.0M |

`exp06_.../clip_table.pt` holds the pre-computed pooled CLIP text vectors, one per class; the exp06 model needs it.
For the experiments that use EMA, the checkpoints store the **EMA weights**. The CLIP-conditioned
models (exp06–09) expect the CLIP ViT-B-32 (`laion2b_s34b_b79k`) text encoder.
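
As a rough sketch of the text side: the `laion2b_s34b_b79k` tag matches a pretrained-weights name in the `open_clip` package, so the encoder can presumably be loaded as below. This assumes `open_clip_torch` is installed. The helper name `encode_prompts` is hypothetical, and this card does not say whether the training code normalizes the embeddings or how it extracts per-token features for exp07–09, so check the notebooks before relying on it.

```python
CLIP_ARCH = "ViT-B-32"
CLIP_PRETRAINED = "laion2b_s34b_b79k"


def encode_prompts(prompts, device="cpu"):
    """Return pooled CLIP text embeddings, one row per prompt (hypothetical helper)."""
    # Imported lazily so this module can be imported without open_clip installed.
    import torch
    import open_clip

    # Assumption: the checkpoints were trained against open_clip's weights for this tag.
    # Loading the model on every call is wasteful; cache it in real use.
    model, _, _ = open_clip.create_model_and_transforms(
        CLIP_ARCH, pretrained=CLIP_PRETRAINED
    )
    tokenizer = open_clip.get_tokenizer(CLIP_ARCH)
    model = model.to(device).eval()

    tokens = tokenizer(list(prompts)).to(device)  # shape: (len(prompts), 77)
    with torch.no_grad():
        return model.encode_text(tokens)  # shape: (len(prompts), 512) for ViT-B-32
```

For example, `encode_prompts(["a small blue turtle"])` would return a single pooled vector. That matches the kind of input the table lists for exp06; exp07–09 are listed as using per-token features, which this helper does not produce.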
## Load
```python
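import torch
from huggingface_hub import hf_hub_download

# Download the exp09 checkpoint from the Hub (cached locally after the first call).
ckpt = hf_hub_download(
    repo_id="WEIEE/pokemon-diffusion",
    filename="experiments/exp09_stage1_pokemonALLcentered_xattn48/checkpoints/ckpt_ep300.pt",
)

# Load on CPU; move the weights to a GPU after building the model.
state = torch.load(ckpt, map_location="cpu")

# The matching UNet / Diffusion class definitions live in the GitHub notebooks.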
```
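
To load a different row of the table, or to see what a checkpoint actually contains, something like the following may help. Both helper names are hypothetical. `checkpoint_filename` assumes every experiment follows the same `experiments/<name>/checkpoints/ckpt_ep300.pt` layout as the exp09 path above, which this card does not confirm. The walker makes no assumption about how the checkpoint is organized: it sums `numel()` over every tensor-like leaf it finds.

```python
def checkpoint_filename(experiment, epoch=300):
    """Build the in-repo path for an experiment (layout assumed from the exp09 example)."""
    return f"experiments/{experiment}/checkpoints/ckpt_ep{epoch}.pt"


def count_tensor_elements(obj):
    """Recursively sum numel() over tensor-like leaves in nested dicts/lists/tuples."""
    if hasattr(obj, "numel"):
        return obj.numel()
    if isinstance(obj, dict):
        return sum(count_tensor_elements(v) for v in obj.values())
    if isinstance(obj, (list, tuple)):
        return sum(count_tensor_elements(v) for v in obj)
    return 0  # ints, strings, None and other metadata contribute nothing
```

With the `state` object loaded above, `count_tensor_elements(state)` gives a total to compare against the params column. Expect a larger number if the file also stores optimizer state or both raw and EMA weights.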
Educational / research project. Pokémon and all sprites are © Nintendo / Game Freak.