---
license: cc-by-nc-4.0
tags:
- diffusion
- ddpm
- pixel-art
- pokemon
- text-to-image
library_name: pytorch
pipeline_tag: unconditional-image-generation
---

# Pokémon Pixel-Art Diffusion: checkpoints

Trained checkpoints for a **from-scratch DDPM** that generates 96×96 Pokémon pixel art
(by name, by free-text description, or by fusing two Pokémon).

**Code, model definitions & demo notebook:** https://github.com/weiee666/Pokemon_Combination

## Checkpoints

| file (`experiments/.../ckpt_ep300.pt`) | conditioning | attention / EMA | classes | params |
| |---|---|---|---:|---:| |
| | `exp01_pokemon1070front_condUNet-CFG` | name embedding | – | 1070 | 4.2M | |
| | `exp02_pokemon20aug_condUNet-CFG-TB` | name embedding | – | 20 | 3.9M | |
| | `exp03_pokemon20aug_condUNet-CFG-attn-TB` | name embedding | self-attn | 20 | 5.8M | |
| | `exp04_pokemon20clean_bigUNet-CFG-attn-EMA` | name embedding | self-attn + EMA | 20 | 30.3M | |
| | `exp05_pokemon20aug_bigUNet-CFG-attn-EMA` | name embedding | self-attn + EMA | 20 | 30.3M | |
| | `exp06_stage1_pokemonALL_clipText` | CLIP pooled text | – | 988 | 30.5M | |
| | `exp07_stage1_pokemonALL_xattn` | CLIP per-token | self + cross-attn @24²/12² | 988 | 33.5M | |
| | `exp08_stage1_pokemonALL_xattn48` | CLIP per-token | self + cross-attn @48²/24²/12² | 988 | 34.0M | |
| | `exp09_stage1_pokemonALLcentered_xattn48` | CLIP per-token | self + cross-attn @48²/24²/12² | 988 | 34.0M | |
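
The table lists experiment directory names rather than full file paths. As a minimal sketch, the helper below builds the repository-relative path for one experiment. It assumes that every experiment follows the same `experiments/<name>/checkpoints/ckpt_ep300.pt` layout as the exp09 path in the Load section, which this card does not confirm for the other rows. `EXPERIMENTS` and `checkpoint_path` are hypothetical names, not part of the repository.

```python
# Experiment directory names, copied from the table above.
EXPERIMENTS = (
    "exp01_pokemon1070front_condUNet-CFG",
    "exp02_pokemon20aug_condUNet-CFG-TB",
    "exp03_pokemon20aug_condUNet-CFG-attn-TB",
    "exp04_pokemon20clean_bigUNet-CFG-attn-EMA",
    "exp05_pokemon20aug_bigUNet-CFG-attn-EMA",
    "exp06_stage1_pokemonALL_clipText",
    "exp07_stage1_pokemonALL_xattn",
    "exp08_stage1_pokemonALL_xattn48",
    "exp09_stage1_pokemonALLcentered_xattn48",
)


def checkpoint_path(prefix: str) -> str:
    """Return the repo-relative checkpoint path for the experiment whose
    directory name starts with `prefix` (for example "exp09").

    Assumption: all experiments share the layout of the exp09 example,
    experiments/<name>/checkpoints/ckpt_ep300.pt.
    """
    matches = [name for name in EXPERIMENTS if name.startswith(prefix)]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one experiment for {prefix!r}, got {matches}")
    return f"experiments/{matches[0]}/checkpoints/ckpt_ep300.pt"


print(checkpoint_path("exp09"))
```

The returned string can be passed as the filename argument of `hf_hub_download`. If a download fails for another experiment, check the repository's file listing, since the shared layout is only assumed here.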

`exp06_.../clip_table.pt` contains the pre-computed CLIP pooled text vectors, one per class, and is required by the exp06 model.

For the experiments that use EMA, the checkpoints store the **EMA weights**. CLIP-conditioned
models (exp06–09) expect the CLIP ViT-B-32 (`laion2b_s34b_b79k`) text encoder.
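
For orientation, pooled text features from that encoder can be computed with the `open_clip` package roughly as follows. This is a sketch, assuming the tag above refers to the OpenCLIP weights `ViT-B-32` / `laion2b_s34b_b79k`. `encode_prompts` is a hypothetical helper, and this card does not say whether the training code normalizes or otherwise post-processes the vectors, so check the repository before relying on it. The per-token features used by exp07–09 need the encoder's token-level outputs and are not covered here.

```python
CLIP_ARCH = "ViT-B-32"
CLIP_PRETRAINED = "laion2b_s34b_b79k"


def encode_prompts(prompts):
    """Return pooled CLIP text features, one row per prompt.

    Imports are deferred so this file can be imported without open_clip
    installed; the first call downloads the pretrained weights.
    """
    import open_clip
    import torch

    model, _, _ = open_clip.create_model_and_transforms(
        CLIP_ARCH, pretrained=CLIP_PRETRAINED
    )
    tokenizer = open_clip.get_tokenizer(CLIP_ARCH)
    model.eval()
    with torch.no_grad():
        tokens = tokenizer(list(prompts))  # integer token ids, one row per prompt
        return model.encode_text(tokens)   # pooled text features
```

Calling `encode_prompts(["a small orange lizard with a flame on its tail"])` returns a tensor with one row per prompt.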

## Load

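```python
from huggingface_hub import hf_hub_download
import torch

# Download the exp09 checkpoint from the Hub (cached locally after the first call).
ckpt = hf_hub_download(
    "WEIEE/pokemon-diffusion",
    "experiments/exp09_stage1_pokemonALLcentered_xattn48/checkpoints/ckpt_ep300.pt",
)
# Load onto the CPU. PyTorch >= 2.6 defaults to weights_only=True; if loading fails on
# non-tensor objects, pass weights_only=False (only for files you trust).
state = torch.load(ckpt, map_location="cpu")
# The matching UNet / Diffusion class definitions live in the GitHub notebooks.
```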
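
The structure of the loaded object is not documented in this card, so it can help to inspect it before wiring it into a model. The sketch below walks nested dictionaries and counts the elements of every value that exposes `numel()`, as `torch.Tensor` does. `count_parameters` and `top_level_summary` are hypothetical helpers, and the dict-of-dicts nesting they handle is an assumption about the checkpoint format.

```python
def count_parameters(obj) -> int:
    """Sum numel() over all tensor-like values in a (possibly nested) dict."""
    if hasattr(obj, "numel"):
        return int(obj.numel())
    if isinstance(obj, dict):
        return sum(count_parameters(value) for value in obj.values())
    return 0  # ints, strings and other non-tensor metadata are ignored


def top_level_summary(obj) -> dict:
    """Map each top-level key to the number of tensor elements beneath it."""
    if not isinstance(obj, dict):
        return {}
    return {key: count_parameters(value) for key, value in obj.items()}
```

With `state` from the snippet above, `top_level_summary(state)` shows which keys hold weights, and the totals can be compared with the params column of the table. The counts include buffers and any optimizer state stored in the file, so they may exceed the listed parameter count.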

Educational / research project. Pokémon and all sprites are © Nintendo / Game Freak.