---
license: other
license_name: fair-1.0.0
license_link: LICENSE
library_name: pytorch
tags:
- image-generation
- pixel-art
- sprites
- flow-matching
- diffusion
- text-to-image
- game-assets
pipeline_tag: text-to-image
---

# Alucard

A small (32M parameter) text-to-sprite generative model using flow matching. Generates 128x128 RGBA sprites from text prompts, with optional reference frame input for animation generation.

**GitHub**: [evilsocket/alucard](https://github.com/evilsocket/alucard)

## Installation

```bash
pip install git+https://github.com/evilsocket/alucard.git
```

## Usage

### Generate a sprite from text

```python
from alucard import Alucard

# Load model (downloads weights automatically from HuggingFace)
model = Alucard.from_pretrained("evilsocket/alucard")

# Generate a sprite
sprite = model("a pixel art knight sprite, idle pose")
sprite.save("knight.png")

# Generate multiple variations
sprites = model("a pixel art dragon enemy sprite", num_samples=4, seed=42)
for i, s in enumerate(sprites):
    s.save(f"dragon_{i}.png")
```

### Generate an animation sequence

Use the `ref` parameter to condition generation on a previous frame:

```python
from alucard import Alucard

model = Alucard.from_pretrained("evilsocket/alucard")

# Generate the first frame
frame_1 = model("a pixel art knight sprite, walking right, frame 1")
frame_1.save("walk_01.png")

# Generate subsequent frames by passing the previous frame as reference
frame_2 = model("a pixel art knight sprite, walking right, frame 2", ref=frame_1)
frame_2.save("walk_02.png")

frame_3 = model("a pixel art knight sprite, walking right, frame 3", ref=frame_2)
frame_3.save("walk_03.png")

frame_4 = model("a pixel art knight sprite, walking right, frame 4", ref=frame_3)
frame_4.save("walk_04.png")
```

You can also pass a file path as `ref`:

```python
sprite = model("a pixel art knight sprite, attack pose", ref="walk_01.png")
```

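The frame-by-frame chaining above can be wrapped in a loop. The following is a minimal sketch, not part of the library: `generate_sequence` and its `prompt_template` argument are hypothetical names, and it assumes only what the examples above show, namely that the model is callable as `model(prompt)` or `model(prompt, ref=...)` and returns a single image when `num_samples` is left at its default.

```python
def generate_sequence(model, prompt_template, num_frames, first_ref=None):
    """Chain frames so each one is conditioned on the frame generated before it.

    `model` is any callable accepting (prompt, ref=...), as in the examples above.
    `prompt_template` must contain a `{frame}` placeholder (hypothetical convention).
    """
    frames = []
    ref = first_ref
    for index in range(1, num_frames + 1):
        prompt = prompt_template.format(frame=index)
        # The first call has no reference unless the caller supplied one.
        frame = model(prompt) if ref is None else model(prompt, ref=ref)
        frames.append(frame)
        ref = frame  # the next frame is conditioned on this one
    return frames
```

With a loaded model, `generate_sequence(model, "a pixel art knight sprite, walking right, frame {frame}", 4)` would reproduce the four calls shown earlier; each returned frame can then be saved with `.save(...)`.
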
### Generation parameters

```python
sprite = model(
    "a pixel art wizard sprite",
    num_samples=1,   # number of images to generate
    num_steps=20,    # Euler ODE steps (more = better quality, slower)
    cfg_text=5.0,    # text guidance scale (higher = stronger prompt adherence)
    cfg_ref=2.0,     # reference guidance scale (higher = more similar to ref)
    seed=42,         # reproducibility
)
```

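To make `num_steps`, `cfg_text`, and `cfg_ref` more concrete, here is a toy sketch in plain Python of fixed-step Euler integration with two guidance scales. It is not Alucard's actual sampler. The function names are hypothetical, the way the two scales are combined is an assumption (a composition commonly used by image-conditioned models such as InstructPix2Pix), and the time direction (noise at `t=0`, image at `t=1`) is also assumed.

```python
def guided_velocity(v_uncond, v_ref, v_full, cfg_text, cfg_ref):
    """Combine three velocity predictions using two guidance scales.

    ASSUMPTION: InstructPix2Pix-style composition; Alucard may differ.
    v_uncond: prediction with no text and no reference
    v_ref:    prediction with the reference only
    v_full:   prediction with both text and reference
    """
    return [
        u + cfg_ref * (r - u) + cfg_text * (f - r)
        for u, r, f in zip(v_uncond, v_ref, v_full)
    ]


def euler_sample(velocity_fn, x, num_steps):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with fixed Euler steps."""
    dt = 1.0 / num_steps
    for step in range(num_steps):
        t = step * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Toy velocity field standing in for the trained network: it moves every
# point along a straight line toward `target`.
target = [1.0, -0.5, 0.25]

def toy_velocity(x, t):
    return [(ti - xi) / (1.0 - t) for ti, xi in zip(target, x)]

result = euler_sample(toy_velocity, [0.0, 0.0, 0.0], num_steps=20)
```

In this toy case the path is exactly straight, so `result` lands on `target` up to floating-point error for any step count. A learned velocity field is only approximately straight, which is why more steps generally trade speed for quality.
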
### Load from local weights

```python
from alucard import Alucard

# From a .safetensors file
model = Alucard.from_pretrained("path/to/alucard_model.safetensors")

# From a training checkpoint
model = Alucard.from_pretrained("path/to/best.pt")

# From a local directory containing alucard_model.safetensors
model = Alucard.from_pretrained("path/to/model_dir/")
```

## Architecture

| Property | Value |
|----------|-------|
| Parameters | 31,956,228 (32M) |
| Input | 128x128 RGBA (4ch noisy + 4ch reference) |
| Output | 128x128 RGBA |
| Text encoder | CLIP ViT-B/32 (frozen, 512-dim) |
| Conditioning | AdaLN-Zero |
| Training | Flow matching (rectified flow) |
| Base channels | 64, multipliers [1, 2, 4, 4] |
| Attention | Self-attention at 32x32 and 16x16 |

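The channel and resolution figures in the table can be worked through with a little arithmetic. The sketch below assumes a U-Net-style layout with one 2x downsample between consecutive levels; the table does not state this, so treat the per-level resolutions as an illustration and not a description of the actual network.

```python
BASE_CHANNELS = 64
MULTIPLIERS = [1, 2, 4, 4]
INPUT_SIZE = 128
INPUT_CHANNELS = 4 + 4  # noisy RGBA + reference RGBA, per the Input row

# ASSUMPTION: resolution halves once between consecutive levels.
levels = [
    {"channels": BASE_CHANNELS * m, "resolution": INPUT_SIZE // (2 ** i)}
    for i, m in enumerate(MULTIPLIERS)
]

for level in levels:
    size = level["resolution"]
    print(f"{size}x{size}: {level['channels']} channels")
```

Under that assumption the four levels run at 128x128, 64x64, 32x32, and 16x16 with 64, 128, 256, and 256 channels, so the two attention resolutions listed in the table would correspond to the two deepest levels.
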
## Training

Trained on 33K sprites from publicly available datasets (Kaggle Pixel Art, Kenney CC0, GameTileNet, Pixel Art Nouns, TinyHero).

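For readers unfamiliar with the flow matching (rectified flow) objective named in the architecture table, the sketch below shows how one training example is typically built: interpolate between a noise sample and a data sample, and regress the constant velocity between them. This is a generic illustration in plain Python, not the project's training code. The helper names are hypothetical, and the convention of noise at `t=0` and data at `t=1` is an assumption (some codebases use the reverse).

```python
import random


def rectified_flow_pair(x0, x1, t):
    """Return (x_t, target_velocity) for one training example.

    x0: noise sample, x1: data sample, t in [0, 1].
    ASSUMPTION: noise at t=0 and data at t=1.
    """
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    target = [b - a for a, b in zip(x0, x1)]
    return x_t, target


def mse(pred, target):
    """Mean squared error between a predicted and a target velocity."""
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(target)


rng = random.Random(0)
x1 = [0.2, 0.8, -0.4, 1.0]               # stand-in for flattened sprite pixels
x0 = [rng.gauss(0.0, 1.0) for _ in x1]   # Gaussian noise of the same shape
t = rng.random()                          # random time in [0, 1)

x_t, target = rectified_flow_pair(x0, x1, t)
loss = mse([0.0] * len(target), target)  # loss of a model that predicts zero velocity
```

A real training loop would replace the zero prediction with the network's output for `(x_t, t, text, reference)` and minimize this loss over the dataset.
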
## License

Released under the [FAIR License (Free for Attribution and Individual Rights) v1.0.0](LICENSE).

- **Non-commercial use** (personal, educational, research, non-profit) is freely permitted under the terms of the license.
- **Commercial use** (SaaS, paid apps, any monetization) requires visible attribution to the project and its author. See the [license](LICENSE) for details.
- **Business use** (any use by or on behalf of a business entity) requires a signed commercial agreement with the author. Contact `evilsocket@gmail.com` for inquiries.
