---
license: other
license_name: fair-1.0.0
license_link: LICENSE
library_name: pytorch
tags:
- image-generation
- pixel-art
- sprites
- flow-matching
- diffusion
- text-to-image
- game-assets
pipeline_tag: text-to-image
---

# Alucard

A small (32M parameter) text-to-sprite generative model using flow matching. Generates 128x128 RGBA sprites from text prompts, with optional reference frame input for animation generation.

**GitHub**: [evilsocket/alucard](https://github.com/evilsocket/alucard)

## Installation

```bash
pip install git+https://github.com/evilsocket/alucard.git
```

## Usage

### Generate a sprite from text

```python
from alucard import Alucard

# Load model (downloads weights automatically from HuggingFace)
model = Alucard.from_pretrained("evilsocket/alucard")

# Generate a sprite
sprite = model("a pixel art knight sprite, idle pose")
sprite.save("knight.png")

# Generate multiple variations
sprites = model("a pixel art dragon enemy sprite", num_samples=4, seed=42)
for i, s in enumerate(sprites):
    s.save(f"dragon_{i}.png")
```

### Generate an animation sequence

Use the `ref` parameter to condition generation on a previous frame:

```python
from alucard import Alucard

model = Alucard.from_pretrained("evilsocket/alucard")

# Generate the first frame
frame_1 = model("a pixel art knight sprite, walking right, frame 1")
frame_1.save("walk_01.png")

# Generate subsequent frames by passing the previous frame as reference
frame_2 = model("a pixel art knight sprite, walking right, frame 2", ref=frame_1)
frame_2.save("walk_02.png")

frame_3 = model("a pixel art knight sprite, walking right, frame 3", ref=frame_2)
frame_3.save("walk_03.png")

frame_4 = model("a pixel art knight sprite, walking right, frame 4", ref=frame_3)
frame_4.save("walk_04.png")
```

You can also pass a file path as `ref`:

```python
sprite = model("a pixel art knight sprite, attack pose", ref="walk_01.png")
```

### Generation parameters

```python
sprite = model(
    "a pixel art wizard sprite",
    num_samples=1,  # number of images to generate
    num_steps=20,   # Euler ODE steps (more = better quality, slower)
    cfg_text=5.0,   # text guidance scale (higher = stronger prompt adherence)
    cfg_ref=2.0,    # reference guidance scale (higher = more similar to ref)
    seed=42,        # reproducibility
)
```

### Load from local weights

```python
# From a .safetensors file
model = Alucard.from_pretrained("path/to/alucard_model.safetensors")

# From a training checkpoint
model = Alucard.from_pretrained("path/to/best.pt")

# From a local directory containing alucard_model.safetensors
model = Alucard.from_pretrained("path/to/model_dir/")
```

## Architecture

| Property | Value |
|----------|-------|
| Parameters | 31,956,228 (32M) |
| Input | 128x128 RGBA (4ch noisy + 4ch reference) |
| Output | 128x128 RGBA |
| Text encoder | CLIP ViT-B/32 (frozen, 512-dim) |
| Conditioning | AdaLN-Zero |
| Training | Flow matching (rectified flow) |
| Base channels | 64, multipliers [1, 2, 4, 4] |
| Attention | Self-attention at 32x32 and 16x16 |

## Training

Trained on 33K sprites from publicly available datasets (Kaggle Pixel Art, Kenney CC0, GameTileNet, Pixel Art Nouns, TinyHero).

## License

Released under the [FAIR License (Free for Attribution and Individual Rights) v1.0.0](LICENSE).

- **Non-commercial use** (personal, educational, research, non-profit) is freely permitted under the terms of the license.
- **Commercial use** (SaaS, paid apps, any monetization) requires visible attribution to the project and its author. See the [license](LICENSE) for details.
- **Business use** (any use by or on behalf of a business entity) requires a signed commercial agreement with the author. Contact `evilsocket@gmail.com` for inquiries.
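### Assembling frames into a GIF

The animation workflow above produces individual PNG frames; a common follow-up is stitching them into an animated GIF for preview. Below is a minimal sketch using Pillow — this is not part of the alucard API, and the solid-color placeholder frames stand in for actual model outputs (in practice you would use the `frame_1`..`frame_4` images generated with `ref` chaining).

```python
from PIL import Image

# Placeholder 128x128 RGBA frames standing in for Alucard outputs;
# replace these with the sprites returned by model(...).
frames = [Image.new("RGBA", (128, 128), (i * 40, 80, 160, 255)) for i in range(4)]

# Composite each RGBA frame over an opaque background, since GIF
# only supports 1-bit transparency.
background = Image.new("RGBA", (128, 128), (255, 255, 255, 255))
flattened = [Image.alpha_composite(background, f).convert("RGB") for f in frames]

# Save the sequence as a looping animated GIF.
flattened[0].save(
    "walk_cycle.gif",
    save_all=True,
    append_images=flattened[1:],
    duration=120,  # milliseconds per frame
    loop=0,        # 0 = loop forever
)
```

For higher-fidelity output (full alpha, no palette quantization), saving the frames as an APNG or a sprite sheet instead of a GIF avoids the compositing step.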