# Alucard

A small (32M-parameter) text-to-sprite generative model using flow matching. It generates 128x128 RGBA sprites from text prompts, with an optional reference frame input for animation generation.

GitHub: [evilsocket/alucard](https://github.com/evilsocket/alucard)
## Installation

```shell
pip install git+https://github.com/evilsocket/alucard.git
```
## Usage

### Generate a sprite from text

```python
from alucard import Alucard

# Load the model (weights are downloaded automatically from HuggingFace)
model = Alucard.from_pretrained("evilsocket/alucard")

# Generate a single sprite
sprite = model("a pixel art knight sprite, idle pose")
sprite.save("knight.png")

# Generate multiple variations
sprites = model("a pixel art dragon enemy sprite", num_samples=4, seed=42)
for i, s in enumerate(sprites):
    s.save(f"dragon_{i}.png")
```
### Generate an animation sequence

Use the `ref` parameter to condition generation on a previous frame:

```python
from alucard import Alucard

model = Alucard.from_pretrained("evilsocket/alucard")

# Generate the first frame
frame_1 = model("a pixel art knight sprite, walking right, frame 1")
frame_1.save("walk_01.png")

# Generate subsequent frames by passing the previous frame as reference
frame_2 = model("a pixel art knight sprite, walking right, frame 2", ref=frame_1)
frame_2.save("walk_02.png")

frame_3 = model("a pixel art knight sprite, walking right, frame 3", ref=frame_2)
frame_3.save("walk_03.png")

frame_4 = model("a pixel art knight sprite, walking right, frame 4", ref=frame_3)
frame_4.save("walk_04.png")
```
You can also pass a file path as `ref`:

```python
sprite = model("a pixel art knight sprite, attack pose", ref="walk_01.png")
```
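The frame-by-frame chaining above can be wrapped in a small loop. A minimal sketch, assuming the `model(prompt, ref=..., seed=...)` call signature shown earlier; `generate_cycle` is a hypothetical helper, not part of the library:

```python
def generate_cycle(model, prompt_template, num_frames, seed=None):
    """Chain generations so each frame is conditioned on the previous one."""
    frames = []
    for i in range(1, num_frames + 1):
        prompt = prompt_template.format(i)
        if frames:
            # Condition on the most recent frame
            frame = model(prompt, ref=frames[-1], seed=seed)
        else:
            # First frame: no reference yet
            frame = model(prompt, seed=seed)
        frames.append(frame)
    return frames
```

With the model loaded as above, `generate_cycle(model, "a pixel art knight sprite, walking right, frame {}", 4)` reproduces the four-frame walk cycle.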
### Generation parameters

```python
sprite = model(
    "a pixel art wizard sprite",
    num_samples=1,  # number of images to generate
    num_steps=20,   # Euler ODE steps (more = better quality, slower)
    cfg_text=5.0,   # text guidance scale (higher = stronger prompt adherence)
    cfg_ref=2.0,    # reference guidance scale (higher = more similar to ref)
    seed=42,        # fixed seed for reproducibility
)
```
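`num_steps` counts the Euler steps used to integrate the learned velocity field from noise (t=0) to image (t=1). A toy numpy sketch of the idea; the real model's network and tensor shapes differ, and `velocity` here is a stand-in:

```python
import numpy as np

def euler_sample(velocity, x0, num_steps=20):
    """Integrate dx/dt = velocity(x, t) from t=0 (noise) to t=1 (sample)."""
    x = x0
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        x = x + dt * velocity(x, t)
    return x

# Toy velocity field: a straight (rectified) path pulls x toward a target.
target = np.ones((4, 8, 8))       # 4-channel RGBA-like tensor
noise = np.zeros((4, 8, 8))
v = lambda x, t: target - noise   # constant velocity along the straight path
out = euler_sample(v, noise, num_steps=20)
```

For a straight path the integration is exact at any step count; in the real model the learned velocity field is curved, which is why more steps trade speed for quality.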
### Load from local weights

```python
# From a .safetensors file
model = Alucard.from_pretrained("path/to/alucard_model.safetensors")

# From a training checkpoint
model = Alucard.from_pretrained("path/to/best.pt")

# From a local directory containing alucard_model.safetensors
model = Alucard.from_pretrained("path/to/model_dir/")
```
## Architecture
| Property | Value |
|---|---|
| Parameters | 31,956,228 (32M) |
| Input | 128x128 RGBA (4ch noisy + 4ch reference) |
| Output | 128x128 RGBA |
| Text encoder | CLIP ViT-B/32 (frozen, 512-dim) |
| Conditioning | AdaLN-Zero |
| Training | Flow matching (rectified flow) |
| Base channels | 64, multipliers [1, 2, 4, 4] |
| Attention | Self-attention at 32x32 and 16x16 |
## Training
Trained on 33K sprites from publicly available datasets (Kaggle Pixel Art, Kenney CC0, GameTileNet, Pixel Art Nouns, TinyHero).
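The model is trained with flow matching (rectified flow), per the architecture table above. The objective can be sketched in a few lines: sample a point on the straight path between noise and data, then regress the network's predicted velocity toward the constant target `x1 - x0`. A numpy sketch under those assumptions; `net` is a stand-in for the UNet, and the actual training code lives in the repository:

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.random((4, 128, 128))       # data sample (an RGBA sprite)
x0 = rng.standard_normal(x1.shape)   # Gaussian noise
t = rng.random()                     # timestep in [0, 1]

x_t = (1 - t) * x0 + t * x1          # point on the straight noise-to-data path
v_target = x1 - x0                   # rectified-flow velocity target

def net(x, t):
    """Stand-in for the conditioned UNet's velocity prediction."""
    return np.zeros_like(x)

# Flow-matching objective: MSE between predicted and target velocity
loss = np.mean((net(x_t, t) - v_target) ** 2)
```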
## License
Released under the FAIR License (Free for Attribution and Individual Rights) v1.0.0.
- Non-commercial use (personal, educational, research, non-profit) is freely permitted under the terms of the license.
- Commercial use (SaaS, paid apps, any monetization) requires visible attribution to the project and its author. See the license for details.
- Business use (any use by or on behalf of a business entity) requires a signed commercial agreement with the author. Contact evilsocket@gmail.com for inquiries.