---
license: other
license_name: fair-1.0.0
license_link: LICENSE
library_name: pytorch
tags:
  - image-generation
  - pixel-art
  - sprites
  - flow-matching
  - diffusion
  - text-to-image
  - game-assets
pipeline_tag: text-to-image
---

# Alucard

A small (32M parameter) text-to-sprite generative model using flow matching. Generates 128x128 RGBA sprites from text prompts, with optional reference frame input for animation generation.

**GitHub:** [evilsocket/alucard](https://github.com/evilsocket/alucard)

## Installation

```bash
pip install git+https://github.com/evilsocket/alucard.git
```

## Usage

### Generate a sprite from text

```python
from alucard import Alucard

# Load model (downloads weights automatically from HuggingFace)
model = Alucard.from_pretrained("evilsocket/alucard")

# Generate a sprite
sprite = model("a pixel art knight sprite, idle pose")
sprite.save("knight.png")

# Generate multiple variations
sprites = model("a pixel art dragon enemy sprite", num_samples=4, seed=42)
for i, s in enumerate(sprites):
    s.save(f"dragon_{i}.png")
```

### Generate an animation sequence

Use the `ref` parameter to condition generation on a previous frame:

```python
from alucard import Alucard

model = Alucard.from_pretrained("evilsocket/alucard")

# Generate the first frame
frame_1 = model("a pixel art knight sprite, walking right, frame 1")
frame_1.save("walk_01.png")

# Generate subsequent frames by passing the previous frame as reference
frame_2 = model("a pixel art knight sprite, walking right, frame 2", ref=frame_1)
frame_2.save("walk_02.png")

frame_3 = model("a pixel art knight sprite, walking right, frame 3", ref=frame_2)
frame_3.save("walk_03.png")

frame_4 = model("a pixel art knight sprite, walking right, frame 4", ref=frame_3)
frame_4.save("walk_04.png")
```
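
The repetitive per-frame calls above can be wrapped in a small helper. This is a sketch, not part of the library's API; the function name, prompt template, frame count, and filename pattern are all illustrative:

```python
def generate_walk_cycle(model, prompt_base, n_frames=4, out_pattern="walk_{:02d}.png"):
    """Generate a frame sequence, chaining each frame as the ref for the next.

    `model` is an Alucard instance (or any callable with the same signature).
    """
    frames = []
    ref = None
    for i in range(1, n_frames + 1):
        # First frame is unconditioned; later frames are conditioned on the previous one
        frame = model(f"{prompt_base}, frame {i}", ref=ref)
        frame.save(out_pattern.format(i))
        frames.append(frame)
        ref = frame
    return frames
```

Calling `generate_walk_cycle(model, "a pixel art knight sprite, walking right")` reproduces the four calls above.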

You can also pass a file path as `ref`:

```python
sprite = model("a pixel art knight sprite, attack pose", ref="walk_01.png")
```

### Generation parameters

```python
sprite = model(
    "a pixel art wizard sprite",
    num_samples=1,     # number of images to generate
    num_steps=20,      # Euler ODE steps (more = better quality, slower)
    cfg_text=5.0,      # text guidance scale (higher = stronger prompt adherence)
    cfg_ref=2.0,       # reference guidance scale (higher = more similar to ref)
    seed=42,           # reproducibility
)
```
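
With both a text condition and a reference condition, the two guidance scales are typically applied as independent classifier-free-guidance terms on the predicted velocity. The sketch below shows that standard formulation; it is an assumption about how `cfg_text` and `cfg_ref` combine, not the repository's verified internals:

```python
def guided_velocity(v_uncond, v_text, v_ref, cfg_text=5.0, cfg_ref=2.0):
    """Combine unconditional, text-conditioned, and reference-conditioned
    velocity predictions with two independent CFG scales (standard dual-CFG).

    Scale 0 removes a condition's influence; scale 1 applies it at full
    strength; scales > 1 extrapolate past the conditional prediction.
    """
    return (v_uncond
            + cfg_text * (v_text - v_uncond)
            + cfg_ref * (v_ref - v_uncond))
```

Shown with scalars for clarity; in practice each `v_*` is a full velocity tensor and the arithmetic broadcasts elementwise.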

### Load from local weights

```python
# From a .safetensors file
model = Alucard.from_pretrained("path/to/alucard_model.safetensors")

# From a training checkpoint
model = Alucard.from_pretrained("path/to/best.pt")

# From a local directory containing alucard_model.safetensors
model = Alucard.from_pretrained("path/to/model_dir/")
```

## Architecture

| Property      | Value                                     |
|---------------|-------------------------------------------|
| Parameters    | 31,956,228 (~32M)                         |
| Input         | 128x128 RGBA (4ch noisy + 4ch reference)  |
| Output        | 128x128 RGBA                              |
| Text encoder  | CLIP ViT-B/32 (frozen, 512-dim)           |
| Conditioning  | AdaLN-Zero                                |
| Training      | Flow matching (rectified flow)            |
| Base channels | 64, multipliers [1, 2, 4, 4]              |
| Attention     | Self-attention at 32x32 and 16x16         |
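
The `num_steps` generation parameter corresponds to fixed-step Euler integration of the learned flow. Below is a minimal sketch of rectified-flow Euler sampling with a stand-in velocity function; the real model predicts velocity from the noisy image, the timestep, and the text/reference conditioning, and the exact time convention here is an assumption:

```python
import numpy as np

def euler_sample(velocity_fn, shape, num_steps=20, seed=42):
    """Integrate dx/dt = v(x, t) from t=0 (noise) toward t=1 (data)
    with fixed-step Euler, as in rectified-flow sampling."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(shape)      # start from Gaussian noise
    dt = 1.0 / num_steps
    for step in range(num_steps):
        t = step * dt
        x = x + velocity_fn(x, t) * dt  # one Euler step along the flow
    return x

# Stand-in velocity field that contracts samples toward zero (illustrative only)
result = euler_sample(lambda x, t: -x, shape=(4, 128, 128), num_steps=20)
```

More steps give a finer discretization of the ODE (better quality) at proportionally higher cost, which is the trade-off the `num_steps` comment describes.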

## Training

Trained on 33K sprites from publicly available datasets (Kaggle Pixel Art, Kenney CC0, GameTileNet, Pixel Art Nouns, TinyHero).

## License

Released under the FAIR License (Free for Attribution and Individual Rights) v1.0.0.

- **Non-commercial use** (personal, educational, research, non-profit) is freely permitted under the terms of the license.
- **Commercial use** (SaaS, paid apps, any monetization) requires visible attribution to the project and its author. See the license for details.
- **Business use** (any use by or on behalf of a business entity) requires a signed commercial agreement with the author. Contact evilsocket@gmail.com for inquiries.