@@ -20,49 +20,89 @@ A small (32M parameter) text-to-sprite generative model using flow matching. Gen
 **GitHub**: [evilsocket/alucard](https://github.com/evilsocket/alucard)
-## Architecture
-- **UNet** (32M params) with 8-channel input (4 noisy RGBA + 4 reference RGBA)
-- **AdaLN-Zero** conditioning from CLIP ViT-B/32 text embeddings + sinusoidal timestep
-- **Flow matching** (rectified flow) training objective
-- **Dual classifier-free guidance** for independent text and reference frame control
-- Self-attention at 32x32 and 16x16 resolutions
-## Two Modes
-1. **Text to Sprite** - generate a sprite from a text prompt alone
-2. **Text + Reference to Sprite** - generate the next animation frame conditioned on a previous frame and text describing the change
-## Usage
 ```python
-import torch
-from safetensors.torch import load_file
-from alucard.model import UNet
-from alucard.sample import sample
-# Load model
-state_dict = load_file("alucard_model.safetensors")
-model = UNet()
-model.load_state_dict(state_dict)
-model = model.cuda().eval()
-# Encode text with CLIP
-import open_clip
-clip_model, _, _ = open_clip.create_model_and_transforms("ViT-B-32", pretrained="openai")
-tokenizer = open_clip.get_tokenizer("ViT-B-32")
-clip_model = clip_model.cuda().eval()
-tokens = tokenizer(["a pixel art knight sprite"]).cuda()
-with torch.no_grad():
-    text_emb = clip_model.encode_text(tokens)
-    text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)
-# Generate
-sprites = sample(model, text_emb, num_steps=20, cfg_text=5.0, device="cuda")
 ```
-## Model Details
 | Property | Value |
 |----------|-------|

 **GitHub**: [evilsocket/alucard](https://github.com/evilsocket/alucard)
+## Installation
+```bash
+pip install git+https://github.com/evilsocket/alucard.git
+```
+## Usage
+### Generate a sprite from text
+```python
+from alucard import Alucard
+# Load model (downloads weights automatically from HuggingFace)
+model = Alucard.from_pretrained("evilsocket/alucard")
+# Generate a sprite
+sprite = model("a pixel art knight sprite, idle pose")
+sprite.save("knight.png")
+# Generate multiple variations
+sprites = model("a pixel art dragon enemy sprite", num_samples=4, seed=42)
+for i, s in enumerate(sprites):
+    s.save(f"dragon_{i}.png")
+```
+### Generate an animation sequence
+Use the `ref` parameter to condition generation on a previous frame:
 ```python
+from alucard import Alucard
+model = Alucard.from_pretrained("evilsocket/alucard")
+# Generate the first frame
+frame_1 = model("a pixel art knight sprite, walking right, frame 1")
+frame_1.save("walk_01.png")
+# Generate subsequent frames by passing the previous frame as reference
+frame_2 = model("a pixel art knight sprite, walking right, frame 2", ref=frame_1)
+frame_2.save("walk_02.png")
+frame_3 = model("a pixel art knight sprite, walking right, frame 3", ref=frame_2)
+frame_3.save("walk_03.png")
+frame_4 = model("a pixel art knight sprite, walking right, frame 4", ref=frame_3)
+frame_4.save("walk_04.png")
 ```
+You can also pass a file path as `ref`:
+```python
+sprite = model("a pixel art knight sprite, attack pose", ref="walk_01.png")
+```
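The frame-by-frame chaining shown in the animation example can be written as a loop. This is only a sketch: `generate_sequence` and the `generate(prompt, ref)` callable are hypothetical names introduced here, not part of the `alucard` package. The callable is expected to wrap `model(...)` as used in the examples above.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

Frame = TypeVar("Frame")


def generate_sequence(
    generate: Callable[[str, Optional[Frame]], Frame],
    prompts: Sequence[str],
) -> List[Frame]:
    """Generate one frame per prompt, feeding each frame to the next call as reference.

    `generate` is a hypothetical adapter: it receives the prompt and the
    previous frame (None for the first frame) and returns the new frame.
    """
    frames: List[Frame] = []
    previous: Optional[Frame] = None
    for prompt in prompts:
        frame = generate(prompt, previous)
        frames.append(frame)
        previous = frame  # the next frame is conditioned on this one
    return frames


# Possible adapter around the API shown above (assumes `model` is already loaded):
#
#   def generate(prompt, ref):
#       return model(prompt) if ref is None else model(prompt, ref=ref)
#
#   prompts = [f"a pixel art knight sprite, walking right, frame {i}" for i in range(1, 5)]
#   frames = generate_sequence(generate, prompts)
```

The adapter avoids passing `ref=None` explicitly, since the diff does not show whether `model` accepts that.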
+### Generation parameters
+```python
+sprite = model(
+    "a pixel art wizard sprite",
+    num_samples=1,     # number of images to generate
+    num_steps=20,      # Euler ODE steps (more = better quality, slower)
+    cfg_text=5.0,      # text guidance scale (higher = stronger prompt adherence)
+    cfg_ref=2.0,       # reference guidance scale (higher = more similar to ref)
+    seed=42,           # reproducibility
+)
+```
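To show how `num_steps`, `cfg_text`, and `cfg_ref` might interact, here is a minimal sketch of Euler integration with dual classifier-free guidance. It is not the repository's implementation. The guidance formula (unconditional, then reference-only, then text plus reference), the integration direction from t=0 to t=1, and the names `guided_velocity`, `euler_sample`, and `velocity` are all assumptions made for illustration. Plain Python lists stand in for image tensors.

```python
from typing import Callable, List, Optional

Vector = List[float]
# Hypothetical model interface: velocity(x, t, text, ref) -> predicted velocity.
# Passing None for text or ref means "drop that condition".
VelocityFn = Callable[[Vector, float, Optional[str], Optional[Vector]], Vector]


def guided_velocity(
    velocity: VelocityFn,
    x: Vector,
    t: float,
    text: Optional[str],
    ref: Optional[Vector],
    cfg_text: float = 5.0,
    cfg_ref: float = 2.0,
) -> Vector:
    """Combine three predictions with separate scales (assumed formulation)."""
    v_uncond = velocity(x, t, None, None)  # no conditioning
    v_ref = velocity(x, t, None, ref)      # reference frame only
    v_full = velocity(x, t, text, ref)     # text and reference frame
    return [
        u + cfg_ref * (r - u) + cfg_text * (f - r)
        for u, r, f in zip(v_uncond, v_ref, v_full)
    ]


def euler_sample(
    velocity: VelocityFn,
    x0: Vector,
    text: Optional[str],
    ref: Optional[Vector] = None,
    num_steps: int = 20,
    cfg_text: float = 5.0,
    cfg_ref: float = 2.0,
) -> Vector:
    """Integrate dx/dt = v(x, t) from t=0 (noise) to t=1 with fixed Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        v = guided_velocity(velocity, x, t, text, ref, cfg_text, cfg_ref)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x
```

Under this formulation, when `ref` is `None` the reference-only and unconditional predictions coincide, so the update reduces to ordinary text guidance with scale `cfg_text`. Each Euler step costs three model evaluations, so the cost of raising `num_steps` grows accordingly.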
+### Load from local weights
+```python
+# From a .safetensors file
+model = Alucard.from_pretrained("path/to/alucard_model.safetensors")
+# From a training checkpoint
+model = Alucard.from_pretrained("path/to/best.pt")
+# From a local directory containing alucard_model.safetensors
+model = Alucard.from_pretrained("path/to/model_dir/")
+```
+## Architecture
 | Property | Value |
 |----------|-------|