evilsocket committed on
Commit bd83c61 · verified · 1 Parent(s): d14cdfb

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +75 -35
README.md CHANGED
@@ -20,49 +20,89 @@ A small (32M parameter) text-to-sprite generative model using flow matching. Gen
  **GitHub**: [evilsocket/alucard](https://github.com/evilsocket/alucard)

- ## Architecture

- - **UNet** (32M params) with 8-channel input (4 noisy RGBA + 4 reference RGBA)
- - **AdaLN-Zero** conditioning from CLIP ViT-B/32 text embeddings + sinusoidal timestep
- - **Flow matching** (rectified flow) training objective
- - **Dual classifier-free guidance** for independent text and reference frame control
- - Self-attention at 32x32 and 16x16 resolutions

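The "flow matching (rectified flow)" objective named in the removed Architecture list can be illustrated with a toy example. This is a minimal sketch, not the repository's training code: the helper names `rectified_flow_pair` and `mse` are hypothetical, and the time convention (t=0 is noise, t=1 is data) is an assumption that the project may reverse.

```python
import random


def rectified_flow_pair(x_data, x_noise, t):
    """Build one training pair for a rectified-flow objective.

    Assumed convention: t=0 is pure noise, t=1 is the clean sample.
    Returns the interpolated input x_t and the velocity the network
    would be trained to predict at that point.
    """
    x_t = [(1.0 - t) * n + t * d for n, d in zip(x_noise, x_data)]
    v_target = [d - n for n, d in zip(x_noise, x_data)]
    return x_t, v_target


def mse(pred, target):
    """Mean squared error between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(target)


rng = random.Random(0)
x_data = [0.5, -0.25, 1.0, 0.0]                  # stand-in for flattened RGBA pixels
x_noise = [rng.gauss(0.0, 1.0) for _ in x_data]  # Gaussian noise sample
t = rng.random()                                 # random time in [0, 1)

x_t, v_target = rectified_flow_pair(x_data, x_noise, t)
# During training, a network would see x_t (plus conditioning) and be
# penalized with mse(prediction, v_target).
```

Because the path between noise and data is a straight line, the target velocity is the same at every `t`; only the input `x_t` changes.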
- ## Two Modes

- 1. **Text to Sprite** - generate a sprite from a text prompt alone
- 2. **Text + Reference to Sprite** - generate the next animation frame conditioned on a previous frame and text describing the change

- ## Usage

  ```python
- import torch
- from safetensors.torch import load_file
- from alucard.model import UNet
- from alucard.sample import sample
-
- # Load model
- state_dict = load_file("alucard_model.safetensors")
- model = UNet()
- model.load_state_dict(state_dict)
- model = model.cuda().eval()
-
- # Encode text with CLIP
- import open_clip
- clip_model, _, _ = open_clip.create_model_and_transforms("ViT-B-32", pretrained="openai")
- tokenizer = open_clip.get_tokenizer("ViT-B-32")
- clip_model = clip_model.cuda().eval()
-
- tokens = tokenizer(["a pixel art knight sprite"]).cuda()
- with torch.no_grad():
-     text_emb = clip_model.encode_text(tokens)
-     text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)
-
- # Generate
- sprites = sample(model, text_emb, num_steps=20, cfg_text=5.0, device="cuda")
  ```

- ## Model Details

  | Property | Value |
  |----------|-------|
 
  **GitHub**: [evilsocket/alucard](https://github.com/evilsocket/alucard)

+ ## Installation

+ ```bash
+ pip install git+https://github.com/evilsocket/alucard.git
+ ```

+ ## Usage

+ ### Generate a sprite from text

+ ```python
+ from alucard import Alucard
+
+ # Load model (downloads weights automatically from HuggingFace)
+ model = Alucard.from_pretrained("evilsocket/alucard")
+
+ # Generate a sprite
+ sprite = model("a pixel art knight sprite, idle pose")
+ sprite.save("knight.png")
+
+ # Generate multiple variations
+ sprites = model("a pixel art dragon enemy sprite", num_samples=4, seed=42)
+ for i, s in enumerate(sprites):
+     s.save(f"dragon_{i}.png")
+ ```
+
+ ### Generate an animation sequence
+
+ Use the `ref` parameter to condition generation on a previous frame:

  ```python
+ from alucard import Alucard
+
+ model = Alucard.from_pretrained("evilsocket/alucard")
+
+ # Generate the first frame
+ frame_1 = model("a pixel art knight sprite, walking right, frame 1")
+ frame_1.save("walk_01.png")
+
+ # Generate subsequent frames by passing the previous frame as reference
+ frame_2 = model("a pixel art knight sprite, walking right, frame 2", ref=frame_1)
+ frame_2.save("walk_02.png")
+
+ frame_3 = model("a pixel art knight sprite, walking right, frame 3", ref=frame_2)
+ frame_3.save("walk_03.png")
+
+ frame_4 = model("a pixel art knight sprite, walking right, frame 4", ref=frame_3)
+ frame_4.save("walk_04.png")
  ```

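The frame-by-frame pattern in the animation example can be folded into a loop. The helper below is a sketch: `generate_sequence` is a hypothetical name and not part of the `alucard` package. It relies only on the call shape shown in the README (`model(prompt)` and `model(prompt, ref=...)`), so any callable with that shape works.

```python
def generate_sequence(model, base_prompt, num_frames):
    """Generate frames one at a time, feeding each frame back in as `ref`.

    `model` is any callable that accepts a prompt and an optional `ref`
    keyword, matching the call shape used in the README.
    """
    frames = []
    previous = None
    for index in range(1, num_frames + 1):
        prompt = f"{base_prompt}, frame {index}"
        if previous is None:
            frame = model(prompt)                # first frame: text only
        else:
            frame = model(prompt, ref=previous)  # later frames: text + reference
        frames.append(frame)
        previous = frame
    return frames


# Usage with the README's API (requires the alucard package and its weights):
# from alucard import Alucard
# model = Alucard.from_pretrained("evilsocket/alucard")
# frames = generate_sequence(model, "a pixel art knight sprite, walking right", 4)
# for i, frame in enumerate(frames, start=1):
#     frame.save(f"walk_{i:02d}.png")
```

Each frame is conditioned only on its immediate predecessor, which mirrors the chained `ref=frame_1`, `ref=frame_2`, `ref=frame_3` calls in the README.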
+ You can also pass a file path as `ref`:
+
+ ```python
+ sprite = model("a pixel art knight sprite, attack pose", ref="walk_01.png")
+ ```
+
+ ### Generation parameters
+
+ ```python
+ sprite = model(
+     "a pixel art wizard sprite",
+     num_samples=1,    # number of images to generate
+     num_steps=20,     # Euler ODE steps (more = better quality, slower)
+     cfg_text=5.0,     # text guidance scale (higher = stronger prompt adherence)
+     cfg_ref=2.0,      # reference guidance scale (higher = more similar to ref)
+     seed=42,          # reproducibility
+ )
+ ```
+
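To give a feel for what `num_steps`, `cfg_text`, and `cfg_ref` control, here is a toy sketch of fixed-step Euler integration and of one way to blend two guidance scales. Everything in it is an assumption: `guided_velocity`, `euler_sample`, and `toy_velocity` are hypothetical names, and the blending formula is one common formulation (in the style of InstructPix2Pix) that the repository may not use.

```python
def guided_velocity(v_uncond, v_ref, v_full, cfg_text, cfg_ref):
    """Blend three velocity predictions with two guidance scales (assumed form).

    v_uncond: prediction with neither text nor reference
    v_ref:    prediction with the reference frame only
    v_full:   prediction with both text and reference
    """
    return [
        u + cfg_ref * (r - u) + cfg_text * (f - r)
        for u, r, f in zip(v_uncond, v_ref, v_full)
    ]


def euler_sample(velocity_fn, x, num_steps):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with fixed Euler steps."""
    dt = 1.0 / num_steps
    for step in range(num_steps):
        t = step * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Toy stand-in for a trained network: the exact straight-line velocity
# toward a fixed target, (target - x) / (1 - t).
target = [1.0, -1.0]


def toy_velocity(x, t):
    return [(g - xi) / (1.0 - t) for g, xi in zip(target, x)]


result = euler_sample(toy_velocity, [0.0, 0.0], num_steps=20)
```

With both scales set to 1.0 the blend reduces to the fully conditioned prediction, and with both set to 0.0 it reduces to the unconditional one. Under this formulation, each Euler step would need three forward passes through the network.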
+ ### Load from local weights
+
+ ```python
+ # From a .safetensors file
+ model = Alucard.from_pretrained("path/to/alucard_model.safetensors")
+
+ # From a training checkpoint
+ model = Alucard.from_pretrained("path/to/best.pt")
+
+ # From a local directory containing alucard_model.safetensors
+ model = Alucard.from_pretrained("path/to/model_dir/")
+ ```
+
+ ## Architecture

  | Property | Value |
  |----------|-------|