AnimTOON-3B (v3): Token-Efficient Vector Animation Generation
Generates Lottie animations with 3.5x to 6.9x fewer tokens than OmniLottie (CVPR 2026) in the measured benchmarks reported in this card. v3 adds character animation support.
| | AnimTOON | OmniLottie |
|---|---|---|
| Tokens (simple) | 166 | 616 |
| Tokens (complex) | 597 | 4095 |
| VRAM | 5GB | 15.2GB |
| FPS | 30 | 8 |
| Model Size | 3B LoRA | 4B full |
| Custom Tokenizer | No | Yes (40k tokens) |
| Accepts SVG | Yes | No |
What is AnimTOON?
AnimTOON is a compact, plain-text animation format that any LLM can generate. Instead of outputting 18,000+ tokens of raw Lottie JSON, AnimTOON describes animations in ~166-597 tokens of human-readable text.
anim fr=30 dur=120
layer Logo shape
fill #000000
path sh x2
pos [0.5,0.5]
rot 0.0->-67 0.04->46 0.14->-31 0.28->0 ease=bounce
scale 0.0->[0,0] 0.14->[90,90] 0.28->[100,100] ease=smooth
opacity 0.0->0 0.14->100 ease=fade
This produces a complete animated .lottie file with bounce entrance, rotation wobble, and fade-in.
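To make the keyframe notation concrete, here is a minimal parsing sketch for one animated-property line such as the `rot` line above. The grammar is inferred from this single example only (a property name, `time->value` pairs, and an optional `ease=` token), so it is an assumption rather than the format's specification, and `Track`, `parse_value`, and `parse_track` are hypothetical names that are not part of the AnimTOON repository. The times look like fractions of the animation length, but the sketch keeps them as plain numbers.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# One "time->value" keyframe, e.g. "0.04->46" or "0.14->[90,90]" (assumed grammar).
KEYFRAME_RE = re.compile(r"^(\d*\.?\d+)->(\[[^\]]*\]|-?\d*\.?\d+)$")


@dataclass
class Track:
    prop: str                                       # e.g. "rot", "scale", "opacity"
    keyframes: list = field(default_factory=list)   # [(time, value), ...]
    ease: Optional[str] = None                      # e.g. "bounce", if present


def parse_value(text):
    """Turn "46" into 46.0 and "[90,90]" into [90.0, 90.0]."""
    if text.startswith("["):
        return [float(part) for part in text[1:-1].split(",")]
    return float(text)


def parse_track(line):
    """Parse one animated-property line into a Track."""
    prop, *tokens = line.split()
    track = Track(prop=prop)
    for token in tokens:
        if token.startswith("ease="):
            track.ease = token[len("ease="):]
            continue
        match = KEYFRAME_RE.match(token)
        if match is None:
            raise ValueError(f"unrecognised token {token!r} in {line!r}")
        track.keyframes.append((float(match.group(1)), parse_value(match.group(2))))
    return track


track = parse_track("rot 0.0->-67 0.04->46 0.14->-31 0.28->0 ease=bounce")
print(track.prop, track.ease, track.keyframes)
```

Rejecting unknown tokens with a `ValueError` is a deliberate choice in this sketch: malformed model output fails loudly instead of silently dropping keyframes.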
How to Use
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the tokenizer and the merged model weights onto the GPU in half precision.
tokenizer = AutoTokenizer.from_pretrained("srk0102200/AnimTOON-3B")
model = AutoModelForCausalLM.from_pretrained(
    "srk0102200/AnimTOON-3B",
    dtype=torch.float16,  # older transformers releases use torch_dtype= instead
    device_map="cuda"
)

# Wrap the description in the chat template the model expects.
prompt = "a red circle pulsing in the center with a smooth bounce"
messages = [{"role": "user", "content": f"Generate AnimTOON animation: {prompt}"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to("cuda")

# Sample a completion without tracking gradients.
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=512, temperature=0.7, do_sample=True)

# Decode only the newly generated tokens, skipping the prompt.
result = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(result)
Convert to .lottie
# Clone the repository first: git clone https://github.com/srk0102/AnimTOON.git
# Run from the repository root so that 'src' resolves.
import sys
sys.path.insert(0, 'src')
from toon_animator import animtoon_to_dotlottie_full
animtoon_to_dotlottie_full(result, "output.lottie")
# Preview at https://lottiefiles.com/preview
Animate Any SVG
from lottie.parsers.svg import parse_svg_file  # pip install lottie
# Convert the SVG to a Lottie animation (paths are taken from the SVG as-is)
anim = parse_svg_file("your_logo.svg")
lottie_dict = anim.to_dict()
# Generate AnimTOON animations with the model
# Apply animations to the Lottie layers
# Output: .lottie file with real SVG shapes + AI animations
See full pipeline: test_svg_pipeline.py
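As a rough illustration of the "apply animations to the Lottie layers" step, the sketch below writes a list of keyframes into the transform of a named layer in a Lottie dictionary. It is not the code in `test_svg_pipeline.py`. It assumes keyframe times are fractions of the animation length, uses linear easing for every keyframe (mapping names such as `bounce` to easing curves is not shown), and matches layers by their `nm` field. `to_lottie_property` and `animate_layer` are hypothetical helper names. The keys `ks`, `r`, `a`, `k`, `t`, `s`, `ip`, and `op` are standard Lottie JSON fields.

```python
def to_lottie_property(keyframes, total_frames):
    """Build an animated Lottie property from (normalized_time, value) pairs."""
    frames = []
    for time, value in keyframes:
        start = value if isinstance(value, list) else [value]
        frames.append({
            "t": round(time * total_frames),      # frame number of this keyframe
            "s": start,                           # value at this keyframe
            "i": {"x": [0.833], "y": [0.833]},    # linear ease-in handle
            "o": {"x": [0.167], "y": [0.167]},    # linear ease-out handle
        })
    return {"a": 1, "k": frames}                  # "a": 1 marks the property as animated


def animate_layer(lottie_dict, layer_name, transform_key, keyframes):
    """Attach keyframes to one transform property ("r", "s", "o", ...) of a layer.

    Returns True if a layer with that name was found, False otherwise.
    """
    total_frames = lottie_dict["op"] - lottie_dict["ip"]
    for layer in lottie_dict["layers"]:
        if layer.get("nm") == layer_name:
            layer.setdefault("ks", {})[transform_key] = to_lottie_property(
                keyframes, total_frames
            )
            return True
    return False


# Stand-in for the dict produced by anim.to_dict(); a real one has many more fields.
demo = {"fr": 30, "ip": 0, "op": 120, "layers": [{"nm": "Logo", "ks": {}}]}
found = animate_layer(demo, "Logo", "r", [(0.0, -67.0), (0.04, 46.0), (0.14, -31.0), (0.28, 0.0)])
print(found, [k["t"] for k in demo["layers"][0]["ks"]["r"]["k"]])
```

With a real SVG, `demo` would be replaced by `lottie_dict`, and the layer names would have to be read from that dictionary first, since they depend on how the SVG was authored.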
Benchmark Results (Measured)
Same prompt, same hardware:
| Test | AnimTOON Tokens | OmniLottie Tokens | Ratio |
|---|---|---|---|
| Apple logo bounce | 207 (41 shape + 166 anim) | 1113 | 5.4x fewer |
| Smiley face complex | 597 | 4095 | 6.9x fewer |
| Simple ball bounce | 176 | 616 | 3.5x fewer |
Dataset statistics (99,650 samples):
- Average raw Lottie JSON: 18,202 tokens
- Average AnimTOON: 222 tokens
- Token reduction: 98.8%
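The reduction and ratio figures follow directly from the token counts reported above. The short check below recomputes them from those numbers only; the variable names are illustrative.

```python
# Dataset averages reported above.
avg_lottie_tokens = 18_202
avg_animtoon_tokens = 222

# Fraction of tokens saved by AnimTOON relative to raw Lottie JSON.
reduction = 1 - avg_animtoon_tokens / avg_lottie_tokens
print(f"Token reduction: {reduction:.1%}")

# Per-prompt benchmark counts: (AnimTOON tokens, OmniLottie tokens).
benchmarks = {
    "Apple logo bounce": (207, 1113),
    "Smiley face complex": (597, 4095),
    "Simple ball bounce": (176, 616),
}

# How many times fewer tokens AnimTOON used, rounded as in the table.
ratios = {
    name: round(omnilottie / animtoon, 1)
    for name, (animtoon, omnilottie) in benchmarks.items()
}
for name, ratio in ratios.items():
    print(f"{name}: {ratio}x fewer")
```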
Current Status (v3)
v3 adds character animation support trained on Spine + DragonBones skeletal data.
The model now works for:
- Icon/logo animations (pulse, bounce, spin, fade, wobble)
- Character idle/walk cycles (14 layers, coordinated)
- Multi-part SVG animation (47-part crab demo)
- Correct color matching from text descriptions
- SVG + animation pipeline with per-part anchor points
Limitations:
- No shape generation (requires SVG input)
- Model output varies between runs (temperature-dependent)
- Position animation on shape groups not yet supported
- Not yet trained on facial expressions
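Because sampled output varies between runs, one possible mitigation is to check each result and retry when it does not look like AnimTOON. The sketch below is one way to do that under stated assumptions: the validity check is inferred from the single format example in this card (an `anim` header followed by at least one `layer` line), and `looks_like_animtoon` and `generate_until_valid` are hypothetical helpers, not part of the repository. The `generate` argument stands in for any callable that returns model output as a string.

```python
def looks_like_animtoon(text):
    """Cheap structural check: an 'anim' header plus at least one 'layer' line."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith("anim "):
        return False
    return any(line.startswith("layer ") for line in lines[1:])


def generate_until_valid(generate, is_valid=looks_like_animtoon, max_attempts=3):
    """Call generate() until is_valid accepts the result; return (text, attempts)."""
    for attempt in range(1, max_attempts + 1):
        candidate = generate()
        if is_valid(candidate):
            return candidate, attempt
    raise RuntimeError(f"no valid output after {max_attempts} attempts")


# Stand-in generator that fails once, then returns a plausible result.
canned = iter([
    "Sure! Here is your animation.",
    "anim fr=30 dur=120\nlayer Logo shape\nrot 0.0->0 1.0->360",
])
text, attempts = generate_until_valid(lambda: next(canned))
print(attempts)
```

If run-to-run variation itself is the problem, passing `do_sample=False` to `model.generate` switches to greedy decoding, which removes the sampling randomness for a given prompt.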
Training Details
| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen2.5-3B-Instruct |
| Method | LoRA (r=16, alpha=32) merged into base |
| Version | v3 (final 3B Lite release) |
| Training Data | 99,650 (MMLottie-2M) + 10,000 (layer-aware) + 984 (Spine/DragonBones) |
| Hardware | 1x NVIDIA RTX 5060 Ti (16GB) |
| Framework | Unsloth |
| Token Reduction | 98.8% vs raw Lottie JSON |
Architecture: Why Animation-Only is Better
"Asking one model to draw AND animate is like asking one person to paint AND dance at the same time."
AnimTOON separates concerns:
- SVG provides shapes (perfect, no hallucination, 0 tokens)
- Model generates animation (focused, token-efficient)
- Converter merges them (deterministic, 100% valid output)
OmniLottie generates shapes and animation in a single model, which leads to hallucinated shapes and token bloat (for example, 2001 tokens for a "crab" that looks like binoculars).
Links
- GitHub: github.com/srk0102/AnimTOON
- PitchHut: pitchhut.com/project/animtoon-lottie-animation
- OmniLottie (comparison): arxiv.org/abs/2603.02138
- MMLottie-2M Dataset: huggingface.co/datasets/OmniLottie/MMLottie-2M
Citation
@misc{sivaramakrishna2026animtoon,
title={AnimTOON: Token-Efficient Vector Animation Generation via Compact Text Format},
author={Siva RamaKrishna},
year={2026},
url={https://github.com/srk0102/AnimTOON}
}
License
MIT License - see LICENSE