---
language:
- en
license: mit
library_name: transformers
base_model: Qwen/Qwen2.5-3B-Instruct
tags:
- animation
- lottie
- svg
- animtoon
- vector-animation
- text-to-animation
- conversational
- text-generation-inference
datasets:
- OmniLottie/MMLottie-2M
pipeline_tag: text-generation
---

# AnimTOON-3B (v3): Token-Efficient Vector Animation Generation

**3.5x to 6.9x fewer tokens than OmniLottie (CVPR 2026) for generating Lottie animations in the measured benchmarks below. Now with character animation support.**

| Metric | AnimTOON | OmniLottie |
|---|---|---|
| **Tokens (simple)** | **166** | 616 |
| **Tokens (complex)** | **597** | 4095 |
| **VRAM** | **5GB** | 15.2GB |
| **FPS** | **30** | 8 |
| **Model Size** | **3B LoRA** | 4B full |
| **Custom Tokenizer** | **No** | Yes (40k tokens) |
| **Accepts SVG** | **Yes** | No |

## What is AnimTOON?

AnimTOON is a compact, plain-text animation format that any LLM can generate. Instead of outputting 18,000+ tokens of raw Lottie JSON, AnimTOON describes animations in ~166-597 tokens of human-readable text.

```
anim fr=30 dur=120

layer Logo shape
  fill #000000
  path sh x2
  pos [0.5,0.5]
  rot 0.0->-67 0.04->46 0.14->-31 0.28->0 ease=bounce
  scale 0.0->[0,0] 0.14->[90,90] 0.28->[100,100] ease=smooth
  opacity 0.0->0 0.14->100 ease=fade
```

Once converted, this text produces a complete animated .lottie file with a bounce entrance, a rotation wobble, and a fade-in.
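To make the keyframe syntax concrete, here is a minimal parsing sketch for property lines such as `rot 0.0->-67 ... ease=bounce`. It is not the repository's converter. The function names `parse_keyframe_line` and `to_frames` are hypothetical, and the sketch assumes that keyframe times are fractions of the total duration and that `dur` is measured in frames; neither assumption is confirmed by the format description above.

```python
def parse_keyframe_line(line):
    """Split one AnimTOON property line into (property, keyframes, options).

    Hypothetical helper, not part of the AnimTOON repository.
    Static lines such as `pos [0.5,0.5]` yield an empty keyframe list.
    Values are expected without internal spaces, as in the example above.
    """
    tokens = line.split()
    prop, keyframes, options = tokens[0], [], {}
    for tok in tokens[1:]:
        if "->" in tok:
            time_str, value_str = tok.split("->", 1)
            if value_str.startswith("["):
                # Vector value such as [90,90]
                value = [float(v) for v in value_str.strip("[]").split(",")]
            else:
                # Scalar value such as -67
                value = float(value_str)
            keyframes.append((float(time_str), value))
        elif "=" in tok:
            key, val = tok.split("=", 1)
            options[key] = val
    return prop, keyframes, options


def to_frames(keyframes, dur):
    """Map fractional times to frame numbers (ASSUMPTION: time * dur)."""
    return [(round(t * dur), value) for t, value in keyframes]


prop, keyframes, options = parse_keyframe_line(
    "rot 0.0->-67 0.04->46 0.14->-31 0.28->0 ease=bounce"
)
print(prop, options)
print(to_frames(keyframes, dur=120))
```

If the time values turn out to be seconds or absolute frames instead, only `to_frames` would need to change.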

## How to Use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("srk0102200/AnimTOON-3B")
model = AutoModelForCausalLM.from_pretrained(
    "srk0102200/AnimTOON-3B",
    dtype=torch.float16,
    device_map="cuda"
)

prompt = "a red circle pulsing in the center with a smooth bounce"
messages = [{"role": "user", "content": f"Generate AnimTOON animation: {prompt}"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to("cuda")

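# Complex animations in the benchmarks reach ~597 tokens; raise max_new_tokens if the output is cut off
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=512, temperature=0.7, do_sample=True)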
result = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(result)
```

## Convert to .lottie

```python
# Clone first: git clone https://github.com/srk0102/AnimTOON.git
# Assumes the working directory is the cloned repo root, so that 'src' resolves
import sys
sys.path.insert(0, 'src')
from toon_animator import animtoon_to_dotlottie_full

animtoon_to_dotlottie_full(result, "output.lottie")
# Preview at https://lottiefiles.com/preview
```

## Animate Any SVG

```python
from lottie import parsers  # pip install lottie

# Convert SVG to Lottie (perfect paths)
anim = parsers.svg.parse_svg_file("your_logo.svg")
lottie_dict = anim.to_dict()

# Generate AnimTOON animations with the model
# Apply animations to the Lottie layers
# Output: .lottie file with real SVG shapes + AI animations
```

See full pipeline: [test_svg_pipeline.py](https://github.com/srk0102/AnimTOON/blob/master/test_svg_pipeline.py)
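As a rough illustration of the "apply animations to the Lottie layers" step, the sketch below writes opacity keyframes into a named layer of a Lottie dictionary. It is not the code in `test_svg_pipeline.py`. It relies only on standard Lottie JSON fields (`layers`, `nm`, `ks`, `o`, `a`, `k`, `t`, `s`); the helper names are hypothetical, and the time-to-frame mapping assumes that AnimTOON keyframe times are fractions of the total duration in frames.

```python
def animated_property(keyframes, dur_frames):
    """Build a Lottie animated property from (time_fraction, value) pairs.

    Hypothetical helper. ASSUMPTION: frame = time_fraction * dur_frames.
    """
    frames = []
    for t, value in keyframes:
        values = list(value) if isinstance(value, (list, tuple)) else [value]
        frames.append({
            "t": round(t * dur_frames),   # keyframe time in frames
            "s": values,                  # value at this keyframe
            "o": {"x": [0], "y": [0]},    # linear easing, out handle
            "i": {"x": [1], "y": [1]},    # linear easing, in handle
        })
    return {"a": 1, "k": frames}          # a=1 marks the property as animated


def apply_opacity(lottie_dict, layer_name, keyframes, dur_frames):
    """Attach opacity keyframes to the first layer whose name matches."""
    for layer in lottie_dict.get("layers", []):
        if layer.get("nm") == layer_name:
            layer.setdefault("ks", {})["o"] = animated_property(keyframes, dur_frames)
            return True
    return False  # no layer with that name


# Plain dict standing in for the output of anim.to_dict()
lottie_dict = {"fr": 30, "ip": 0, "op": 120, "layers": [{"nm": "Logo", "ty": 4, "ks": {}}]}
applied = apply_opacity(lottie_dict, "Logo", [(0.0, 0), (0.14, 100)], dur_frames=120)
print(applied, lottie_dict["layers"][0]["ks"]["o"])
```

The sketch uses linear easing throughout; mapping named easings such as `ease=fade` to bezier handles is left to the actual converter.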

## Benchmark Results (Measured)

**Same prompt, same hardware:**

| Test | AnimTOON Tokens | OmniLottie Tokens | Ratio |
|------|----------------|-------------------|-------|
| Apple logo bounce | 207 (41 shape + 166 anim) | 1113 | 5.4x fewer |
| Smiley face complex | 597 | 4095 | 6.9x fewer |
| Simple ball bounce | 176 | 616 | 3.5x fewer |

**Dataset statistics (99,650 samples):**
- Average raw Lottie JSON: 18,202 tokens
- Average AnimTOON: 222 tokens
- Token reduction: 98.8%
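The ratios and the reduction percentage above follow directly from the reported token counts. The short check below recomputes them from the numbers in this section; the helper names are illustrative only.

```python
# Token counts copied from the benchmark table: (AnimTOON, OmniLottie)
benchmarks = {
    "Apple logo bounce": (207, 1113),
    "Smiley face complex": (597, 4095),
    "Simple ball bounce": (176, 616),
}


def ratio(animtoon_tokens, omnilottie_tokens):
    """How many times fewer tokens AnimTOON uses, rounded to one decimal."""
    return round(omnilottie_tokens / animtoon_tokens, 1)


def reduction_pct(compact_tokens, raw_tokens):
    """Percentage of tokens saved relative to the raw representation."""
    return round(100 * (1 - compact_tokens / raw_tokens), 1)


for name, (animtoon, omnilottie) in benchmarks.items():
    print(f"{name}: {ratio(animtoon, omnilottie)}x fewer")

# Dataset averages: 222 AnimTOON tokens vs 18,202 raw Lottie JSON tokens
print(f"Token reduction: {reduction_pct(222, 18202)}%")
```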

## Current Status (v3)

**v3 adds character animation support**, trained on Spine and DragonBones skeletal data.

The model now works for:
- Icon/logo animations (pulse, bounce, spin, fade, wobble)
- **Character idle/walk cycles (14 layers, coordinated)**
- **Multi-part SVG animation (47-part crab demo)**
- Correct color matching from text descriptions
- SVG + animation pipeline with per-part anchor points

**Limitations:**
- No shape generation (requires SVG input)
- Model output varies between runs (temperature-dependent)
- Position animation on shape groups not yet supported
- Not yet trained on facial expressions

## Training Details

| Parameter | Value |
|-----------|-------|
| Base Model | Qwen/Qwen2.5-3B-Instruct |
| Method | LoRA (r=16, alpha=32) merged into base |
| Version | v3 (final 3B Lite release) |
| Training Data | 99,650 samples (MMLottie-2M) + 10,000 (layer-aware) + 984 (Spine/DragonBones) |
| Hardware | 1x NVIDIA RTX 5060 Ti (16GB) |
| Framework | Unsloth |
| Token Reduction | 98.8% vs raw Lottie JSON |

## Architecture: Why Animation-Only is Better

> "Asking one model to draw AND animate is like asking one person to paint AND dance at the same time."

AnimTOON separates concerns:
- **SVG provides shapes** (perfect, no hallucination, 0 tokens)
- **Model generates animation** (focused, token-efficient)
- **Converter merges them** (deterministic, 100% valid output)

OmniLottie generates shapes and animation in a single model, which leads to hallucinated shapes and token bloat (for example, 2001 tokens for a "crab" that looks like binoculars).

## Links

- **GitHub:** [github.com/srk0102/AnimTOON](https://github.com/srk0102/AnimTOON)
- **PitchHut:** [pitchhut.com/project/animtoon-lottie-animation](https://www.pitchhut.com/project/animtoon-lottie-animation)
- **OmniLottie (comparison):** [arxiv.org/abs/2603.02138](https://arxiv.org/abs/2603.02138)
- **MMLottie-2M Dataset:** [huggingface.co/datasets/OmniLottie/MMLottie-2M](https://huggingface.co/datasets/OmniLottie/MMLottie-2M)

## Citation

```bibtex
@misc{sivaramakrishna2026animtoon,
  title={AnimTOON: Token-Efficient Vector Animation Generation via Compact Text Format},
  author={Siva RamaKrishna},
  year={2026},
  url={https://github.com/srk0102/AnimTOON}
}
```

## License

MIT License - see [LICENSE](https://github.com/srk0102/AnimTOON/blob/master/LICENSE)