	---
	language:
	- en
	license: mit
	library_name: transformers
	base_model: Qwen/Qwen2.5-3B-Instruct
	tags:
	- animation
	- lottie
	- svg
	- animtoon
	- vector-animation
	- text-to-animation
	- conversational
	- text-generation-inference
	datasets:
	- OmniLottie/MMLottie-2M
	pipeline_tag: text-generation
	---

	# AnimTOON-3B (v3): Token-Efficient Vector Animation Generation

	AnimTOON generates Lottie animations with 3.5-6.9x fewer tokens than OmniLottie (CVPR 2026) in the measured benchmarks below. v3 adds character animation support.

	| Metric | AnimTOON | OmniLottie |
	|---|---|---|
	| Tokens (simple) | 166 | 616 |
	| Tokens (complex) | 597 | 4095 |
	| VRAM | 5 GB | 15.2 GB |
	| FPS | 30 | 8 |
	| Model Size | 3B LoRA | 4B full |
	| Custom Tokenizer | No | Yes (40k tokens) |
	| Accepts SVG | Yes | No |

	## What is AnimTOON?

	AnimTOON is a compact, plain-text animation format that any LLM can generate. Instead of outputting 18,000+ tokens of raw Lottie JSON, AnimTOON describes animations in ~166-597 tokens of human-readable text.

	```
	anim fr=30 dur=120

	layer Logo shape
	fill #000000
	path sh x2
	pos [0.5,0.5]
	rot 0.0->-67 0.04->46 0.14->-31 0.28->0 ease=bounce
	scale 0.0->[0,0] 0.14->[90,90] 0.28->[100,100] ease=smooth
	opacity 0.0->0 0.14->100 ease=fade
	```

	This produces a complete animated .lottie file with bounce entrance, rotation wobble, and fade-in.
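
To make the keyframe syntax concrete, here is a minimal parsing sketch for a single property line. It is not the converter from the repository. It assumes each token has the form `time->value`, that `time` is a fraction of `dur`, and that values contain no spaces; the names `parse_value` and `parse_track` are hypothetical.

```python
import re

# Assumption: a keyframe token looks like "<time>-><value>", where <time> is a
# fraction of the total duration and <value> is a number or an [x,y] pair.
KEYFRAME = re.compile(r"^([0-9.]+)->(.+)$")


def parse_value(text):
    """Turn "46" into 46.0 and "[90,90]" into [90.0, 90.0]."""
    if text.startswith("[") and text.endswith("]"):
        return [float(part) for part in text[1:-1].split(",")]
    return float(text)


def parse_track(line, dur):
    """Parse one property line into (name, keyframes, ease).

    Each keyframe is a (frame, value) tuple with frame = round(time * dur).
    """
    name, *tokens = line.split()
    keyframes, ease = [], None
    for token in tokens:
        if token.startswith("ease="):
            ease = token[len("ease="):]
            continue
        match = KEYFRAME.match(token)
        if match is None:
            raise ValueError(f"unrecognised token: {token!r}")
        time, value = match.groups()
        keyframes.append((round(float(time) * dur), parse_value(value)))
    return name, keyframes, ease


name, keyframes, ease = parse_track(
    "rot 0.0->-67 0.04->46 0.14->-31 0.28->0 ease=bounce", dur=120
)
print(name, keyframes, ease)
# rot [(0, -67.0), (5, 46.0), (17, -31.0), (34, 0.0)] bounce
```

Under these assumptions, the `rot` line in the example places its four keyframes at frames 0, 5, 17 and 34 of the 120-frame animation.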

	## How to Use

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer
	import torch

	tokenizer = AutoTokenizer.from_pretrained("srk0102200/AnimTOON-3B")
	model = AutoModelForCausalLM.from_pretrained(
	    "srk0102200/AnimTOON-3B",
	    dtype=torch.float16,
	    device_map="cuda",
	)

	prompt = "a red circle pulsing in the center with a smooth bounce"
	messages = [{"role": "user", "content": f"Generate AnimTOON animation: {prompt}"}]
	text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
	inputs = tokenizer(text, return_tensors="pt").to("cuda")

	with torch.no_grad():
	    out = model.generate(**inputs, max_new_tokens=512, temperature=0.7, do_sample=True)

	# Decode only the newly generated tokens, skipping the prompt
	result = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
	print(result)
	```
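
Since sampled output varies between runs, it may be worth sanity-checking the generated text before converting it. The sketch below is a hypothetical helper (`check_animtoon` is not part of the repository). It only checks what the format example in this README shows: an `anim` header with integer `fr` and `dur` fields, and at least one `layer` line. The real grammar may allow more.

```python
def check_animtoon(text):
    """Return a list of problems found in AnimTOON text (empty if none)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    problems = []
    if not lines or not lines[0].startswith("anim "):
        problems.append("missing 'anim' header line")
    else:
        # Header fields are assumed to be space-separated key=value pairs.
        header = dict(
            field.split("=", 1) for field in lines[0].split()[1:] if "=" in field
        )
        for key in ("fr", "dur"):
            if not header.get(key, "").isdigit():
                problems.append(f"header field {key!r} is missing or not an integer")
    if not any(line.startswith("layer ") for line in lines):
        problems.append("no 'layer' line found")
    return problems


sample = "anim fr=30 dur=120\n\nlayer Logo shape\nfill #000000\n"
print(check_animtoon(sample))   # []
print(check_animtoon("hello"))  # two problems reported
```

If the returned list is not empty, regenerating with the same prompt is a simple fallback.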

	## Convert to .lottie

	```python
	# First clone the repo: git clone https://github.com/srk0102/AnimTOON.git
	# Run this from the repo root so that src/ is importable
	import sys
	sys.path.insert(0, "src")
	from toon_animator import animtoon_to_dotlottie_full

	animtoon_to_dotlottie_full(result, "output.lottie")
	# Preview at https://lottiefiles.com/preview
	```

	## Animate Any SVG

	```python
	from lottie.parsers.svg import parse_svg_file  # pip install lottie

	# Convert SVG to Lottie (the SVG paths are carried over unchanged)
	anim = parse_svg_file("your_logo.svg")
	lottie_dict = anim.to_dict()

	# Generate AnimTOON animations with the model
	# Apply animations to the Lottie layers
	# Output: .lottie file with real SVG shapes + AI animations
	```

	See full pipeline: [test_svg_pipeline.py](https://github.com/srk0102/AnimTOON/blob/master/test_svg_pipeline.py)
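
As a rough illustration of the "apply animations to the Lottie layers" step, the sketch below writes rotation keyframes into a Lottie dictionary. It is not the code from `test_svg_pipeline.py`. The helpers `animated_property` and `apply_rotation` are hypothetical, layers are matched by their `nm` field, and every keyframe gets linear easing handles, so named eases such as `ease=bounce` are not covered. The small `lottie_dict` stands in for the output of `anim.to_dict()`.

```python
import json


def animated_property(keyframes):
    """Build a Lottie animated property from (frame, value) pairs."""
    frames = []
    for frame, value in keyframes:
        values = value if isinstance(value, list) else [value]
        frames.append({
            "t": frame,                  # keyframe time in frames
            "s": values,                 # value at this keyframe
            "o": {"x": [0], "y": [0]},   # linear ease-out handle
            "i": {"x": [1], "y": [1]},   # linear ease-in handle
        })
    return {"a": 1, "k": frames}


def apply_rotation(lottie_dict, layer_name, keyframes):
    """Overwrite the rotation ("ks.r") of every layer named layer_name."""
    changed = 0
    for layer in lottie_dict.get("layers", []):
        if layer.get("nm") == layer_name:
            layer.setdefault("ks", {})["r"] = animated_property(keyframes)
            changed += 1
    return changed


# Stand-in for the dictionary produced from an SVG file.
lottie_dict = {
    "fr": 30, "ip": 0, "op": 120,
    "layers": [{"nm": "Logo", "ty": 4, "ks": {}}],
}
changed = apply_rotation(lottie_dict, "Logo", [(0, -67), (5, 46), (17, -31), (34, 0)])
print(changed)
print(json.dumps(lottie_dict["layers"][0]["ks"]["r"]["k"][1]))
```

The same pattern would apply to opacity (`o`) and scale (`s`) under `ks`, with two-component values for scale.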

	## Benchmark Results (Measured)

	Same prompt, same hardware:

	| Test | AnimTOON Tokens | OmniLottie Tokens | Ratio |
	|------|----------------|-------------------|-------|
	| Apple logo bounce | 207 (41 shape + 166 anim) | 1113 | 5.4x fewer |
	| Smiley face complex | 597 | 4095 | 6.9x fewer |
	| Simple ball bounce | 176 | 616 | 3.5x fewer |

	Dataset statistics (99,650 samples):
	- Average raw Lottie JSON: 18,202 tokens
	- Average AnimTOON: 222 tokens
	- Token reduction: 98.8%
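
The ratios and the reduction figure follow directly from the token counts reported above. The short check below recomputes them; the variable names are illustrative only.

```python
# Dataset averages reported above
avg_lottie_tokens = 18_202
avg_animtoon_tokens = 222

reduction = 1 - avg_animtoon_tokens / avg_lottie_tokens
print(f"Token reduction: {reduction:.1%}")  # Token reduction: 98.8%

# (AnimTOON tokens, OmniLottie tokens) per benchmark row
benchmarks = {
    "Apple logo bounce": (207, 1113),
    "Smiley face complex": (597, 4095),
    "Simple ball bounce": (176, 616),
}
ratios = {name: omni / toon for name, (toon, omni) in benchmarks.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x fewer")
# Apple logo bounce: 5.4x fewer
# Smiley face complex: 6.9x fewer
# Simple ball bounce: 3.5x fewer
```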

	## Current Status (v3)

	v3 adds character animation support trained on Spine + DragonBones skeletal data.

	The model now works for:
	- Icon/logo animations (pulse, bounce, spin, fade, wobble)
	- Character idle/walk cycles (14 layers, coordinated)
	- Multi-part SVG animation (47-part crab demo)
	- Correct color matching from text descriptions
	- SVG + animation pipeline with per-part anchor points

	Limitations:
	- No shape generation (requires SVG input)
	- Model output varies between runs (temperature-dependent)
	- Position animation on shape groups not yet supported
	- Not yet trained on facial expressions

	## Training Details

	| Parameter | Value |
	|-----------|-------|
	| Base Model | Qwen/Qwen2.5-3B-Instruct |
	| Method | LoRA (r=16, alpha=32) merged into base |
	| Version | v3 (final 3B Lite release) |
	| Training Data | 99,650 (MMLottie-2M) + 10,000 (layer-aware) + 984 (Spine/DragonBones) |
	| Hardware | 1x NVIDIA RTX 5060 Ti (16GB) |
	| Framework | Unsloth |
	| Token Reduction | 98.8% vs raw Lottie JSON |

	## Architecture: Why Animation-Only is Better

	> "Asking one model to draw AND animate is like asking one person to paint AND dance at the same time."

	AnimTOON separates concerns:
	- SVG provides shapes (perfect, no hallucination, 0 tokens)
	- Model generates animation (focused, token-efficient)
	- Converter merges them (deterministic, 100% valid output)

	OmniLottie generates both shapes and animation in one model, which leads to hallucinated shapes and token bloat (for example, 2001 tokens for a "crab" that looks like binoculars).

	## Links

	- GitHub: [github.com/srk0102/AnimTOON](https://github.com/srk0102/AnimTOON)
	- PitchHut: [pitchhut.com/project/animtoon-lottie-animation](https://www.pitchhut.com/project/animtoon-lottie-animation)
	- OmniLottie (comparison): [arxiv.org/abs/2603.02138](https://arxiv.org/abs/2603.02138)
	- MMLottie-2M Dataset: [huggingface.co/datasets/OmniLottie/MMLottie-2M](https://huggingface.co/datasets/OmniLottie/MMLottie-2M)

	## Citation

	```bibtex
	@misc{sivaramakrishna2026animtoon,
	title={AnimTOON: Token-Efficient Vector Animation Generation via Compact Text Format},
	author={Siva RamaKrishna},
	year={2026},
	url={https://github.com/srk0102/AnimTOON}
	}
	```

	## License

	MIT License - see [LICENSE](https://github.com/srk0102/AnimTOON/blob/master/LICENSE)