SVGThinker-7B
SVGThinker-7B is a text-to-SVG generation model introduced in SVGThinker: Instruction-Aligned and Reasoning-Driven Text-to-SVG Generation. It generates editable SVG code from natural-language descriptions, with a focus on compact icon-style vector graphics.
The model is fine-tuned from deepseek-ai/DeepSeek-R1-Distill-Qwen-7B and is released as BF16 sharded safetensors.
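As a rough sizing aid, BF16 stores each parameter in 2 bytes, so the memory needed for the weights alone can be estimated from the parameter count. The sketch below assumes a nominal 7e9 parameters, taken from the "7B" in the model name; the exact count may differ, and activations and the KV cache need additional memory on top.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for model weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


# Assumption: 7e9 is the nominal size implied by the "7B" name, not an exact count.
print(f"{weight_memory_gib(7e9):.1f} GiB")  # prints 13.0 GiB
```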
Links
- Paper: arXiv:2509.24299
- Demo Space: Caffin/SVGThinker-7B
- License: MIT
Intended Use
This model is intended for:
- generating SVG icons from English text prompts
- prototyping simple vector graphics
- producing editable SVG markup rather than raster images
- research on text-to-SVG and structured code generation
Generated SVG should be reviewed and sanitized before being rendered in production web pages or downstream applications.
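One possible shape for that sanitization step is sketched below using only the Python standard library. The deny-list `BLOCKED_TAGS` and the helper names are hypothetical and chosen for illustration; this is not a complete sanitizer (it ignores CSS `url()` references and external `<use>` targets, for example), so a vetted sanitizer and a hardened XML parser such as `defusedxml` are preferable in production.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Hypothetical deny-list for illustration; a strict allow-list is safer.
BLOCKED_TAGS = {"script", "foreignObject", "iframe", "object", "embed"}


def local_name(name: str) -> str:
    """Strip a '{namespace}' prefix from an ElementTree tag or attribute name."""
    return name.rsplit("}", 1)[-1]


def strip_active_content(svg_text: str) -> str:
    """Remove script-like elements, event handlers, and javascript: links."""
    ET.register_namespace("", SVG_NS)  # serialize without an 'ns0:' prefix
    root = ET.fromstring(svg_text)
    # Snapshot the tree first so elements can be removed while walking it.
    for parent in list(root.iter()):
        for child in list(parent):
            if local_name(child.tag) in BLOCKED_TAGS:
                parent.remove(child)
    for element in root.iter():
        for attr, value in list(element.attrib.items()):
            is_handler = local_name(attr).lower().startswith("on")
            is_js_link = value.strip().lower().startswith("javascript:")
            if is_handler or is_js_link:
                del element.attrib[attr]
    return ET.tostring(root, encoding="unicode")
```

Calling `strip_active_content(svg_markup)` on a model output returns the markup with those elements and attributes removed; it raises `xml.etree.ElementTree.ParseError` if the input is not well-formed XML.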
Quick Start
pip install "transformers>=4.51.0" torch accelerate safetensors

import re

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Caffin/SVGThinker-7B"

# Load the tokenizer and the BF16 weights; device_map="auto" places them on the available devices.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    use_safetensors=True,
)

# Build the prompt: a fixed instruction followed by the icon description.
description = "A minimalist calendar icon with two black tabs and a bold checkmark below it."
prompt = "Review the given information below and generate a svg according to it.\n" + description
messages = [{"role": "user", "content": prompt}]

# return_dict=True returns input_ids together with the attention mask.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

# Sample a completion; the token budget leaves room for reasoning text plus the SVG.
with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=4096,
        do_sample=True,
        temperature=0.8,
        top_p=0.6,
        repetition_penalty=1.05,
    )

# Decode the full sequence and keep the first <svg>...</svg> block, if any.
text = tokenizer.decode(output[0], skip_special_tokens=False)
match = re.search(r"<svg.*?</svg>", text, flags=re.DOTALL)
print(match.group(0) if match else text)
The model may emit reasoning text before the final SVG. For most applications,
extract the <svg>...</svg> block before rendering.
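A slightly more defensive extraction step might look like the sketch below. It assumes that reasoning, when present, ends with a `</think>` marker (an assumption based on the DeepSeek-R1-Distill base model, not something this card states), and it prefers the last well-formed block so that partial drafts are skipped. The name `extract_svg` is hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from typing import Optional

SVG_BLOCK = re.compile(r"<svg\b.*?</svg>", flags=re.DOTALL)


def extract_svg(text: str) -> Optional[str]:
    """Return the last well-formed <svg>...</svg> block in text, or None."""
    # Assumption: reasoning, if present, ends with a "</think>" marker.
    # When the marker is absent, rpartition leaves the whole text in `answer`.
    _, _, answer = text.rpartition("</think>")
    for candidate in reversed(SVG_BLOCK.findall(answer)):
        try:
            ET.fromstring(candidate)  # raises ParseError on malformed XML
        except ET.ParseError:
            continue
        return candidate
    return None
```

With the Quick Start variables, `extract_svg(text)` would replace the `re.search` call; a `None` result signals that no usable SVG was produced.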
Model Notes
SVGThinker is trained directly in SVG code space. The paper describes a sequential annotation pipeline that aligns natural-language descriptions with the step-by-step construction of SVG primitives, helping the model generate more editable SVG code.
For full training data, annotation, and evaluation details, see the paper.
Evaluation Snapshot
On the paper's 1,000 held-out text-to-SVG prompts, SVGThinker reports:
| Model | FID (lower is better) | CLIP (higher is better) | FID-CLIP (lower is better) | Primitive support |
|---|---|---|---|---|
| SVGThinker-7B | 34.06 | 0.2765 | 21.08 | all |
Limitations
- Outputs may be malformed, incomplete, or visually inconsistent with the prompt.
- The model is best suited for simple to moderately complex icon-style graphics.
- It may struggle with photorealistic scenes, dense layouts, and text-heavy SVGs.
- SVG is executable markup in browser contexts; treat generated SVG as untrusted.
- The model primarily targets English prompts.
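Because outputs may be malformed and the Quick Start samples with `do_sample=True`, one pragmatic mitigation is to validate each result and resample on failure. The sketch below is illustrative only: `generate_fn` is a hypothetical callable that wraps the generation and extraction steps and returns SVG markup for a prompt, and the check covers XML well-formedness only, not visual fidelity to the prompt.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Optional


def is_well_formed_svg(markup: str) -> bool:
    """True when markup parses as XML and its root element is <svg>."""
    try:
        root = ET.fromstring(markup)
    except ET.ParseError:
        return False
    return root.tag.rsplit("}", 1)[-1] == "svg"


def generate_with_retries(
    generate_fn: Callable[[str], str],  # hypothetical: prompt -> SVG markup
    prompt: str,
    max_attempts: int = 3,
) -> Optional[str]:
    """Call generate_fn until it returns well-formed SVG, or give up."""
    for _ in range(max_attempts):
        candidate = generate_fn(prompt)
        if is_well_formed_svg(candidate):
            return candidate
    return None
```

Returning `None` after `max_attempts` keeps the failure explicit, so the caller can fall back to a placeholder icon or surface an error.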
Citation
@inproceedings{chen2025svgthinker,
  title = {SVGThinker: Instruction-Aligned and Reasoning-Driven Text-to-SVG Generation},
  author = {Chen, Hanqi and Zhao, Zhongyin and Chen, Ye and Liang, Zhujin and Ni, Bingbing},
  booktitle = {Proceedings of the 33rd ACM International Conference on Multimedia},
  year = {2025},
  publisher = {ACM},
  doi = {10.1145/3746027.3755392},
  url = {https://arxiv.org/abs/2509.24299}
}