---
pipeline_tag: text-to-image
tags:
- text-to-image
- svg
- vector-graphics
license: mit
---
# DiffSketcher: Vector Graphics Generation

This model generates vector graphics (SVG) from text prompts, using the original implementation from the official repository.
## Model Description

DiffSketcher synthesizes vector sketches (SVG) from text prompts. A diffusion model guides the optimization of the SVG, producing a sketch with a user-specified number of paths.
## Usage

```python
import requests

API_URL = "https://api-inference.huggingface.co/models/jree423/diffsketcher"
headers = {"Authorization": "Bearer YOUR_API_TOKEN"}

def query(prompt):
    response = requests.post(API_URL, headers=headers, json={"inputs": prompt})
    response.raise_for_status()  # surface HTTP errors instead of writing them to disk
    return response.content  # the generated image bytes

# Generate an image and save it as PNG
with open("output.png", "wb") as f:
    f.write(query("a beautiful mountain landscape"))
```
## Examples
- "a beautiful mountain landscape"
- "a red sports car"
- "a portrait of a woman"
- "a cat playing with a ball"
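To render each example prompt to its own file, a small helper can derive a safe filename from the prompt. The sketch below is illustrative: `prompt_to_filename` is a hypothetical helper, and the actual network call (via the `query` function from the Usage section) is omitted so the snippet stays self-contained.

```python
import re

def prompt_to_filename(prompt: str) -> str:
    # Hypothetical helper: turn a prompt into a filesystem-safe PNG filename.
    slug = re.sub(r"[^a-z0-9]+", "_", prompt.lower()).strip("_")
    return f"{slug}.png"

prompts = [
    "a beautiful mountain landscape",
    "a red sports car",
    "a portrait of a woman",
    "a cat playing with a ball",
]
for p in prompts:
    # In practice: open(prompt_to_filename(p), "wb").write(query(p))
    print(prompt_to_filename(p))
```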
## How It Works

1. **Text Encoding**: The text prompt is encoded using CLIP.
2. **Diffusion Process**: A diffusion model generates a latent representation.
3. **SVG Generation**: The latent representation guides the generation of an SVG.
4. **PNG Conversion**: The SVG is rasterized to PNG for display.
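The four steps above can be sketched as a toy pipeline. Everything below is illustrative scaffolding (the fake embedding, the `Path` data class, and all function names are assumptions), not the actual DiffSketcher code:

```python
from dataclasses import dataclass

@dataclass
class Path:
    # Minimal stand-in for one SVG stroke.
    points: list
    stroke_width: float

def encode_text(prompt: str) -> list:
    # Stand-in for CLIP text encoding (step 1); real embeddings are learned.
    return [float(ord(c)) for c in prompt[:8]]

def generate_paths(embedding: list, num_paths: int = 16) -> list:
    # Stand-in for diffusion-guided path generation (steps 2-3).
    return [Path(points=[(i, i), (i + 1, i)], stroke_width=1.0)
            for i in range(num_paths)]

def paths_to_svg(paths: list) -> str:
    # Serialize the paths as a bare SVG document (step 3 output).
    body = "".join(
        f'<path d="M {p.points[0][0]} {p.points[0][1]} '
        f'L {p.points[1][0]} {p.points[1][1]}"/>'
        for p in paths
    )
    return f'<svg xmlns="http://www.w3.org/2000/svg">{body}</svg>'

svg = paths_to_svg(generate_paths(encode_text("a cat"), num_paths=4))
```

Step 4 (PNG conversion) would then rasterize this SVG string with a renderer such as CairoSVG.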
## Performance Considerations

- The original implementation requires significant computational resources.
- Generation can take several minutes, depending on prompt complexity and the number of paths.
- GPU acceleration is recommended for optimal performance.
## Citation

```bibtex
@article{xing2023diffsketcher,
  title={{DiffSketcher}: Text Guided Vector Sketch Synthesis through Latent Diffusion Models},
  author={Xing, Ximing and Wang, Chuang and Zhou, Haitao and Zhang, Jing and Yu, Qian and Xu, Dong},
  journal={arXiv preprint arXiv:2306.14685},
  year={2023}
}
```
## License

This model is licensed under the MIT License.