---
license: apache-2.0
tags:
- text-to-image
- diffusion
- latent-diffusion
- visual-foundation-model
- representation-learning
- dino
- svg
pipeline_tag: text-to-image
library_name: pytorch
language:
- en
---
_**[Minglei Shi](https://github.com/shiml20)1\*, [Haolin Wang](https://howlin-wang.github.io)1\*, [Borui Zhang](https://boruizhang.site/)1, [Wenzhao Zheng](https://wzzheng.net)1, [Bohan Zeng](https://scholar.google.com/citations?user=MHo_d3YAAAAJ&hl=en)2**_

_**[Ziyang Yuan](https://scholar.google.ru/citations?user=fWxWEzsAAAAJ&hl=en)2†, [Xiaoshi Wu](https://scholar.google.com/citations?user=cnOAMbUAAAAJ&hl=en)2, [Yuanxing Zhang](https://scholar.google.com/citations?user=COdftTMAAAAJ&hl=en)2, [Huan Yang](https://hyang0511.github.io/)2**_

_**[Xintao Wang](https://xinntao.github.io/)2, [Pengfei Wan](https://magicwpf.github.io/)2, [Kun Gai](https://scholar.google.com/citations?user=PXO4ygEAAAAJ&hl=zh-CN)2, [Jie Zhou](https://scholar.google.com/citations?user=6a79aPwAAAAJ&hl=en)1, [Jiwen Lu](https://ivg.au.tsinghua.edu.cn/Jiwen_Lu/)1†**_

1 Tsinghua University

2 KlingTeam, Kuaishou Technology

\* Equal contribution, † Corresponding author

---
> **Important Note:** This repository implements SVG-T2I, a text-to-image diffusion framework that performs visual generation directly in the representation space of a Visual Foundation Model (VFM), rather than in pixel space or VAE latent space.
---
## 📰 News
- **[2025-12-13]** 📢✨ We are excited to announce the official release of **SVG-T2I**, including pre-trained checkpoints and the complete training and inference code.
## 🖼️ Gallery