xingxm committed on
Commit fa8aad2 · verified · 1 Parent(s): 330ffd9

Update README.md

Files changed (1)
  1. README.md +29 -51
README.md CHANGED
@@ -20,7 +20,7 @@ model-index:
20
  results: []
21
  ---
22
 
23
- # HiVG-3B-Base
24
 
25
  **HiVG-3B-Base** is a 3B-parameter vision-language model for **autoregressive Scalable Vector Graphics (SVG) generation**. It is the base model from the paper [**"Hierarchical SVG Tokenization: Learning Compact Visual Programs for Scalable Vector Graphics Modeling"**](https://arxiv.org/abs/2604.05072).
26
 
@@ -29,24 +29,40 @@ HiVG introduces a novel **hierarchical SVG tokenization framework** that replace
29
  | 📄 [Paper](https://arxiv.org/abs/2604.05072) | 🏠 [Project Page](https://hy-hivg.github.io/) | 🤗 [Paper Page](https://huggingface.co/papers/2604.05072) |
30
  |---|---|---|
31
 
32
- ## Model Description
33
 
34
- ### Overview
35
 
36
- Recent large language models have shifted SVG generation from differentiable rendering optimization to autoregressive program synthesis. However, existing approaches still rely on **generic byte-level tokenization** inherited from natural language processing, which poorly reflects the geometric structure of vector graphics — numerical coordinates are fragmented into discrete symbols, destroying spatial relationships and inflating token length and computational cost.
37
 
38
- **HiVG** addresses these fundamental challenges through a hierarchical SVG tokenization framework:
39
 
40
- 1. **Atomic Tokens (Level 1):** Raw SVG strings are decomposed into structured atomic tokens that preserve the full geometric semantics of SVG commands (structure, command type, and coordinates).
41
- 2. **Segment Tokens (Level 2):** Executable command–parameter groups are further compressed into geometry-constrained segment tokens, substantially improving sequence efficiency while preserving syntactic validity.
42
- 3. **Hierarchical Mean-Noise Initialization:** A novel embedding initialization strategy that bridges the gap between pre-trained LLM embeddings and the new SVG token space.
43
- 4. **Curriculum Training Paradigm:** A training strategy that progressively increases SVG program complexity, enabling more stable learning of executable SVG programs.
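The two tokenization levels in the list above can be illustrated with a toy example. This is only a rough sketch and not the HiVG tokenizer: the regular expression, the function names `atomic_tokens` and `segment_tokens`, and the grouping rule (one command letter plus the numbers that follow it) are all assumptions made for illustration.

```python
import re

# Hypothetical token pattern: one SVG path command letter, or one number.
_TOKEN_RE = re.compile(r"[MmLlHhVvCcSsQqTtAaZz]|-?\d*\.?\d+(?:[eE][-+]?\d+)?")


def atomic_tokens(path_d: str) -> list[str]:
    """Level 1 (toy): split a path 'd' string into command and coordinate tokens."""
    return _TOKEN_RE.findall(path_d)


def segment_tokens(atoms: list[str]) -> list[tuple[str, ...]]:
    """Level 2 (toy): group each command with the coordinates that follow it."""
    segments: list[tuple[str, ...]] = []
    current: list[str] = []
    for tok in atoms:
        if tok.isalpha():
            # A command letter starts a new segment.
            if current:
                segments.append(tuple(current))
            current = [tok]
        else:
            current.append(tok)
    if current:
        segments.append(tuple(current))
    return segments


atoms = atomic_tokens("M 10 20 L 30 40 Z")
segments = segment_tokens(atoms)
print(atoms)
print(segments)
```

In this toy case, seven atomic tokens collapse into three command-parameter groups, which is the kind of sequence shortening the segment level aims for. The actual vocabulary and geometric constraints are described in the paper.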
44
 
45
- ### Architecture
46
 
47
- - **Parameters:** ~3B (4B total including vision encoder)
48
- - **Training Strategy:** Full-parameter Supervised Fine-Tuning (SFT) with **frozen vision encoder**
49
- - **Tokenization:** Hierarchical SVG tokenizer (atomic + segment tokens)
50
 
51
  ## Intended Uses
52
 
@@ -74,44 +90,6 @@ Recent large language models have shifted SVG generation from differentiable ren
74
 
75
  Please refer to the [paper](https://arxiv.org/abs/2604.05072) for detailed compute specifications.
76
 
77
- ## Evaluation
78
-
79
- ### Tasks
80
-
81
- The model was evaluated on both:
82
- - **Text-to-SVG** generation
83
- - **Image-to-SVG** generation (vectorization)
84
-
85
- ### Results
86
-
87
- Extensive experiments demonstrate that HiVG improves:
88
- - **Generation fidelity** — higher visual quality of rendered SVGs
89
- - **Spatial consistency** — better preservation of geometric layouts and spatial relationships
90
- - **Sequence efficiency** — significantly shorter token sequences compared to conventional byte-level tokenization schemes
91
-
92
- For detailed quantitative results, tables, and comparisons with baselines (e.g., StarVector, DuetSVG), please refer to the [paper](https://arxiv.org/abs/2604.05072).
93
-
94
- ## How to Use
95
-
96
- ```python
97
- from hivg_infer import HiSVGInferencePipeline
98
-
99
- pipeline = HiSVGInferencePipeline(
100
-     model_path="/path/to/model",
101
-     coord_range=234,
102
-     temperature=0.7,
103
-     top_p=0.9,
104
-     max_new_tokens=4096,
105
- )
106
-
107
- # Image-to-SVG
108
- result = pipeline.img2svg("assets/cases/w2.png")
109
- if result["success"]:
110
-     print(result["svg"])
111
- ```
112
-
113
- > Note: For detailed inference code, data preprocessing, and the hierarchical SVG tokenizer/detokenizer, please visit the [project page](https://hy-hivg.github.io/) and the associated code repository.
114
-
115
  ## Citation
116
 
117
  If you find this work helpful, please cite:
 
20
  results: []
21
  ---
22
 
23
+ # HiVG: Hierarchical SVG Tokenization
24
 
25
  **HiVG-3B-Base** is a 3B-parameter vision-language model for **autoregressive Scalable Vector Graphics (SVG) generation**. It is the base model from the paper [**"Hierarchical SVG Tokenization: Learning Compact Visual Programs for Scalable Vector Graphics Modeling"**](https://arxiv.org/abs/2604.05072).
26
 
 
29
  | 📄 [Paper](https://arxiv.org/abs/2604.05072) | 🏠 [Project Page](https://hy-hivg.github.io/) | 🤗 [Paper Page](https://huggingface.co/papers/2604.05072) |
30
  |---|---|---|
31
 
32
+ ## Highlights
33
 
34
+ - **Small Model, Frontier Results** — 3B parameters that beat 7/7 proprietary models including GPT-5 and Gemini 2.5 on image-to-SVG.
35
+ - **Efficient SVG Token Compression** — Hierarchical tokenization (Raw SVG → Atomic tokens → Segment tokens) with 2.76x sequence compression.
36
+ - **High-Fidelity Image-to-SVG** — Convert any image into a clean, editable SVG — structure, layout, and detail faithfully preserved.
37
 
38
+ ## Quick Start
39
 
40
+ You can use the provided inference pipeline for both image-to-SVG and text-to-SVG tasks.
41
 
42
+ ```python
43
+ from hivg_infer import HiSVGInferencePipeline
44
+
45
+ pipeline = HiSVGInferencePipeline(
46
+     model_path="xingxm/HiVG-3B-Base",
47
+     coord_range=234,
48
+     temperature=0.7,
49
+     top_p=0.9,
50
+     max_new_tokens=4096,
51
+ )
52
+
53
+ # Image-to-SVG
54
+ result = pipeline.img2svg("path/to/your_image.png")
55
+ if result["success"]:
56
+     print(result["svg"])
57
 
58
+ # Text-to-SVG
59
+ result = pipeline.text2svg("A minimalist black phone icon with an outline style")
60
+ if result["success"]:
61
+     with open("output.svg", "w") as f:
62
+         f.write(result["svg"])
63
+ ```
64
 
65
+ > Note: For detailed inference code, data preprocessing, and the hierarchical SVG tokenizer/detokenizer, please visit the [project page](https://hy-hivg.github.io/) and the associated code repository.
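Since the pipeline returns the SVG as a string in `result["svg"]`, it may be worth checking that the string is well-formed XML before saving it. The sketch below uses only the Python standard library; the helper name `is_well_formed_svg` and the sample string are hypothetical and not part of `hivg_infer`.

```python
import xml.etree.ElementTree as ET


def is_well_formed_svg(svg_text: str) -> bool:
    """Return True if svg_text parses as XML and its root element is <svg>."""
    try:
        root = ET.fromstring(svg_text)
    except ET.ParseError:
        return False
    # The tag may carry a namespace, e.g. "{http://www.w3.org/2000/svg}svg".
    return root.tag.rsplit("}", 1)[-1] == "svg"


# Hypothetical sample standing in for result["svg"].
sample = (
    '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100">'
    '<path d="M 10 20 L 30 40 Z"/></svg>'
)
print(is_well_formed_svg(sample))
```

This only checks XML well-formedness and the root tag; it does not validate path data or guarantee that the file renders as intended.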
66
 
67
  ## Intended Uses
68
 
 
90
 
91
  Please refer to the [paper](https://arxiv.org/abs/2604.05072) for detailed compute specifications.
92
 
93
  ## Citation
94
 
95
  If you find this work helpful, please cite: