nielsr HF Staff committed on
Commit 5800784 · verified · 1 Parent(s): a08409a

Improve model card and add metadata

This PR improves the model card by:
- Adding the `image-to-image` pipeline tag for better discoverability.
- Updating the paper link to the specific Hugging Face paper page.
- Adding a sample usage section with the code snippet found in the GitHub README.
- Updating the citation with the correct arXiv ID (2605.14333).

Files changed (1)
  1. README.md +15 -7
README.md CHANGED
@@ -1,16 +1,18 @@
  ---
  license: mit
  tags:
  - discrete tokenization
  - autoregressive generation
  ---
  # InsightTok

  InsightTok is a discrete visual tokenizer designed to improve the fidelity of **text** and **faces**, two of the most challenging yet perceptually important structures in autoregressive image generation.

- It was introduced in the paper *InsightTok: Improving Text and Face Fidelity in Discrete Tokenization for Autoregressive Image Generation*.

- - **Paper:**: [InsightTok: Improving Text and Face Fidelity in Discrete Tokenization for Autoregressive Image Generation](https://huggingface.co/papers)
  - **Code:** [https://github.com/LeapLabTHU/InsightTok](https://github.com/LeapLabTHU/InsightTok)

  ## Model Details
@@ -24,8 +26,7 @@ It was introduced in the paper *InsightTok: Improving Text and Face Fidelity in
  ## Performance

- InsightTok achieves strong text and face reconstruction quality while maintaining a compact discrete representation.
-

  <p align="center">
  <img src="assets/Recon_Plot.png" width="100%">
@@ -37,15 +38,22 @@ InsightTok achieves strong text and face reconstruction quality while maintainin
  ## Usage

- Please refer to our [GitHub repository](https://github.com/LeapLabTHU/InsightTok).

  ## Citation

  ```bibtex
  @article{yue2026insighttok,
  title={InsightTok: Improving Text and Face Fidelity in Discrete Tokenization for Autoregressive Image Generation},
- author={Yue, Yang and Wei, Fangyun and He, Tianyu and Zhao, Jinjing and Ni, Zanlin and Liu, Zeyu and Guo, Jiayi and Shi, Lei and Dong, Yue and Chen, Li and Li, Ji and Huang, Gao and Chen, Dong},
- journal={arXiv preprint arXiv:TODO},
  year={2026}
  }
  ```
 
  ---
  license: mit
+ pipeline_tag: image-to-image
  tags:
  - discrete tokenization
  - autoregressive generation
  ---
+
  # InsightTok

  InsightTok is a discrete visual tokenizer designed to improve the fidelity of **text** and **faces**, two of the most challenging yet perceptually important structures in autoregressive image generation.

+ It was introduced in the paper [InsightTok: Improving Text and Face Fidelity in Discrete Tokenization for Autoregressive Image Generation](https://huggingface.co/papers/2605.14333).

+ - **Paper:** [InsightTok: Improving Text and Face Fidelity in Discrete Tokenization for Autoregressive Image Generation](https://huggingface.co/papers/2605.14333)
  - **Code:** [https://github.com/LeapLabTHU/InsightTok](https://github.com/LeapLabTHU/InsightTok)

  ## Model Details
 
  ## Performance

+ InsightTok achieves strong text and face reconstruction quality while maintaining a compact discrete representation through localized, content-aware perceptual losses.

  <p align="center">
  <img src="assets/Recon_Plot.png" width="100%">
 
  ## Usage

+ InsightTok follows the standard VQGAN-style autoencoding interface. For setup and implementation details, please refer to the [GitHub repository](https://github.com/LeapLabTHU/InsightTok).
+
+ ```python
+ # image encoding
+ latents, _, [_, _, indices] = vq_model.encode(input_image_tensor)
+ # image decoding
+ recon_image_tensor = vq_model.decode(latents)
+ ```

  ## Citation

  ```bibtex
  @article{yue2026insighttok,
  title={InsightTok: Improving Text and Face Fidelity in Discrete Tokenization for Autoregressive Image Generation},
+ author={Yue, Yang and Wei, Fangyun and He, Tianyu and Zhao, Jinjing and Ni, Zanlin and Liu, Zeyu and Guo, Jiayi and Shi, Lei and Dong, Yue and Chen, Li and Li, Ji and Huang, Gao and Chen, Dong},
+ journal={arXiv preprint arXiv:2605.14333},
  year={2026}
  }
  ```
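
The usage snippet added in this commit unpacks `encode` into quantized latents, an ignored second value, and a nested list whose last element is `indices`. The toy model below is a minimal sketch of that calling pattern only. `ToyVQModel`, its codebook, and the identity `decode` are hypothetical stand-ins, not the InsightTok implementation, and treating the ignored second value as a quantization loss is an assumption borrowed from common VQGAN-style code rather than from this repository.

```python
from typing import List, Optional, Tuple

Vector = List[float]


class ToyVQModel:
    """Hypothetical stand-in for a VQGAN-style tokenizer (not the InsightTok API)."""

    def __init__(self, codebook: List[Vector]) -> None:
        self.codebook = codebook

    def _sq_dist(self, a: Vector, b: Vector) -> float:
        # Squared Euclidean distance between two vectors of equal length.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def _nearest(self, vec: Vector) -> int:
        # Index of the codebook entry closest to `vec`.
        return min(
            range(len(self.codebook)),
            key=lambda i: self._sq_dist(vec, self.codebook[i]),
        )

    def encode(
        self, features: List[Vector]
    ) -> Tuple[List[Vector], float, List[Optional[List[int]]]]:
        # Quantize each feature vector to its nearest codebook entry.
        indices = [self._nearest(v) for v in features]
        latents = [list(self.codebook[i]) for i in indices]
        # Assumed meaning of the second slot: mean squared quantization error.
        vq_loss = sum(
            self._sq_dist(v, q) for v, q in zip(features, latents)
        ) / len(features)
        # Assumed layout of the third slot: two unused entries, then the indices.
        return latents, vq_loss, [None, None, indices]

    def decode(self, latents: List[Vector]) -> List[Vector]:
        # A real decoder is a neural network; the toy version returns a copy.
        return [list(v) for v in latents]


vq_model = ToyVQModel(codebook=[[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
features = [[0.1, -0.1], [0.9, 1.2], [0.2, 0.8]]

# Same unpacking pattern as the README snippet.
latents, _, [_, _, indices] = vq_model.encode(features)
recon = vq_model.decode(latents)
```

With the released checkpoint, `vq_model` would be the loaded tokenizer and the inputs would be image tensors; the GitHub repository covers that setup.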