yueyang2000 committed on
Commit b5762d5 · verified · 1 Parent(s): c9ce196

Upload folder using huggingface_hub

Files changed (4)
  1. .gitattributes +1 -0
  2. README.md +39 -0
  3. assets/Compare_Recon.png +3 -0
  4. insight_tok.pt +3 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ assets/Compare_Recon.png filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,3 +1,42 @@
  ---
  license: mit
+ tags:
+ - discrete tokenization
+ - autoregressive generation
  ---
+ # InsightTok
+
+ InsightTok is a discrete visual tokenizer designed to improve the fidelity of **text** and **faces**, two of the most challenging yet perceptually important structures in autoregressive image generation.
+
+ It was introduced in the paper *InsightTok: Improving Text and Face Fidelity in Discrete Tokenization for Autoregressive Image Generation*.
+
+ - **Paper:** [InsightTok: Improving Text and Face Fidelity in Discrete Tokenization for Autoregressive Image Generation](https://huggingface.co/papers)
+ - **Code:** [https://github.com/LeapLabTHU/JustGRPO](https://github.com/LeapLabTHU/JustGRPO)
+
+ ## Hyperparameters
+
+ - Downsampling Rate: 16x
+ - Codebook Size: 16384
+ - Latent Dimension: 256
+ - Number of parameters: 426M
+
+ ## Performance
+
+ <p align="center">
+ <img src="assets/Compare_Recon.png" width="95%">
+ </p>
+
+ ## Usage
+
+ Please refer to our [GitHub repository](https://github.com/LeapLabTHU/InsightTok).
+
+ ## Citation
+
+ ```bibtex
+ @article{yue2026insighttok,
+ title={InsightTok: Improving Text and Face Fidelity in Discrete Tokenization for Autoregressive Image Generation},
+ author={Yue, Yang and Wei, Fangyun and He, Tianyu and Zhao, Jinjing and Ni, Zanlin and Liu, Zeyu and Guo, Jiayi and Shi, Lei and Dong, Yue and Chen, Li and Li, Ji and Huang, Gao and Chen, Dong},
+ journal={arXiv preprint arXiv:TODO},
+ year={2026}
+ }
+ ```
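
As a rough illustration of what the hyperparameters in the README above imply, the sketch below computes the token grid and the raw bit budget per image. It assumes that the 16x downsampling applies to each spatial dimension and that each grid cell is encoded as a single codebook index; the README states neither. The 256 and 512 pixel resolutions and the function names are hypothetical, chosen only for the example.

```python
# Values taken from the README's Hyperparameters list.
DOWNSAMPLE = 16
CODEBOOK_SIZE = 16384


def token_grid(height, width, downsample=DOWNSAMPLE):
    """Return (rows, cols) of the token grid.

    Assumption: the downsampling factor applies per spatial dimension.
    """
    if height % downsample or width % downsample:
        raise ValueError("image sides must be divisible by the downsampling rate")
    return height // downsample, width // downsample


def bits_per_token(codebook_size=CODEBOOK_SIZE):
    """Smallest number of bits that can index every codebook entry."""
    return (codebook_size - 1).bit_length()


# Hypothetical square resolutions, used only for illustration.
for side in (256, 512):
    rows, cols = token_grid(side, side)
    tokens = rows * cols
    total_bits = tokens * bits_per_token()
    print(f"{side}x{side}: {rows}x{cols} grid, {tokens} tokens, {total_bits} bits")
```

Under these assumptions the sequence length grows with the square of the image side, which is the figure that matters most for an autoregressive generator consuming the tokens.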
assets/Compare_Recon.png ADDED

Git LFS Details

  • SHA256: 79f4b682d8836c1e447f704caed15a9fcf21d916b721e9b8fccabb53b25ffaac
  • Pointer size: 132 Bytes
  • Size of remote file: 1.73 MB
insight_tok.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bde6a0907097298ac541ff2f9da28b926f68483ea138757064e74abb51887483
+ size 1721213019
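
The three added lines follow the Git LFS pointer layout (`version`, `oid`, `size`). A minimal sketch of reading those fields and checking a downloaded file against them is shown below. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of any library, and the local path of the downloaded checkpoint is left to the caller.

```python
import hashlib
import os

# Pointer contents copied from the diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:bde6a0907097298ac541ff2f9da28b926f68483ea138757064e74abb51887483
size 1721213019
"""


def parse_lfs_pointer(text):
    """Split 'key value' lines into a dict with the hash algorithm, digest and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, info, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest named by the pointer."""
    if os.path.getsize(path) != info["size"]:
        return False
    hasher = hashlib.new(info["algo"])
    with open(path, "rb") as handle:
        # Read in chunks so a multi-gigabyte checkpoint is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == info["digest"]


info = parse_lfs_pointer(POINTER)
print(f"{info['algo']} digest, {info['size'] / 1e9:.2f} GB")
```

Checking the size first is a cheap way to reject a truncated download before hashing the whole file.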