yueyang2000 committed on
Commit a08409a · verified · 1 Parent(s): b5762d5

Upload folder using huggingface_hub

Files changed (3)
  1. .gitattributes +1 -0
  2. README.md +16 -7
  3. assets/Recon_Plot.png +3 -0
.gitattributes CHANGED
@@ -34,3 +34,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
  assets/Compare_Recon.png filter=lfs diff=lfs merge=lfs -text
+ assets/Recon_Plot.png filter=lfs diff=lfs merge=lfs -text
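
The added `.gitattributes` line routes the new image through Git LFS. As a rough illustration of how such a line breaks down, the sketch below splits it into a path pattern and its attributes. The helper name `parse_gitattributes_line` is hypothetical, and the parsing is deliberately simplified: it assumes whitespace-separated tokens and ignores quoted patterns and the `!attr` form.

```python
def parse_gitattributes_line(line: str) -> tuple[str, dict[str, object]]:
    """Split one .gitattributes line into (pattern, attributes).

    Simplified sketch: no quoted patterns, no "!attr" handling.
    """
    pattern, *tokens = line.split()
    attrs: dict[str, object] = {}
    for token in tokens:
        if token.startswith("-"):
            attrs[token[1:]] = False          # "-text": attribute is unset
        elif "=" in token:
            name, value = token.split("=", 1)
            attrs[name] = value               # "filter=lfs": attribute set to a value
        else:
            attrs[token] = True               # bare name: attribute is set
    return pattern, attrs


pattern, attrs = parse_gitattributes_line(
    "assets/Recon_Plot.png filter=lfs diff=lfs merge=lfs -text"
)
print(pattern)  # assets/Recon_Plot.png
print(attrs)    # {'filter': 'lfs', 'diff': 'lfs', 'merge': 'lfs', 'text': False}
```

Here `filter`, `diff`, and `merge` are all set to `lfs`, and `text` is unset, so Git treats the file as binary and hands it to the LFS filter.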
README.md CHANGED
@@ -11,19 +11,28 @@ InsightTok is a discrete visual tokenizer designed to improve the fidelity of **
  It was introduced in the paper *InsightTok: Improving Text and Face Fidelity in Discrete Tokenization for Autoregressive Image Generation*.
 
  - **Paper:**: [InsightTok: Improving Text and Face Fidelity in Discrete Tokenization for Autoregressive Image Generation](https://huggingface.co/papers)
- - **Code:** [https://github.com/LeapLabTHU/JustGRPO](https://github.com/LeapLabTHU/JustGRPO)
+ - **Code:** [https://github.com/LeapLabTHU/InsightTok](https://github.com/LeapLabTHU/InsightTok)
 
- ## Hyperparameters
+ ## Model Details
 
- - Downsampling Rate: 16x
- - Codebook Size: 16384
- - Latent Dimension: 256
- - Number of parameters: 426M
+ | Property | Value |
+ |---|---:|
+ | Downsampling rate | 16× |
+ | Codebook size | 16,384 |
+ | Latent dimension | 256 |
+ | Number of parameters | 426M |
 
  ## Performance
 
+ InsightTok achieves strong text and face reconstruction quality while maintaining a compact discrete representation.
+
+
+ <p align="center">
+ <img src="assets/Recon_Plot.png" width="100%">
+ </p>
+
  <p align="center">
- <img src="assets/Compare_Recon.png" width="95%">
+ <img src="assets/Compare_Recon.png" width="100%">
  </p>
 
  ## Usage
assets/Recon_Plot.png ADDED

Git LFS Details

  • SHA256: 499c0e896943c4f9b8631a1de06cbf65cd0a6362700267886b5e0f1cc48b5e85
  • Pointer size: 131 Bytes
  • Size of remote file: 114 kB
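
The 131-byte pointer size can be sanity-checked against the Git LFS pointer layout (a `version` line, an `oid sha256:` line, and a `size` line). The sketch below rebuilds such a pointer from the SHA256 listed above. The exact byte count of the remote file is not shown on this page, so `ASSUMED_SIZE` is a hypothetical stand-in for "114 kB"; the helper name `lfs_pointer` is also an assumption for illustration.

```python
LFS_SPEC = "https://git-lfs.github.com/spec/v1"


def lfs_pointer(sha256_hex: str, size_bytes: int) -> str:
    """Build the text of a Git LFS pointer file (three newline-terminated lines)."""
    return f"version {LFS_SPEC}\noid sha256:{sha256_hex}\nsize {size_bytes}\n"


# SHA256 taken from the Git LFS details above.
OID = "499c0e896943c4f9b8631a1de06cbf65cd0a6362700267886b5e0f1cc48b5e85"

# Hypothetical: the page only reports "114 kB", not the exact byte count.
ASSUMED_SIZE = 114_000

pointer = lfs_pointer(OID, ASSUMED_SIZE)
print(pointer)
print(len(pointer.encode("utf-8")))  # 131 for any six-digit size
```

The fixed parts of the pointer take 43 + 76 bytes, which leaves 12 bytes for the `size` line, so a 131-byte pointer implies a six-digit byte count. That is consistent with a remote file of roughly 114 kB.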