TongkunGuan committed on
Commit
4a119d5
·
verified ·
1 Parent(s): 47e6ad9

Update README.md

Files changed (1)
  1. README.md +9 -3
README.md CHANGED
@@ -20,13 +20,19 @@ we also devise a high-quality data production pipeline that constructs the first
  Furthermore, leveraging this foundation with exceptional image-as-text capability,
  we seamlessly replace previous VFMs with TokenOCR to construct a document-level MLLM, `TokenVL`, for VQA-based document understanding tasks.
 
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64119264f0f81eb569e0d569/o9_FX5D8_NOS1gfnebp5s.png)
-
  # Token Family
 
  ## TokenIT
 
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/650d4a36cbd0c7d550d3b41b/WcQwU3-xjyT5Vm-pZhACo.png)
+ <div align="center">
+ <img width="500" alt="image" src="https://cdn-uploads.huggingface.co/production/uploads/650d4a36cbd0c7d550d3b41b/WcQwU3-xjyT5Vm-pZhACo.png">
+ </div>
+
+ <!-- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/650d4a36cbd0c7d550d3b41b/WcQwU3-xjyT5Vm-pZhACo.png) -->
+ An overview of the self-constructed token-level TokenIT dataset, comprising 20 million images and 1.8 billion
+ text-mask pairs. (a) provides a detailed description of each sample, including the raw image, a mask, and a JSON file that
+ records BPE token information. We also count (b) the data distribution, (c) the number of selected BPE tokens, and (d) a
+ word cloud map highlighting the top 100 BPE tokens.
 
 
  | VFM | Granularity | Dataset | #Image | #Pairs |