TongkunGuan/TokenFD


TongkunGuan committed on Feb 21, 2025

Commit 4a119d5 · verified · 1 Parent(s): 47e6ad9

Update README.md

Files changed (1)

README.md +9 -3

@@ -20,13 +20,19 @@ we also devise a high-quality data production pipeline that constructs the first
 Furthermore, leveraging this foundation with exceptional image-as-text capability,
 we seamlessly replace previous VFMs with TokenOCR to construct a document-level MLLM, `TokenVL`, for VQA-based document understanding tasks.
-![image/png](https://cdn-uploads.huggingface.co/production/uploads/64119264f0f81eb569e0d569/o9_FX5D8_NOS1gfnebp5s.png)
 # Token Family
 ## TokenIT
-![image/png](https://cdn-uploads.huggingface.co/production/uploads/650d4a36cbd0c7d550d3b41b/WcQwU3-xjyT5Vm-pZhACo.png)
 | VFM                | Granularity | Dataset  | #Image | #Pairs |

 Furthermore, leveraging this foundation with exceptional image-as-text capability,
 we seamlessly replace previous VFMs with TokenOCR to construct a document-level MLLM, `TokenVL`, for VQA-based document understanding tasks.
 # Token Family
 ## TokenIT
+<div align="center">
+  <img width="500" alt="image" src="https://cdn-uploads.huggingface.co/production/uploads/650d4a36cbd0c7d550d3b41b/WcQwU3-xjyT5Vm-pZhACo.png">
+</div>
+<!-- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/650d4a36cbd0c7d550d3b41b/WcQwU3-xjyT5Vm-pZhACo.png) -->
+An overview of the self-constructed token-level TokenIT dataset, comprising 20 million images and 1.8 billion
+text-mask pairs. (a) provides a detailed description of each sample, including the raw image, a mask, and a JSON file that
+records BPE token information. We also count (b) the data distribution, (c) the number of selected BPE tokens, and (d) a
+word cloud map highlighting the top 100 BPE tokens.
 | VFM                | Granularity | Dataset  | #Image | #Pairs |