lusxvr committed
Commit 6342005 · 1 Parent(s): af61be0
Files changed (1)
  1. app/src/content/article.mdx +2 -0
app/src/content/article.mdx CHANGED
@@ -39,6 +39,7 @@ Even though open-weights Vision-Language Models (VLMs) are becoming ever more po
 ### Data Collection
 We manually collect over 180 image-text datasets from the recent literature and create new subsets in lacking domains.
 
+<Wide>
 <Accordion title="FineVision Subsets">
 |Subset Name |Total Images|Total Samples|Total Turns|Total Question Tokens|Total Answer Tokens|Category |
 |--------------------------------------|------------|-------------|-----------|---------------------|-------------------|----------------------|
@@ -228,6 +229,7 @@ We manually collect over 180 image-text datasets from the recent literature and
 |text_wizardlm_evol                    |0           |69,999       |69,999     |7,753,963            |21,955,856         |Text-only             |
 |text_OpenMathInstruct-2               |0           |1,000,000    |1,000,000  |74,905,850           |413,132,418        |Text-only             |
 </Accordion>
+</Wide>
 
 ### Cleaning
 After gathering all the sub-datasets, every turn is cleaned. We remove all individual turns whose combined question and answer length exceeds 8192 tokens. We resize big images to have a longest side of 2048 pixels while keeping the aspect ratio, and discard images with corrupted metadata. This results in a clean final dataset with a maximum turn length of 8192 tokens and a maximum image dimension of 2048 pixels on the longest side.
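The cleaning rules described above (drop turns over the 8192-token budget, cap image dimensions at 2048 pixels on the longest side) can be sketched as follows. This is a minimal illustration, not code from the FineVision pipeline; the function names, the turn dictionary layout, and the `count_tokens` callback are all assumptions.

```python
# Sketch of the cleaning step: filter over-long turns and compute capped
# image sizes. Names and data layout are illustrative assumptions.

MAX_TURN_TOKENS = 8192  # combined question + answer budget per turn
MAX_SIDE = 2048         # longest allowed image side in pixels

def clean_turns(turns, count_tokens, max_tokens=MAX_TURN_TOKENS):
    """Keep only turns whose question + answer fit the token budget."""
    return [
        t for t in turns
        if count_tokens(t["question"]) + count_tokens(t["answer"]) <= max_tokens
    ]

def target_size(width, height, max_side=MAX_SIDE):
    """New (width, height) with the longest side capped, aspect ratio kept."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough, leave untouched
    scale = max_side / longest
    return round(width * scale), round(height * scale)
```

For example, `target_size(4096, 1024)` yields `(2048, 512)`. The actual resizing and the corrupted-metadata check would go through an imaging library such as Pillow.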