openbmb/VisRAG-Ret

Feature Extraction

tcy6 committed on Oct 23, 2024

Commit a932f2e (verified), 1 parent: 2069e2c

Update README.md

Files changed (1)

README.md +2 -1

@@ -41,7 +41,8 @@ In the paper, We use MiniCPM-V 2.0, MiniCPM-V 2.6 and GPT-4o as the generators.
 ## Training
 ### VisRAG-Ret
-Our training dataset of 362,110 Query-Document (Q-D) Pairs for **VisRAG-Ret** is comprised of train sets of openly available academic datasets (34%) and a synthetic dataset made up of pages from web-crawled PDF documents and augmented with VLM-generated (GPT-4o) pseudo-queries (66%).
+Our training dataset of 362,110 Query-Document (Q-D) Pairs for **VisRAG-Ret** is comprised of train sets of openly available academic datasets (34%) and a synthetic dataset made up of pages from web-crawled PDF documents and augmented with VLM-generated (GPT-4o) pseudo-queries (66%). It can be found in the `VisRAG` Collection on Hugging Face, which is referenced at the beginning of this page.
 ### VisRAG-Gen
 The generation part does not use any fine-tuning; we directly use off-the-shelf LLMs/VLMs for generation.