French OCR datasets - a lbourdois Collection

lbourdois 's Collections

French Courses Translations

French paraphrase dataset

French summarization datasets

French prompts datasets

French DPO and conversation datasets

French think and toolcalling datasets

French embedding datasets

French VQA datasets

French caption datasets

French OCR datasets

French retriever datasets

French table-to-text datasets

French audio datasets (pretraining)

French OCR datasets

updated Feb 8

Datasets I cleaned with an image, a prompt question (like "transcribe the text in this image") and an answer. Can be used to train VLMs.