The Hidden Gallery

community

https://github.com/The-Hidden-Gallery

AI & ML interests

AR, Art, Technology

de-Rodrigo

authored a paper 4 months ago

VERSE: Visual Embedding Reduction and Space Exploration. Clustering-Guided Insights for Training Data Enhancement in Visually-Rich Document Understanding

Paper • 2601.05125 • Published Jan 8 • 2

de-Rodrigo

posted an update 4 months ago

We are happy to share the VERSE Methodology paper via arXiv! 📃💫

VERSE: Visual Embedding Reduction and Space Exploration. Clustering-Guided Insights for Training Data Enhancement in Visually-Rich Document Understanding (2601.05125)

We usually train VLMs on synthetic visual data that we (as humans) label as photorealistic. We argue that this is an anthropocentric perspective imposed on a model that might not synthesize visual information as we do. VERSE helps visualize the latent space and overlay visual features on it to detect poor-performance regions, so that better-suited training sets can be added to boost model performance.
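To make the idea of "poor-performance regions" concrete, here is a minimal sketch, not the VERSE implementation. It assumes the embeddings have already been reduced and clustered, and that a per-sample score (for example F1) is available; the function name `flag_weak_clusters`, the threshold, and the toy data are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def flag_weak_clusters(cluster_ids, scores, threshold):
    """Return {cluster_id: mean_score} for clusters whose mean score is below threshold."""
    by_cluster = defaultdict(list)
    for cid, score in zip(cluster_ids, scores):
        by_cluster[cid].append(score)
    # Average the scores per cluster, then keep only the low-scoring ones.
    means = {cid: mean(vals) for cid, vals in by_cluster.items()}
    return {cid: m for cid, m in means.items() if m < threshold}


# Hypothetical inputs: one cluster id and one F1 score per evaluated sample.
cluster_ids = [0, 0, 1, 1, 1, 2, 2]
f1_scores = [0.92, 0.88, 0.41, 0.35, 0.50, 0.79, 0.83]

weak = flag_weak_clusters(cluster_ids, f1_scores, threshold=0.6)
print(weak)  # only cluster 1 falls below the threshold
```

Samples in the flagged clusters are the ones whose visual features would be worth inspecting when choosing additional training data.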

Resources:

- Code: https://github.com/nachoDRT/VrDU-Doctor
- Hugging Face Space: de-Rodrigo/Embeddings

Want to collaborate? Do you have any feedback? 🧐

PS: As always, we are grateful to Hugging Face 🤗 for providing the fantastic tools and resources we find on the platform!

de-Rodrigo

submitted a paper to Daily Papers 4 months ago

VERSE: Visual Embedding Reduction and Space Exploration. Clustering-Guided Insights for Training Data Enhancement in Visually-Rich Document Understanding

Paper • 2601.05125 • Published Jan 8 • 2

de-Rodrigo

posted an update over 1 year ago

MERIT Dataset 🎒📃🏆 Updates: The Token Classification Version is Now Live on the Hub!

This new version extends the previous dataset by providing richer labels that include word bounding boxes alongside the already available images. 🚀
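As an illustration of how word bounding boxes are typically consumed in token classification, here is a small sketch. It assumes a LayoutLM-style model, which expects boxes rescaled to a 0-1000 range; the helper name `normalize_bbox`, the `(x0, y0, x1, y1)` pixel format, and the example values are assumptions, not the confirmed MERIT schema.

```python
def normalize_bbox(bbox, page_width, page_height, scale=1000):
    """Rescale a pixel-space (x0, y0, x1, y1) box to a 0..scale range."""
    x0, y0, x1, y1 = bbox
    return [
        int(scale * x0 / page_width),
        int(scale * y0 / page_height),
        int(scale * x1 / page_width),
        int(scale * y1 / page_height),
    ]


# Hypothetical word box on a 1000 x 2000 pixel page image.
box = normalize_bbox((100, 200, 300, 250), page_width=1000, page_height=2000)
print(box)  # [100, 100, 300, 125]
```

Check the dataset card for the actual field names and coordinate convention before reusing this.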

We can't wait to see how you use this update! Give it a try, and let us know your thoughts, questions, or any cool projects you build with it. 💡

Resources:

- Dataset: de-Rodrigo/merit
- Code and generation pipeline: https://github.com/nachoDRT/MERIT-Dataset
- Paper: The MERIT Dataset: Modelling and Efficiently Rendering Interpretable Transcripts (2409.00447)

de-Rodrigo

posted an update over 1 year ago

A few weeks ago, we uploaded the MERIT Dataset 🎒📃🏆 to Hugging Face 🤗!

Now, we are excited to share the MERIT Dataset paper via arXiv! 📃💫
The MERIT Dataset: Modelling and Efficiently Rendering Interpretable Transcripts (2409.00447)

The MERIT Dataset is a fully synthetic, labeled dataset created for training and benchmarking LLMs on Visually Rich Document Understanding tasks. It is also designed to help detect biases and improve interpretability in LLMs, an area we are actively working on. 🔧🔨

MERIT contains synthetically rendered student transcripts of records from different schools in English and Spanish. We plan to expand the dataset to other contexts (synthetic medical/insurance documents, synthetic IDs, etc.). Want to collaborate? Do you have any feedback? 🧐

Resources:

- Dataset: de-Rodrigo/merit
- Code and generation pipeline: https://github.com/nachoDRT/MERIT-Dataset

PS: We are grateful to Hugging Face 🤗 for providing the fantastic tools and resources we find on the platform and, more specifically, to @nielsr for sharing the fine-tuning/inference scripts we used in our benchmark.

Team members 1