Article CodeAgents + Structure: A Better Way to Execute Actions akseljoonas, m-ric • May 28, 2025 • 82
olmOCR Collection olmOCR is a document recognition pipeline for efficiently converting documents into plain text. olmocr.allenai.org • 12 items • Updated Dec 23, 2025 • 150
Article Holo1: New family of GUI automation VLMs powering GUI agent Surfer-H Hcompany • Jun 3, 2025 • 71
Article Visual Document Retrieval Goes Multilingual marco, cheesyFishes • Jan 10, 2025 • 78
Article Preference Optimization for Vision Language Models qgallouedec, vwxyzjn, merve, kashif • Jul 10, 2024 • 93
Article Decoding GPT-4'o': In-Depth Exploration of Its Mechanisms and Creating Similar AI. KingNish • May 21, 2024 • 35
The Rise and Potential of Large Language Model Based Agents: A Survey Paper • 2309.07864 • Published Sep 14, 2023 • 8
Article Welcome Gemma 2 - Google’s new open LLM philschmid, osanseviero, pcuenq, lewtun, tomaarsen, reach-vb • Jun 27, 2024 • 132
Article How to generate text: using different decoding methods for language generation with Transformers patrickvonplaten • Mar 1, 2020 • 297