Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems Paper • 2504.01990 • Published Mar 31, 2025 • 301
Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models Article • Jun 24, 2024 • 205
Multimodal Augmentation for Documents: Recovering “Comprehension” in “Reading and Comprehension” task Article • May 16, 2024 • 17
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites Paper • 2404.16821 • Published Apr 25, 2024 • 58
Advancing LLM Reasoning Generalists with Preference Trees Paper • 2404.02078 • Published Apr 2, 2024 • 46
Self-Discover: Large Language Models Self-Compose Reasoning Structures Paper • 2402.03620 • Published Feb 6, 2024 • 117