Training Long-Context Vision-Language Models Effectively with Generalization Beyond 128K Context Paper • 2605.13831 • Published May 13 • 88
OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations Paper • 2412.07626 • Published Dec 10, 2024 • 30
Qwen/Qwen3-Coder-480B-A35B-Instruct Text Generation • 480B • Updated Aug 21, 2025 • 43.5k • • 1.34k
meta-llama/Llama-3.2-11B-Vision Image-Text-to-Text • 11B • Updated Sep 27, 2024 • 6.68k • 589
meta-llama/Llama-3.2-11B-Vision-Instruct Image-Text-to-Text • 11B • Updated Dec 4, 2024 • 102k • 1.61k
meta-llama/Llama-3.2-90B-Vision-Instruct Image-Text-to-Text • 89B • Updated Mar 4, 2025 • 694 • 359
Zephyr 7B Collection Models, datasets, and demos associated with Zephyr 7B. For code to train the models, see: https://github.com/huggingface/alignment-handbook • 8 items • Updated Mar 2 • 153