Article DS-MoE: Making MoE Models More Efficient and Less Memory-Intensive Apr 9, 2024 • 30
MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer Paper • 2509.16197 • Published Sep 19 • 56
Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action Paper • 2312.17172 • Published Dec 28, 2023 • 30
Article Introducing Idefics2: A Powerful 8B Vision-Language Model for the community Apr 15, 2024 • 191
Article Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset Mar 15, 2024 • 13
RoRF Jina Collection RoRF (Routing on Random Forests) trained on Jina AI's SOTA open-source embeddings • 6 items • Updated Sep 24, 2024 • 1
MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft Paper • 2504.08388 • Published Apr 11 • 42
LLM papers Collection A collection of papers that are useful for studying LLMs. • 14 items • Updated Apr 3, 2024 • 15
Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation Paper • 2507.10524 • Published Jul 14 • 70