Multimodal OCR 🍍 • Running on Zero • MCP • 395 • nanonets ocr2 / olmocr / qwen2vl ocr / aya vision / rolmocr
Multimodal OCR2 💻 • Running on Zero • MCP • Featured • 140 • nanonets ocr / smoldocling / monkey ocr / typhoon ocr
Vitron: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing • Paper • 2412.19806 • Published Oct 8, 2024 • 2