Bk9x's Collections

VLM + OCR

updated Mar 19

  • 5CD-AI/Vintern-1B-v2

    Image-Text-to-Text • 0.9B • Updated Jan 17, 2025 • 507 • 81

  • erax-ai/EraX-VL-7B-V1.0

    Image-Text-to-Text • 8B • Updated Jan 15, 2025 • 112 • 43

  • granite-docling-258M demo

    Running on Zero • Agents • Featured • 276

    Convert images of documents to structured data and answer queries


  • datalab-to/chandra

    Image-Text-to-Text • 9B • Updated Mar 26 • 113k • 522

  • deepseek-ai/DeepSeek-OCR

    Image-Text-to-Text • 3B • Updated Nov 4, 2025 • 2.97M • 3.24k

  • Multimodal OCR3

    Running on Zero • MCP • 69

    Chandra-OCR / Nanonets-OCR2 / olmOCR-2 / Dots.OCR


  • lightonai/LightOnOCR-2-1B

    Image-Text-to-Text • 1B • Updated 12 days ago • 600k • 684

  • HuggingFaceFW/finepdfs

    Viewer • Updated Apr 3 • 476M • 60.1k • 860

  • baidu/Qianfan-OCR

    Image-Text-to-Text • 5B • Updated 17 days ago • 443k • 1.17k

    Note: A 4B model that performs direct image-to-Markdown conversion and supports a broad range of prompt-driven tasks, from structured document parsing and table extraction to chart understanding, document question answering, and key information extraction.
