---
title: olmOCR Document OCR (CPU)
emoji: 📄
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
python_version: 3.11
pinned: false
license: apache-2.0
---

# olmOCR: Document OCR with Vision Language Models (CPU Version)

This Space uses the olmOCR model to extract text from PDF and image files, optimized for CPU deployment.

## Features

- PDF and image support (PNG, JPEG)
- Page-by-page processing for PDFs
- Optimized for CPU inference
- Free tier deployment

## Performance Notes

- Processing time: 30-90 seconds per page on CPU
- Image resolution reduced to 1024px for efficiency
- Uses greedy decoding for faster inference

## Model

Uses `allenai/olmOCR-2-7B-1025` optimized for CPU.
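
The resolution cap, greedy decoding, and per-page timing listed under Performance Notes can be illustrated with a minimal sketch. It is not taken from `app.py`: the helper names `fit_within`, `estimate_seconds`, and `GREEDY_KWARGS` are hypothetical, and it assumes that "1024px" refers to the longer side of the page image and that generation goes through the Hugging Face Transformers `generate` API, where `do_sample=False` with `num_beams=1` selects greedy decoding.

```python
# Illustrative sketch only; names and the "longer side" interpretation are assumptions.

def fit_within(width: int, height: int, max_side: int = 1024) -> tuple[int, int]:
    """Return (width, height) scaled so the longer side is at most max_side."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # small images are left untouched, never upscaled
    scale = max_side / longest
    # Preserve the aspect ratio; clamp to at least 1 pixel per side.
    return max(1, round(width * scale)), max(1, round(height * scale))


def estimate_seconds(pages: int, per_page: tuple[int, int] = (30, 90)) -> tuple[int, int]:
    """Rough (low, high) processing time in seconds from the 30-90 s/page figure."""
    low, high = per_page
    return pages * low, pages * high


# Keyword arguments that select greedy decoding in Transformers' generate():
# no sampling and a single beam (assumed to match what the app does).
GREEDY_KWARGS = {"do_sample": False, "num_beams": 1}

if __name__ == "__main__":
    print(fit_within(1700, 2200))  # a letter-size page rendered at 200 DPI
    print(estimate_seconds(10))    # a 10-page PDF
```

With these assumptions, a 10-page PDF would take roughly 5 to 15 minutes on CPU, which is worth keeping in mind before uploading long documents.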