---
title: olmOCR Document OCR (CPU)
emoji: 📄
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
python_version: 3.11
pinned: false
license: apache-2.0
---

# olmOCR: Document OCR with Vision Language Models (CPU Version)

This Space uses the olmOCR model to extract text from PDF and image files, optimized for CPU deployment.

## Features
- PDF and image support (PNG, JPEG)
- Page-by-page processing for PDFs
- Optimized for CPU inference
- Free tier deployment
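
As a minimal sketch of how uploads might be routed by type, the snippet below maps a file extension to either the PDF path (page-by-page) or the single-image path. The `classify_upload` function and the `SUPPORTED` table are hypothetical names used for illustration; they are not taken from this Space's `app.py`.

```python
from pathlib import Path

# Hypothetical extension table: the README lists PDF, PNG and JPEG as supported.
SUPPORTED = {
    ".pdf": "pdf",     # processed page by page
    ".png": "image",   # processed as a single page
    ".jpg": "image",
    ".jpeg": "image",
}


def classify_upload(filename: str) -> str:
    """Return "pdf" or "image" for a supported file, else raise ValueError."""
    suffix = Path(filename).suffix.lower()
    try:
        return SUPPORTED[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file type: {suffix or filename!r}") from None
```

Matching on the lowercased suffix keeps names such as `scan.PDF` working; a real app may prefer to inspect the file contents instead of trusting the extension.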

## Performance Notes
- Processing time: 30-90 seconds per page on CPU
- Image resolution reduced to 1024px for efficiency
- Uses greedy decoding for faster inference
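
The notes above can be illustrated with a small sketch. It assumes that "reduced to 1024px" means the longest side is capped at 1024 pixels with the aspect ratio preserved, and that the 30-90 second range scales linearly with page count; both are assumptions, and `fit_longest_side` and `estimate_seconds` are hypothetical helpers, not functions from this Space.

```python
def fit_longest_side(width: int, height: int, max_side: int = 1024) -> tuple[int, int]:
    """Scale (width, height) down so the longest side is at most max_side.

    Images that already fit are returned unchanged (no upscaling).
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    # Keep each side at least 1 pixel for extreme aspect ratios.
    return max(1, round(width * scale)), max(1, round(height * scale))


def estimate_seconds(pages: int, low: int = 30, high: int = 90) -> tuple[int, int]:
    """Rough (best, worst) processing time, assuming linear cost per page."""
    return pages * low, pages * high
```

For example, a 2048x1536 scan would be resized to 1024x768 under this assumption, and a 10-page PDF would take roughly 5 to 15 minutes.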

## Model
This Space runs `allenai/olmOCR-2-7B-1025`, configured for CPU inference.