metadata

title: DeepSeek OCR-2 API
emoji: 🔍
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.31.0
python_version: 3.11
app_file: app.py
pinned: false
license: apache-2.0

DeepSeek-OCR-2 Table Structure Recognition API

High-accuracy OCR and table structure recognition using DeepSeek-OCR-2 (3B parameters).

Features

📊 Table Detection & Recognition: Extract complex table structures
📦 Cell-Level Bounding Boxes: Precise coordinates for all cells
📋 Header Detection: Automatic header identification
🔗 Merged Cells: Rowspan/colspan support
🎯 High Accuracy: State-of-the-art performance

API Usage

Python Client

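import base64
import os

import requests

# Read the Hugging Face token from the environment.
# The variable name HF_TOKEN is an assumption; use whatever your setup provides.
YOUR_HF_TOKEN = os.environ["HF_TOKEN"]

# Load and encode image
with open("document.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

# Call API
response = requests.post(
    "https://your-username-space-name.hf.space/api/predict",
    json={"data": [image_b64]},
    headers={"Authorization": f"Bearer {YOUR_HF_TOKEN}"},
    timeout=120,  # inference on a shared GPU can be slow
)
response.raise_for_status()  # raise on HTTP errors instead of parsing an error body

result = response.json()
print(result)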

cURL

curl -X POST https://your-username-space-name.hf.space/api/predict \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -d '{"data": ["base64_encoded_image"]}'

Response Format

{
  "status": "success",
  "tables": [
    {
      "bbox": [x1, y1, x2, y2],
      "cells": [
        {
          "row": 0,
          "col": 0,
          "rowSpan": 1,
          "colSpan": 1,
          "bbox": [x1, y1, x2, y2],
          "text": "Cell content"
        }
      ],
      "headers": [...],
      "rows": [...]
    }
  ],
  "blocks": [...],
  "text": "Extracted text...",
  "metadata": {
    "model": "deepseek-ai/DeepSeek-OCR-2",
    "device": "cuda",
    "image_size": [width, height]
  }
}
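
Because each cell carries its own row, col, rowSpan, and colSpan, a client usually has to expand merged cells before it can treat a table as a plain grid. The following is a minimal sketch of that step, not part of this Space's code. It assumes that row and col are zero-based (as the example above suggests) and that the spans count grid positions. The helper name table_to_grid and the sample data are hypothetical.

```python
from typing import Optional


def table_to_grid(table: dict) -> list[list[Optional[str]]]:
    """Expand a table's cell list into a dense, row-major grid.

    A merged cell's text is repeated in every grid position it covers.
    Positions not covered by any cell stay None.
    """
    cells = table.get("cells", [])
    if not cells:
        return []

    # Grid size is the furthest extent reached by any cell, spans included.
    n_rows = max(c["row"] + c.get("rowSpan", 1) for c in cells)
    n_cols = max(c["col"] + c.get("colSpan", 1) for c in cells)
    grid: list[list[Optional[str]]] = [[None] * n_cols for _ in range(n_rows)]

    for c in cells:
        for r in range(c["row"], c["row"] + c.get("rowSpan", 1)):
            for k in range(c["col"], c["col"] + c.get("colSpan", 1)):
                grid[r][k] = c["text"]
    return grid


# Hypothetical table shaped like one entry of "tables" in the response above.
sample_table = {
    "bbox": [0, 0, 200, 60],
    "cells": [
        {"row": 0, "col": 0, "rowSpan": 1, "colSpan": 2,
         "bbox": [0, 0, 200, 30], "text": "Totals"},
        {"row": 1, "col": 0, "rowSpan": 1, "colSpan": 1,
         "bbox": [0, 30, 100, 60], "text": "Q1"},
        {"row": 1, "col": 1, "rowSpan": 1, "colSpan": 1,
         "bbox": [100, 30, 200, 60], "text": "42"},
    ],
}

grid = table_to_grid(sample_table)
for row in grid:
    print(row)
```

With a real response, the same helper could be applied to each entry of result["tables"]. Repeating the text of a merged cell is only one possible convention; leaving the covered positions empty may suit some downstream tools better.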

Model Info

Model: deepseek-ai/DeepSeek-OCR-2
Parameters: 3B
Precision: FP16
GPU: T4 (16GB VRAM)
License: Apache-2.0
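
As a rough check that these figures fit together, the weight footprint can be estimated from the parameter count and precision. This is a back-of-the-envelope sketch only: it treats "3B" as exactly 3 × 10^9 parameters, counts 1 GB as 10^9 bytes, and ignores activations, caches, and CUDA overhead, all of which take additional memory.

```python
PARAMS = 3_000_000_000       # "3B", assumed to be exactly 3e9
BYTES_PER_PARAM_FP16 = 2     # FP16 stores each parameter in 2 bytes
T4_VRAM_GB = 16              # T4 memory as listed above

# Memory for the weights alone, in decimal gigabytes.
weights_gb = PARAMS * BYTES_PER_PARAM_FP16 / 1e9

# What remains for activations, caches, and runtime overhead.
headroom_gb = T4_VRAM_GB - weights_gb

print(f"weights: {weights_gb:.1f} GB, headroom: {headroom_gb:.1f} GB")
```

Under these assumptions the weights take about 6 GB, leaving roughly 10 GB of the T4's memory for everything else.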

Citation

@article{deepseek-ocr-2,
  title={DeepSeek-OCR-2: Advanced Document Understanding},
  author={DeepSeek AI},
  year={2026}
}