---
title: DeepSeek OCR-2 API
emoji: 🔍
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.31.0
python_version: 3.11
app_file: app.py
pinned: false
license: apache-2.0
---

# DeepSeek-OCR-2 Table Structure Recognition API

High-accuracy OCR and table structure recognition using DeepSeek-OCR-2 (3B parameters).

## Features

- 📊 **Table Detection & Recognition**: Extract complex table structures
- 📦 **Cell-Level Bounding Boxes**: Precise coordinates for all cells
- 📋 **Header Detection**: Automatic header identification
- 🔗 **Merged Cells**: Rowspan/colspan support
- 🎯 **High Accuracy**: State-of-the-art performance

## API Usage

### Python Client

```python
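import base64
import os

import requests

# Read the token from the environment (assumes it is exported as HF_TOKEN)
YOUR_HF_TOKEN = os.environ["HF_TOKEN"]

# Load and encode image
with open("document.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

# Call API
response = requests.post(
    "https://your-username-space-name.hf.space/api/predict",
    json={"data": [image_b64]},
    headers={"Authorization": f"Bearer {YOUR_HF_TOKEN}"},
    timeout=120,
)
response.raise_for_status()  # Raise on HTTP errors instead of parsing an error body

result = response.json()
print(result)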
```

### cURL

```bash
curl -X POST https://your-username-space-name.hf.space/api/predict \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -d '{"data": ["base64_encoded_image"]}'
```

## Response Format

```json
{
  "status": "success",
  "tables": [
    {
      "bbox": [x1, y1, x2, y2],
      "cells": [
        {
          "row": 0,
          "col": 0,
          "rowSpan": 1,
          "colSpan": 1,
          "bbox": [x1, y1, x2, y2],
          "text": "Cell content"
        }
      ],
      "headers": [...],
      "rows": [...]
    }
  ],
  "blocks": [...],
  "text": "Extracted text...",
  "metadata": {
    "model": "deepseek-ai/DeepSeek-OCR-2",
    "device": "cuda",
    "image_size": [width, height]
  }
}
```
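
To work with merged cells, it can help to expand each table into a plain 2D grid. The sketch below assumes the `cells` entries carry zero-based `row`/`col` indices and `rowSpan`/`colSpan` counts as shown above. The helper name `table_to_grid` and the sample table are hypothetical and not part of the API.

```python
from typing import Any, Dict, List, Optional


def table_to_grid(table: Dict[str, Any]) -> List[List[Optional[str]]]:
    """Expand a table's cells into a 2D grid, repeating text across merged cells."""
    cells = table.get("cells", [])
    if not cells:
        return []

    # Grid size is the furthest row/column reached by any cell, spans included.
    n_rows = max(c["row"] + c.get("rowSpan", 1) for c in cells)
    n_cols = max(c["col"] + c.get("colSpan", 1) for c in cells)
    grid: List[List[Optional[str]]] = [[None] * n_cols for _ in range(n_rows)]

    for c in cells:
        for r in range(c["row"], c["row"] + c.get("rowSpan", 1)):
            for k in range(c["col"], c["col"] + c.get("colSpan", 1)):
                grid[r][k] = c.get("text", "")
    return grid


# Hypothetical sample: a header spanning two columns above two data cells.
sample_table = {
    "cells": [
        {"row": 0, "col": 0, "rowSpan": 1, "colSpan": 2, "text": "Header"},
        {"row": 1, "col": 0, "rowSpan": 1, "colSpan": 1, "text": "a"},
        {"row": 1, "col": 1, "rowSpan": 1, "colSpan": 1, "text": "b"},
    ]
}
grid = table_to_grid(sample_table)
print(grid)
```

Positions not covered by any cell stay `None`, which makes gaps in the recognized structure easy to spot.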

## Model Info

- **Model:** deepseek-ai/DeepSeek-OCR-2
- **Parameters:** 3B
- **Precision:** FP16
- **GPU:** T4 (16GB VRAM)
- **License:** Apache-2.0
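
As a rough sanity check on these figures, the weight footprint can be estimated from the parameter count and precision. This is a back-of-the-envelope sketch only: it assumes exactly 3 billion parameters at 2 bytes each and ignores activations, caches, and framework overhead. The function name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed for model weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


# 3B parameters in FP16 (2 bytes per parameter)
weights_gib = weight_memory_gib(3e9)
print(f"{weights_gib:.2f} GiB")  # prints "5.59 GiB"
```

Under these assumptions the weights take well under half of a T4's 16 GB, leaving the remainder for activations and runtime overhead.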

## Links

- [Model on HuggingFace](https://huggingface.co/deepseek-ai/DeepSeek-OCR-2)
- [Project Repository](https://git.epam.com/epm-gpt/badgerdoc/ls-extractor)
- [Documentation](https://git.epam.com/epm-gpt/badgerdoc/ls-extractor/-/tree/main/docs)

## Citation

```bibtex
@article{deepseek-ocr-2,
  title={DeepSeek-OCR-2: Advanced Document Understanding},
  author={DeepSeek AI},
  year={2026}
}
```