Spaces:

lsextractor
/

deepseek-ocr2-api

Running

App Files Files Community

deepseek-ocr2-api / README.md

lsextractor

Deploy deepseek-ai/DeepSeek-OCR-2

74c8e47 verified about 1 month ago

preview code

raw

history blame contribute delete

2.38 kB

	---
	title: DeepSeek OCR-2 API
	emoji: 🔍
	colorFrom: blue
	colorTo: purple
	sdk: gradio
	sdk_version: 4.31.0
	python_version: 3.11
	app_file: app.py
	pinned: false
	license: apache-2.0
	---

	# DeepSeek-OCR-2 Table Structure Recognition API

	High-accuracy OCR and table structure recognition using DeepSeek-OCR-2 (3B parameters).

	## Features

	- 📊 Table Detection & Recognition: Extract complex table structures
	- 📦 Cell-Level Bounding Boxes: Precise coordinates for all cells
	- 📋 Header Detection: Automatic header identification
	- 🔗 Merged Cells: Rowspan/colspan support
	- 🎯 High Accuracy: State-of-the-art performance

	## API Usage

	### Python Client

	```python
	import requests
	import base64

	# Load and encode image
	with open("document.png", "rb") as f:
	image_b64 = base64.b64encode(f.read()).decode()

	# Call API
	response = requests.post(
	"https://your-username-space-name.hf.space/api/predict",
	json={"data": [image_b64]},
	headers={"Authorization": f"Bearer {YOUR_HF_TOKEN}"}
	)

	result = response.json()
	print(result)
	```

	### cURL

	```bash
	curl -X POST https://your-username-space-name.hf.space/api/predict \
	-H "Content-Type: application/json" \
	-H "Authorization: Bearer YOUR_HF_TOKEN" \
	-d '{"data": ["base64_encoded_image"]}'
	```

	## Response Format

	```json
	{
	"status": "success",
	"tables": [
	{
	"bbox": [x1, y1, x2, y2],
	"cells": [
	{
	"row": 0,
	"col": 0,
	"rowSpan": 1,
	"colSpan": 1,
	"bbox": [x1, y1, x2, y2],
	"text": "Cell content"
	}
	],
	"headers": [...],
	"rows": [...]
	}
	],
	"blocks": [...],
	"text": "Extracted text...",
	"metadata": {
	"model": "deepseek-ai/DeepSeek-OCR-2",
	"device": "cuda",
	"image_size": [width, height]
	}
	}
	```

	## Model Info

	- Model: deepseek-ai/DeepSeek-OCR-2
	- Parameters: 3B
	- Precision: FP16
	- GPU: T4 (16GB VRAM)
	- License: Apache-2.0

	## Links

	- [Model on HuggingFace](https://huggingface.co/deepseek-ai/DeepSeek-OCR-2)
	- [Project Repository](https://git.epam.com/epm-gpt/badgerdoc/ls-extractor)
	- [Documentation](https://git.epam.com/epm-gpt/badgerdoc/ls-extractor/-/tree/main/docs)

	## Citation

	```bibtex
	@article{deepseek-ocr-2,
	title={DeepSeek-OCR-2: Advanced Document Understanding},
	author={DeepSeek AI},
	year={2026}
	}
	```