---
title: Unified Document Extraction API
emoji: 📄
colorFrom: blue
colorTo: indigo
sdk: docker
app_file: app.py
pinned: false
---

# 🚀 Unified Document Extraction API

**One API, Two Engines: Docling + DocStrange**

Extract structured data from any document using AI-powered engines.

## Features

- ✅ **Docling** - Advanced document parsing with structure preservation
- ✅ **DocStrange** - GPU-accelerated intelligent document processing
- ✅ **Multiple formats** - PDF, DOCX, XLSX, PPTX, Images, and more
- ✅ **Structured output** - Markdown, JSON, Tables

## API Endpoints

- `GET /` - Health check
- `GET /engines` - List available engines
- `POST /convert` - Full document conversion
- `POST /convert/markdown` - Markdown only
- `POST /convert/tables` - Tables only
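
The endpoint list above can be wrapped in a small URL builder so scripts do not hard-code paths. This is a minimal sketch, not part of the API itself: `endpoint_url` and the `BASE_URL` variable are hypothetical names, and appending `?engine=` to endpoints other than `/convert` is an assumption based on the usage examples in this README.

```bash
#!/usr/bin/env bash
# Placeholder base URL; replace it with your own Space URL.
BASE_URL="${BASE_URL:-https://YOUR_SPACE.hf.space}"

# endpoint_url PATH [ENGINE]   (hypothetical helper)
# Prints the full URL for one of the documented endpoints.
# When ENGINE is given, it is appended as the ?engine= query parameter.
endpoint_url() {
  local path="$1" engine="${2:-}"
  case "$path" in
    /|/engines|/convert|/convert/markdown|/convert/tables) ;;
    *) echo "unknown endpoint: $path" >&2; return 1 ;;
  esac
  if [ -n "$engine" ]; then
    printf '%s%s?engine=%s\n' "${BASE_URL%/}" "$path" "$engine"
  else
    printf '%s%s\n' "${BASE_URL%/}" "$path"
  fi
}
```

For example, `curl -sS "$(endpoint_url /engines)"` would query the engine list, and `endpoint_url /convert docling` prints the URL used in the Usage section.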

## Usage

```bash
# Convert with Docling
curl -X POST "https://YOUR_SPACE.hf.space/convert?engine=docling" \
  -F "file=@document.pdf"

# Convert with DocStrange
curl -X POST "https://YOUR_SPACE.hf.space/convert?engine=docstrange" \
  -F "file=@document.pdf"
```
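
For scripted use, the curl calls above can be wrapped in a function that checks the input file first and fails on HTTP errors. This is a sketch under stated assumptions: `convert_file` and `BASE_URL` are hypothetical names, and the response format is not specified in this README, so the output is passed through unmodified.

```bash
#!/usr/bin/env bash
# Placeholder base URL; replace it with your own Space URL.
BASE_URL="${BASE_URL:-https://YOUR_SPACE.hf.space}"

# convert_file FILE ENGINE   (hypothetical wrapper)
# Uploads FILE to POST /convert with the chosen engine and prints the response body.
convert_file() {
  local file="$1" engine="$2"
  if [ ! -f "$file" ]; then
    echo "file not found: $file" >&2
    return 2
  fi
  if [ -z "$engine" ]; then
    echo "usage: convert_file FILE ENGINE" >&2
    return 2
  fi
  # --fail makes curl exit non-zero on HTTP 4xx/5xx responses;
  # -sS hides the progress meter but still reports errors.
  curl --fail -sS -X POST "${BASE_URL%/}/convert?engine=${engine}" \
    -F "file=@${file}"
}
```

A call such as `convert_file document.pdf docling > result.json` then stops early when the file is missing or the server returns an error status.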

## Integration

Works with the **DataSync** application for ERPNext integration.

## License

MIT