# DeepSeek OCR (OCR-1 / OCR-2)
DeepSeek OCR models are multimodal (image + text) models for OCR and document understanding.
## Launch server
```shell
python -m sglang.launch_server \
  --model-path deepseek-ai/DeepSeek-OCR-2 \
  --trust-remote-code \
  --host 0.0.0.0 \
  --port 30000
```
> You can replace `deepseek-ai/DeepSeek-OCR-2` with `deepseek-ai/DeepSeek-OCR`.
## Prompt examples
Recommended prompts from the model card:
```
<image>
<|grounding|>Convert the document to markdown.
```
```
<image>
Free OCR.
```
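Both prompts share the same shape: the `<image>` placeholder, a newline, then the instruction. A minimal sketch of assembling them in Python follows. The task keys (`markdown`, `free_ocr`) and the `build_prompt` helper are hypothetical names introduced for illustration, not part of the model card or SGLang.

```python
# Prompt instructions copied from the examples above. The dictionary keys
# ("markdown", "free_ocr") are hypothetical names chosen for this sketch.
PROMPTS = {
    "markdown": "<|grounding|>Convert the document to markdown.",
    "free_ocr": "Free OCR.",
}


def build_prompt(task: str) -> str:
    """Return the <image> placeholder, a newline, then the instruction for `task`."""
    if task not in PROMPTS:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(PROMPTS)}")
    return "<image>\n" + PROMPTS[task]
```

Calling `build_prompt("markdown")` yields the same string used in the `text` field of the request example in the next section.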
## OpenAI-compatible request example
```python
import requests

# Chat completions endpoint of the server launched above.
url = "http://localhost:30000/v1/chat/completions"

# One user message carrying the prompt text and the image to read.
data = {
    "model": "deepseek-ai/DeepSeek-OCR-2",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "<image>\n<|grounding|>Convert the document to markdown."},
                {"type": "image_url", "image_url": {"url": "https://example.com/your_image.jpg"}},
            ],
        }
    ],
    "max_tokens": 512,
}

response = requests.post(url, json=data)
print(response.text)
```
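The request above prints the raw JSON body and points at a remote image. The sketch below shows one way to send a local file and pull out only the recognized text. It rests on two assumptions that are not confirmed by this document: that the server accepts base64 `data:` URLs in the `image_url` field, and that the response follows the usual OpenAI chat completions layout (`choices[0].message.content`). The helper names are hypothetical.

```python
import base64
import mimetypes


def image_to_data_url(path: str) -> str:
    """Encode a local image file as a base64 data URL (assumed to be accepted by the server)."""
    mime, _ = mimetypes.guess_type(path)
    if mime is None:
        mime = "application/octet-stream"  # fallback when the extension is unknown
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def build_payload(image_url: str, prompt: str,
                  model: str = "deepseek-ai/DeepSeek-OCR-2",
                  max_tokens: int = 512) -> dict:
    """Build the same request body as the example above for any image URL and prompt."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        "max_tokens": max_tokens,
    }


def extract_text(response_json: dict) -> str:
    """Return the assistant text, assuming an OpenAI-style chat completions response."""
    return response_json["choices"][0]["message"]["content"]
```

With the server running, `requests.post(url, json=build_payload(image_to_data_url("page.png"), prompt))` followed by `extract_text(response.json())` would return only the model output, where `page.png` stands in for your own file. Long documents may need a larger `max_tokens` than 512.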