Made a Demo for people who want to quickly try the model

#2
by ysong21 - opened

Hello! I had some extra GPU credits that are near expiration so I made this quick demo for DeepSeek-OCR-2.

Performance is amazing in my testing. Hope you guys enjoy the model as much as I do 😀

Demo: https://deepseek-ocr-v2-demo.vercel.app

What do you guys think?

This is really slow; it won't run.

Sorry! I updated the backend and increased the concurrency and GPU count. Please try again 🙏

Can I get your config? What platform are you using to deploy?

The deployed model is producing abnormal results. With the prompt \n<|grounding|>OCR this image., it repeats itself endlessly, and the coordinates are inaccurate. Right now it performs worse than the previous deepseek-ocr.

I am using vLLM for inference, deployed on Modal (because I have credits there). I basically wrapped the official vLLM script in a FastAPI app.

The model is very sensitive to prompts; you have to find the right one. The infinite repetition is probably from using the wrong prompt. My approach is just to try them all 🤣
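For context, the trial-and-error approach described above can be sketched as a loop over candidate prompts with a crude degeneracy check. This is a minimal sketch: only the grounding prompt quoted earlier in the thread comes from the official examples; the other prompt strings, and the `run_model` callable, are illustrative placeholders.

```python
# Sketch of the "try every prompt" approach from the thread.
# Only "<|grounding|>OCR this image." is quoted in the thread; the other
# prompt strings below are illustrative guesses, not official prompts.
CANDIDATE_PROMPTS = [
    "\n<|grounding|>OCR this image.",                    # grounded OCR (from the thread)
    "\n<|grounding|>Convert the document to markdown.",  # assumed document mode
    "\nFree OCR.",                                       # assumed plain-text mode
]

def try_prompts(run_model, image):
    """Run the model with each candidate prompt until one yields
    non-degenerate output (here: no single word dominating the text)."""
    for prompt in CANDIDATE_PROMPTS:
        text = run_model(image, prompt)
        words = text.split()
        # crude infinite-repetition check: reject output where one word
        # makes up half or more of all tokens
        if words and max(words.count(w) for w in set(words)) < 0.5 * len(words):
            return prompt, text
    return None, ""
```

A real degeneracy check would look at repeated n-grams rather than single words, but this captures the idea of cycling prompts until the looping stops.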

That is the official prompt, and I also got abnormal output even with your default mode selected.

I'm currently deploying with vLLM to Kubernetes, but I get this error every time:
vllm-server-1 | (APIServer pid=1) Value error, Model architectures ['DeepseekOCR2ForCausalLM'] are not supported.

Do you know how to fix that?

You need to use the offline inference (github link) if you want to run it for now.

vLLM server support isn't implemented yet because the vLLM team needs to make changes in their repo, so serving usually doesn't work on day 1. It should land soon, though! IIRC, the original DeepSeek-OCR took ~3 days to get upstream vLLM support.
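The error itself comes from vLLM matching the "architectures" field in the model's config.json against its internal model registry; until that registry lists DeepseekOCR2ForCausalLM, the server refuses to start. A minimal sketch of that lookup, with an assumed (not vLLM's actual) registry:

```python
# Why "Model architectures [...] are not supported" happens: vLLM reads the
# "architectures" field from the model's config.json and looks each entry up
# in its model registry. The SUPPORTED set below is illustrative only.
import json

SUPPORTED = {"LlamaForCausalLM", "DeepseekOCRForCausalLM"}  # assumed subset

def check_config(config_text: str) -> list:
    """Return the architectures in config.json that the server lacks."""
    archs = json.loads(config_text).get("architectures", [])
    return [a for a in archs if a not in SUPPORTED]

missing = check_config('{"architectures": ["DeepseekOCR2ForCausalLM"]}')
# a non-empty result corresponds to the startup failure quoted above
```

The fix is exactly what the reply says: either run the model-repo's offline inference script, or wait for (or patch in) registry support upstream.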

Right, I found there was still a configuration issue and fixed it again. The current version's accuracy should now match the official results. I also analyzed which prompts pair best with different file types; take a look:

[Screenshot, 2026-01-27: prompt vs. file-type pairing analysis]

It is still live at the original address: https://deepseek-ocr-v2-demo.vercel.app

[Screenshot from 2026-01-19 of the document in question]
I OCR'd this document, but the model can't read the footnote. Should I change the prompt or make the image easier to read?

Love the demo!

Took me a minute, but I finally got comparable accuracy running on GCP. I also tried IBM's model; it's far smaller than this one, and that shows in the output on complex document layouts.

=====================

Edit Fri, 06 Feb 2026 19:50 UTC +1

Looks like the demo stopped working

Yoo I'm new to this stuff

Your demo is really awesome. Could you please share the source code for the Modal deployment?

How can I choose the model config, like Gundam, Base, Tiny, etc.? Thanks.
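For reference, the original DeepSeek-OCR exposed these modes as resolution presets (base_size / image_size / crop_mode) passed to the inference call. Assuming DeepSeek-OCR-2 keeps a similar scheme, mode selection is roughly a choice among presets like these; the values are taken from the original model's documented modes and may differ for this release, so verify against the official config.

```python
# Resolution presets as documented for the original DeepSeek-OCR;
# DeepSeek-OCR-2 may use different values, so treat these as a starting
# point rather than gospel.
MODES = {
    "tiny":   {"base_size": 512,  "image_size": 512,  "crop_mode": False},
    "small":  {"base_size": 640,  "image_size": 640,  "crop_mode": False},
    "base":   {"base_size": 1024, "image_size": 1024, "crop_mode": False},
    "large":  {"base_size": 1280, "image_size": 1280, "crop_mode": False},
    "gundam": {"base_size": 1024, "image_size": 640,  "crop_mode": True},
}

def mode_kwargs(name: str) -> dict:
    """Look up a preset by name, e.g. to splat into the model's infer call."""
    try:
        return MODES[name.lower()]
    except KeyError:
        raise ValueError(f"unknown mode {name!r}; choose from {sorted(MODES)}")
```

Gundam mode is the dynamic-tiling variant (crop_mode=True): it combines a global view with local crops, which is why it handles large, dense pages better than the fixed-resolution modes.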

Hi, your demo works great, thumbs up 👍. I'm feeding in an image, and the goal is to find the text and bounding boxes in it. Your demo's accuracy is very high, but using the official GitHub code my accuracy is very low. Could you share the prompt you use, i.e. the Other Image (w/ grounding) prompt? Many thanks.
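On the text-plus-bbox goal: the original DeepSeek-OCR emitted grounded output as <|ref|>text<|/ref|><|det|>[[x1, y1, x2, y2]]<|/det|> spans with coordinates on a 0-999 normalized grid. Assuming DeepSeek-OCR-2 keeps that format, the spans can be parsed back into pixel boxes roughly like this (verify the token format against the model's actual output first):

```python
# Parse grounded OCR output into (text, bbox) pairs. Assumes the
# <|ref|>...<|/ref|><|det|>[[x1, y1, x2, y2]]<|/det|> span format of the
# original DeepSeek-OCR, with coordinates on a 0-999 normalized grid.
import re

SPAN = re.compile(
    r"<\|ref\|>(.*?)<\|/ref\|><\|det\|>\[\[(\d+),\s*(\d+),\s*(\d+),\s*(\d+)\]\]<\|/det\|>",
    re.DOTALL,
)

def parse_grounding(output: str, width: int, height: int):
    """Return [(text, (x1, y1, x2, y2))] scaled to pixel coordinates."""
    results = []
    for m in SPAN.finditer(output):
        text = m.group(1)
        x1, y1, x2, y2 = (int(g) for g in m.groups()[1:])
        box = (x1 * width // 999, y1 * height // 999,
               x2 * width // 999, y2 * height // 999)
        results.append((text, box))
    return results
```

If the coordinates come back inaccurate, the scaling step here (width/height must match the image the model actually saw, after any resizing) is a common source of error.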
