Commit History

fix: remove GPU-dependent pre-download, use restructure_pages for cross-page tables, robust md extraction
0111393

sidoutcome committed on

fix: use CPU mode for model pre-download during Docker build
e8991b2

sidoutcome committed on

feat: v5.0.0 PaddleOCR-VL-1.5 + Gemini hybrid architecture
16b2195

sidoutcome committed on

feat: v4.0.0 - VLM + Gemini 3 Flash hybrid (table pages use Gemini API)
ba23da1

sidoutcome committed on

feat: v3.3.1 - disable table re-prompting, add page number cleanup
c8c1790

sidoutcome committed on

feat: v3.3.0 - table re-prompting, heading normalization, footer cleanup
a0faf3e

sidoutcome committed on

feat: v3.3.0 - increase max_tokens to 32768 for wide tables
a2561ab

sidoutcome committed on

feat: v3.3.0 - heading normalization, footer cleanup, table fixes
2b053ce

sidoutcome committed on

feat: v3.3.0 - DPI 200, post-processing, cross-page dedup
1cca2ec

sidoutcome committed on

fix: total_mem → total_memory attribute fix for startup
253d98a

sidoutcome committed on

feat: v3.2.1 - remove page markers, fix escaped quotes
e54472d

sidoutcome committed on

feat: v3.2.0 - LaTeX→MD conversion, VLM output cleanup, improved prompt, disable thinking
031c76c

sidoutcome committed on

feat: v3.1.0 - DPI 150, parallel rendering, VLM retry, quality fixes
53b94dc

sidoutcome committed on

feat: v3.0.0 VLM-first hybrid architecture - GPU VLM on all pages, Docling TableFormer only on table pages
c67903b

sidoutcome committed on

fix: reduce concurrent VLM workers to 2 to prevent GPU OOM on 30B model
3f46c5e

sidoutcome committed on

perf: concurrent VLM OCR - process pages in parallel via ThreadPoolExecutor
79cc114

sidoutcome committed on

fix: resolve /parse/url for URLs without file extensions (e.g. arxiv)
8832428

sidoutcome committed on

feat: increase VLM max_tokens to 16384
b25fd10

sidoutcome committed on

fix: increase max-model-len to 65536 for VLM image tokens, improve error logging
9385fa0

sidoutcome committed on

fix: reduce max_tokens to 4096, remove invalid skip_special_tokens, add error body logging
7f8ad4a

sidoutcome committed on

fix: add onnxruntime for Docling RapidOCR, fix total_memory attr
13b019d

sidoutcome committed on

fix: total_mem -> total_memory, clean up debug CMD and start.sh
dead0a0

sidoutcome committed on

debug: use python3 CMD to test if output reaches HF logs
7162cb2

sidoutcome committed on

debug: inline bash -c to test if CMD even executes
ea34895

sidoutcome committed on

fix: use full path /bin/bash entrypoint to ensure CMD executes
c6f72c0

sidoutcome committed on

debug: add aggressive startup logging to diagnose silent hang
cc32681

sidoutcome committed on

fix: remove sed pipe that caused block buffering and wrong PID tracking
28a7df5

sidoutcome committed on

fix: use HF repo ID for vLLM model loading instead of local path
6bf403a

sidoutcome committed on

fix: use JSON format for --limit-mm-per-prompt (vLLM compat)
2a4a2f0

sidoutcome committed on

fix: use python3 in start.sh (vLLM base image)
a3f50ac

sidoutcome committed on

feat: upgrade to Qwen3-VL-30B-A3B, simplify auth, fix redirects
922ba62

sidoutcome committed on

fix: remove CLAUDE.md from HF Space (internal dev file)
b0fbe7c

sidoutcome committed on

docs: fix deploy instructions for monorepo, remove numpy dep
8ac7242

sidoutcome committed on

feat: hybrid VLM parser with Qwen3-VL-8B via vLLM (v2.0.0)
8c4351b

sidoutcome committed on

feat: support both API_TOKEN and API_DEV_TOKEN
4848ba0

sidoutcome committed on

fix: single-line Python command for HF Docker builder
6c6dbe0

sidoutcome committed on

Initial commit: Docling Parser API
43d6acf

sidoutcome committed on

Initial commit: Docling Parser API
5052def

sidoutcome committed on