docling-parser / app.py

Commit History

perf: concurrency improvements for high-volume Excel processing
33af535

ibadrehman-outcome committed on

fix: Excel tables now output HTML matching Gemini PDF format
87afc64

ibadrehman-outcome committed on

feat: add Excel (.xlsx/.xlsm) parsing support via Docling
cf7950b

ibadrehman-outcome committed on

fix: update docling gemini parser
c28aa68

Ibad ur Rehman committed on

feat: update granite parser runtime
b5db7b1

Ibad ur Rehman committed on

feat: switch parser to granite docling
dde2973

Ibad ur Rehman committed on

feat: deploy docling first parser
74cacc0

Ibad ur Rehman committed on

feat: switch to unsloth gguf runtime
dd23733

Ibad ur Rehman committed on

perf: optimize qwen inference path
b586eeb

Ibad ur Rehman committed on

feat: switch parser to qwen vl
51c66dc

Ibad ur Rehman committed on

feat: simplify parser response flow
add910e

Ibad ur Rehman committed on

feat: expand parser diagnostics
852a43f

Ibad ur Rehman committed on

feat: expose page-level parse results
efed02b

Ibad ur Rehman committed on

feat: configure docling accelerator
5f188d9

Ibad ur Rehman committed on

feat: switch to docling first parser
4af0af0

Ibad ur Rehman committed on

feat: deploy paddleocr gemini parser
799f504

Ibad ur Rehman committed on

feat: v5.0.0 PaddleOCR-VL-1.5 + Gemini hybrid architecture
16b2195

sidoutcome committed on

feat: v4.0.0 — VLM + Gemini 3 Flash hybrid (table pages use Gemini API)
ba23da1

sidoutcome committed on

feat: v3.3.1 - disable table re-prompting, add page number cleanup
c8c1790

sidoutcome committed on

feat: v3.3.0 - table re-prompting, heading normalization, footer cleanup
a0faf3e

sidoutcome committed on

feat: v3.3.0 - increase max_tokens to 32768 for wide tables
a2561ab

sidoutcome committed on

feat: v3.3.0 - heading normalization, footer cleanup, table fixes
2b053ce

sidoutcome committed on

feat: v3.3.0 - DPI 200, post-processing, cross-page dedup
1cca2ec

sidoutcome committed on

fix: total_mem → total_memory attribute fix for startup
253d98a

sidoutcome committed on

feat: v3.2.1 - remove page markers, fix escaped quotes
e54472d

sidoutcome committed on

feat: v3.2.0 - LaTeX→MD conversion, VLM output cleanup, improved prompt, disable thinking
031c76c

sidoutcome committed on

feat: v3.1.0 - DPI 150, parallel rendering, VLM retry, quality fixes
53b94dc

sidoutcome committed on

feat: v3.0.0 VLM-first hybrid architecture — GPU VLM on all pages, Docling TableFormer only on table pages
c67903b

sidoutcome committed on

fix: reduce concurrent VLM workers to 2 to prevent GPU OOM on 30B model
3f46c5e

sidoutcome committed on

perf: concurrent VLM OCR — process pages in parallel via ThreadPoolExecutor
79cc114

sidoutcome committed on

fix: resolve /parse/url for URLs without file extensions (e.g. arxiv)
8832428

sidoutcome committed on

feat: increase VLM max_tokens to 16384
b25fd10

sidoutcome committed on

fix: increase max-model-len to 65536 for VLM image tokens, improve error logging
9385fa0

sidoutcome committed on

fix: reduce max_tokens to 4096, remove invalid skip_special_tokens, add error body logging
7f8ad4a

sidoutcome committed on

fix: total_mem -> total_memory, clean up debug CMD and start.sh
dead0a0

sidoutcome committed on

feat: upgrade to Qwen3-VL-30B-A3B, simplify auth, fix redirects
922ba62

sidoutcome committed on

feat: hybrid VLM parser with Qwen3-VL-8B via vLLM (v2.0.0)
8c4351b

sidoutcome committed on

feat: support both API_TOKEN and API_DEV_TOKEN
4848ba0

sidoutcome committed on

Initial commit: Docling Parser API
5052def

sidoutcome committed on