Add link to paper

#42
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +1 -0
README.md CHANGED
@@ -50,6 +50,7 @@ dots.ocr: Multilingual Document Layout Parsing in a Single Vision-Language Model
  3. **Unified and Simple Architecture:** By leveraging a single vision-language model, **dots.ocr** offers a significantly more streamlined architecture than conventional methods that rely on complex, multi-model pipelines. Switching between tasks is accomplished simply by altering the input prompt, proving that a VLM can achieve competitive detection results compared to traditional detection models like DocLayout-YOLO.
  4. **Efficient and Fast Performance:** Built upon a compact 1.7B LLM, **dots.ocr** provides faster inference speeds than many other high-performing models based on larger foundations.

+ It was introduced in the paper [dots.ocr: Multilingual Document Layout Parsing in a Single Vision-Language Model](https://huggingface.co/papers/2512.02498).

  ## Usage with transformers