anthonym21 committed on
Commit
cdc285f
·
verified ·
1 Parent(s): 934a7f2

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +41 -0
README.md ADDED
@@ -0,0 +1,41 @@
+ ---
+ license: mit
+ base_model: rednote-hilab/dots.ocr
+ tags:
+ - gguf
+ - ocr
+ - llama-cpp
+ - vision
+ - image-to-text
+ language:
+ - en
+ - zh
+ - multilingual
+ ---
+
+ # dots.ocr GGUF
+
+ GGUF conversions of [rednote-hilab/dots.ocr](https://huggingface.co/rednote-hilab/dots.ocr) for use with [llama.cpp](https://github.com/ggml-org/llama.cpp).
+
+ ## Files
+
+ | File | Size | Description |
+ |---|---|---|
+ | Dots.Ocr-1.8B-Q8_0.gguf | 1.8 GB | Text model, 8-bit quantized |
+ | Dots.Ocr-1.8B-F16.gguf | 3.4 GB | Text model, float16 |
+ | mmproj-Dots.Ocr-F16.gguf | 2.4 GB | Vision encoder (mmproj), float16 |
+
+ ## Architecture
+
+ dots.ocr = Qwen2 text backbone (1.7B params, 28 layers) + modified Qwen2-VL vision encoder (1.2B params, 42 layers).
+
+ Key differences from Qwen2-VL:
+ - Text model is standard Qwen2 with 1D RoPE (not M-RoPE)
+ - Vision uses RMSNorm, SiLU gated MLP, Conv2D patches, no attention bias
+ - 2D M-RoPE internal to vision encoder only
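
As a rough sanity check, the file sizes in the table follow from the parameter counts in this section. The sketch below is not part of the original README; it assumes every weight is stored at the nominal precision of its format (16 bits for F16, and 8.5 bits for Q8_0, which packs 32 int8 weights plus one fp16 scale per block) and ignores GGUF metadata. The helper name `estimated_size_gb` is made up for illustration.

```python
# Rough GGUF size estimate from parameter counts.
# Assumption: all weights use the nominal precision; metadata is ignored.

GB = 1e9  # decimal gigabytes, as used in the table

# Q8_0 block: 32 int8 weights (32 bytes) + one fp16 scale (2 bytes) = 34 bytes.
Q8_0_BITS_PER_WEIGHT = 34 * 8 / 32  # 8.5
F16_BITS_PER_WEIGHT = 16.0


def estimated_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Return the approximate file size in GB (hypothetical helper)."""
    return n_params * bits_per_weight / 8 / GB


TEXT_PARAMS = 1.7e9    # Qwen2 text backbone, from the Architecture section
VISION_PARAMS = 1.2e9  # vision encoder, from the Architecture section

sizes = {
    "Dots.Ocr-1.8B-Q8_0.gguf": estimated_size_gb(TEXT_PARAMS, Q8_0_BITS_PER_WEIGHT),
    "Dots.Ocr-1.8B-F16.gguf": estimated_size_gb(TEXT_PARAMS, F16_BITS_PER_WEIGHT),
    "mmproj-Dots.Ocr-F16.gguf": estimated_size_gb(VISION_PARAMS, F16_BITS_PER_WEIGHT),
}

for name, size in sizes.items():
    print(f"{name}: ~{size:.1f} GB")
```

Under these assumptions the estimates round to 1.8 GB, 3.4 GB, and 2.4 GB, matching the table.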
36
+
37
+ ## Usage with llama.cpp
38
+
39
+
40
+
41
+ > **Note:** Requires llama.cpp with dots.ocr support (pending upstream merge).
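
A minimal sketch of how the text model and the mmproj file might be passed together to llama.cpp's multimodal CLI. It assumes a llama.cpp build with dots.ocr support that provides the `llama-mtmd-cli` tool on `PATH`; the image path `page.png`, the prompt, and the helper `build_mtmd_command` are hypothetical placeholders, not taken from the README.

```python
import shlex
import shutil
import subprocess
from pathlib import Path


def build_mtmd_command(model: str, mmproj: str, image: str, prompt: str,
                       binary: str = "llama-mtmd-cli") -> list[str]:
    """Assemble the argument list for llama.cpp's multimodal CLI (hypothetical helper)."""
    return [
        binary,
        "-m", model,         # text model GGUF
        "--mmproj", mmproj,  # vision encoder (mmproj) GGUF
        "--image", image,    # input image to run OCR on
        "-p", prompt,        # text prompt
    ]


cmd = build_mtmd_command(
    model="Dots.Ocr-1.8B-Q8_0.gguf",
    mmproj="mmproj-Dots.Ocr-F16.gguf",
    image="page.png",                            # hypothetical input image
    prompt="Extract the text from this image.",  # hypothetical prompt
)

# Show the equivalent shell command.
print(shlex.join(cmd))

# Run only if the binary and all input files are actually present.
inputs_present = all(Path(p).exists() for p in (cmd[2], cmd[4], cmd[6]))
if shutil.which(cmd[0]) and inputs_present:
    subprocess.run(cmd, check=True)
```

The Q8_0 text model is used here because it is the smallest file in the table; the F16 text model can be substituted by changing the `model` argument.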