pavan-synkrato360 committed on
Commit
3f94e68
·
verified ·
1 Parent(s): 435e699

Update README.md

Files changed (1)
  1. README.md +164 -3
README.md CHANGED
@@ -1,3 +1,164 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: apache-2.0
3
+
4
+ tags:
5
+ - onnx
6
+ - document-understanding
7
+ - layout-detection
8
+ - table-detection
9
+ - faria
10
+
11
+ pipeline_tag: object-detection
12
+ ---
13
+
14
+ # Faria ONNX Models
15
+
16
+ Pre-exported ONNX models used by [Faria](https://github.com/exto360-inc/faria), a document processing library with ML-powered
17
+ layout detection and table extraction. These files can be loaded directly with ONNX Runtime; no Python environment or conversion step is
18
+ required.
19
+
20
+ ## Models
21
+
22
+ ### `detr_layout_detection.onnx` (~350 MB)
23
+
24
+ Document layout detection. Locates and classifies structural elements on a page image, such as titles, text blocks, tables, and pictures.
25
+
26
+ - **Source:** [`cmarkea/detr-layout-detection`](https://huggingface.co/cmarkea/detr-layout-detection)
27
+ - **ONNX opset:** 14
28
+
29
+ **Input**
30
+
31
+ | Name | Shape | Type |
32
+ |----------------|------------------------|---------|
33
+ | `pixel_values` | `[batch, 3, 800, 800]` | float32 |
34
+
35
+ **Outputs**
36
+
37
+ | Name | Shape | Type | Description |
38
+ |--------------|--------------------|---------|--------------------------------------------|
39
+ | `logits` | `[batch, 100, 12]` | float32 | Class scores (11 classes + no-object) |
40
+ | `pred_boxes` | `[batch, 100, 4]` | float32 | Normalized boxes `(cx, cy, w, h)` |
41
+
42
+ **Class labels (DocLayNet)**
43
+
44
+ | Index | Label |
45
+ |-------|----------------|
46
+ | 0 | Caption |
47
+ | 1 | Footnote |
48
+ | 2 | Formula |
49
+ | 3 | List-item |
50
+ | 4 | Page-footer |
51
+ | 5 | Page-header |
52
+ | 6 | Picture |
53
+ | 7 | Section-header |
54
+ | 8 | Table |
55
+ | 9 | Text |
56
+ | 10 | Title |
57
+ | 11 | (no object) |
58
+
59
+ **Post-processing**
60
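+ 1. Apply softmax to `logits` along the last (class) axis
61
+ 2. For each of the 100 queries, take the best score among classes 0-10 and discard queries below a confidence threshold
62
+ 3. Convert `(cx, cy, w, h)` → `(x1, y1, x2, y2)`
63
+ 4. Scale the normalized boxes by the original image width and height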
64
+
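The post-processing steps above can be sketched as follows for a single image (batch dimension already removed). This is a minimal pure-Python illustration, not Faria's own implementation: the function name `postprocess`, the default `threshold` of 0.5, and the choice to ignore the no-object class when picking the best score are assumptions, the last one following common DETR practice.

```python
import math

# Labels from the class table above; index 11 is the "no object" class.
LABELS = [
    "Caption", "Footnote", "Formula", "List-item", "Page-footer",
    "Page-header", "Picture", "Section-header", "Table", "Text", "Title",
]


def softmax(row):
    """Numerically stable softmax over one row of class logits."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def postprocess(logits, pred_boxes, image_width, image_height, threshold=0.5):
    """Turn raw model outputs for one image into labeled pixel boxes.

    logits:     100 rows of 12 floats
    pred_boxes: 100 rows of normalized (cx, cy, w, h)
    threshold:  assumed default, tune for your documents
    """
    detections = []
    for row, (cx, cy, w, h) in zip(logits, pred_boxes):
        probs = softmax(row)
        real = probs[:len(LABELS)]      # drop the no-object probability
        score = max(real)
        if score < threshold:
            continue
        label = LABELS[real.index(score)]
        # Center format -> corner format, then scale to pixels.
        x1 = (cx - w / 2) * image_width
        y1 = (cy - h / 2) * image_height
        x2 = (cx + w / 2) * image_width
        y2 = (cy + h / 2) * image_height
        detections.append({"label": label, "score": score, "box": (x1, y1, x2, y2)})
    return detections
```

With real model output, `logits` and `pred_boxes` would be the first batch element of the two output tensors, converted to nested lists.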
65
+ ---
66
+
67
+ ### `nemotron_table_structure.onnx` (~200 MB)
68
+
69
+ Table structure recognition. Detects the cells, rows, columns, and headers of a table in the input image.
70
+
71
+ - **Source:** [`nvidia/nemotron-table-structure-v1`](https://huggingface.co/nvidia/nemotron-table-structure-v1)
72
+ - **ONNX opset:** 18
73
+
74
+ **Inputs**
75
+
76
+ | Name | Shape | Type | Description |
77
+ |--------------|---------------------|---------|----------------------------------|
78
+ | `input` | `[1, 3, 1024, 1024]` | float32 | RGB image |
79
+ | `orig_sizes` | `[1, 2]` | int64 | `[height, width]` |
80
+
81
+ **Outputs**
82
+
83
+ | Name | Shape | Type |
84
+ |---------|----------|---------|
85
+ | `labels` | `[N]` | float32 |
86
+ | `boxes` | `[N, 4]` | float32 |
87
+ | `scores` | `[N]` | float32 |
88
+
89
+ **Class labels**
90
+
91
+ | Index | Label |
92
+ |-------|--------|
93
+ | 1 | cell |
94
+ | 2 | row |
95
+ | 3 | column |
96
+ | 4 | header |
97
+
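As an illustration of how the three outputs and the class table fit together, the sketch below groups detections by class name. It is an assumption-laden example rather than Faria's code: the helper name `group_by_class` and the `min_score` cutoff are hypothetical, and because the coordinate convention of `boxes` is not stated here, the boxes are passed through unchanged.

```python
# Class indices from the table above.
CLASS_NAMES = {1: "cell", 2: "row", 3: "column", 4: "header"}


def group_by_class(labels, boxes, scores, min_score=0.5):
    """Group detections by class name, dropping low-confidence ones.

    labels: N class indices (documented as float32, so cast to int)
    boxes:  N rows of 4 floats, kept as-is
    scores: N confidence values
    min_score: assumed cutoff, not a value from the model card
    """
    grouped = {name: [] for name in CLASS_NAMES.values()}
    for label, box, score in zip(labels, boxes, scores):
        name = CLASS_NAMES.get(int(label))
        if name is None or score < min_score:
            continue  # unknown index or low confidence
        grouped[name].append({"box": tuple(box), "score": score})
    return grouped
```

Grouping rows and columns separately from cells is a convenient starting point for rebuilding a grid, since each cell can then be assigned to the row and column it overlaps most.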
98
+ ---
99
+
100
+ ## Installation
101
+
102
+ ```bash
103
+ curl -fsSL https://raw.githubusercontent.com/exto360-inc/faria-install/main/install.sh | bash -s -- --features idp
104
+ ```
105
+
106
+ Or download manually:
107
+
108
+ ```bash
109
+ # Layout detection
110
+ curl -fsSL https://huggingface.co/pavan-synkrato360/faria-models/resolve/main/detr_layout_detection.onnx -o detr_layout_detection.onnx
111
+
112
+ # Table structure
113
+ curl -fsSL https://huggingface.co/pavan-synkrato360/faria-models/resolve/main/nemotron_table_structure.onnx -o nemotron_table_structure.onnx
114
+ ```
115
+
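For environments without `curl`, the same manual download can be sketched with the Python standard library. The URL pattern is taken from the commands above; the helper names `model_url` and `download` and the `revision` parameter are hypothetical conveniences, not part of Faria.

```python
from urllib.parse import quote
from urllib.request import urlretrieve

# Repository path as it appears in the curl commands above.
REPO = "pavan-synkrato360/faria-models"


def model_url(filename, revision="main"):
    """Build the direct-download URL for one model file."""
    return f"https://huggingface.co/{REPO}/resolve/{revision}/{quote(filename)}"


def download(filename, dest=None):
    """Fetch a model file to `dest` (defaults to the file's own name)."""
    target = dest or filename
    urlretrieve(model_url(filename), target)
    return target
```

Calling `download("detr_layout_detection.onnx")` would write the file to the current directory; nothing is fetched until `download` is called.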
116
+ ---
117
+
118
+ ## config.json
119
+
120
+ ```json
121
+ {
122
+   "models": {
123
+     "detr_layout_detection": {
124
+       "filename": "detr_layout_detection.onnx",
125
+       "task": "document-layout-detection",
126
+       "source": "cmarkea/detr-layout-detection",
127
+       "onnx_opset": 14,
128
+       "input": {
129
+         "pixel_values": [1, 3, 800, 800]
130
+       },
131
+       "outputs": {
132
+         "logits": [1, 100, 12],
133
+         "pred_boxes": [1, 100, 4]
134
+       },
135
+       "classes": [
136
+         "Caption", "Footnote", "Formula", "List-item",
137
+         "Page-footer", "Page-header", "Picture", "Section-header",
138
+         "Table", "Text", "Title"
139
+       ]
140
+     },
141
+     "nemotron_table_structure": {
142
+       "filename": "nemotron_table_structure.onnx",
143
+       "task": "table-structure-recognition",
144
+       "source": "nvidia/nemotron-table-structure-v1",
145
+       "onnx_opset": 18,
146
+       "inputs": {
147
+         "input": [1, 3, 1024, 1024],
148
+         "orig_sizes": [1, 2]
149
+       },
150
+       "outputs": {
151
+         "labels": ["N"],
152
+         "boxes": ["N", 4],
153
+         "scores": ["N"]
154
+       },
155
+       "classes": {
156
+         "1": "cell",
157
+         "2": "row",
158
+         "3": "column",
159
+         "4": "header"
160
+       }
161
+     }
162
+   }
163
+ }
164
+ ```
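A short sketch of reading this config from Python follows. It embeds a trimmed copy of the JSON above so that it runs on its own; the helper name `input_shapes` is hypothetical. Note that the config uses the key `input` for the layout model and `inputs` for the table model, so the lookup checks both.

```python
import json

# Trimmed excerpt of the config above, kept inline so the example is self-contained.
CONFIG_TEXT = """
{
  "models": {
    "detr_layout_detection": {
      "filename": "detr_layout_detection.onnx",
      "input": {"pixel_values": [1, 3, 800, 800]}
    },
    "nemotron_table_structure": {
      "filename": "nemotron_table_structure.onnx",
      "inputs": {"input": [1, 3, 1024, 1024], "orig_sizes": [1, 2]}
    }
  }
}
"""


def input_shapes(config, model_name):
    """Return the {tensor name: shape} mapping for one model entry."""
    entry = config["models"][model_name]
    # Accept either spelling of the key, since the config uses both.
    return entry.get("inputs") or entry.get("input") or {}


config = json.loads(CONFIG_TEXT)
for name in config["models"]:
    print(name, input_shapes(config, name))
```

In practice the text would come from the `config.json` file itself, for example via `json.load(open("config.json"))`.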