pavan-synkrato360 committed 435e699 (verified) · 1 parent: e098716

Upload 4 files

detr_layout_detection.onnx ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:39f3a049976cbb99a54b12cf193bbdfb73cb58a1388f7893a573007029dbb533
3
+ size 166762086
detr_layout_detection_without_data.onnx ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:39f3a049976cbb99a54b12cf193bbdfb73cb58a1388f7893a573007029dbb533
3
+ size 166762086
nemotron_table_structure.onnx ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d5275e17faa4204c875505e8a89559937113228b04764ac442b04237471e7a82
3
+ size 218580866
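
Review note: each `.onnx` entry above is a Git LFS pointer (`version`, `oid`, `size`), not the model itself. A downloaded model can be checked against its pointer with a short script. The sketch below is only an illustration: the helper names `parse_lfs_pointer` and `verify_download` are hypothetical, and it assumes the pointer text is available as a string.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse the `key value` lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.strip().partition(" ")
        if key:
            fields[key] = value
    # The oid has the form "<algorithm>:<hex digest>", e.g. "sha256:39f3...".
    algo, _, digest = fields["oid"].partition(":")
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def verify_download(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` matches the pointer's size and hash."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Read in chunks so large model files are never fully held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer for nemotron_table_structure.onnx, copied from this commit.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:d5275e17faa4204c875505e8a89559937113228b04764ac442b04237471e7a82
size 218580866
"""
```

Calling `verify_download("nemotron_table_structure.onnx", parse_lfs_pointer(POINTER))` after a download should return `True` only when both the byte count and the SHA-256 digest match.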
readme.md ADDED
@@ -0,0 +1,166 @@
1
+ ---
2
+ tags:
3
+ - onnx
4
+ - document-understanding
5
+ - layout-detection
6
+ - table-detection
7
+ - faria
8
+ pipeline_tag: object-detection
9
+ ---
10
+
11
+ # Faria ONNX Models
12
+
13
+ Pre-exported ONNX models used by [Faria](https://github.com/exto360-inc/faria), a document processing library with ML-powered
14
+ layout detection and table extraction. These files are ready for direct use with ONNX Runtime — no Python or conversion step
15
+ required.
16
+
17
+ ## Models
18
+
19
+ ### `detr_layout_detection.onnx` (~167 MB)
20
+
21
+ Document layout detection. Identifies structural elements across a page.
22
+
23
+ - **Source:** [`cmarkea/detr-layout-detection`](https://huggingface.co/cmarkea/detr-layout-detection)
24
+ - **ONNX opset:** 14
25
+
26
+ **Input**
27
+
28
+ | Name | Shape | Type |
29
+ |------|-------|------|
30
+ | `pixel_values` | `[batch, 3, 800, 800]` | float32 |
31
+
32
+ **Outputs**
33
+
34
+ | Name | Shape | Type | Description |
35
+ |------|-------|------|-------------|
36
+ | `logits` | `[batch, 100, 12]` | float32 | Raw class logits (11 classes + no-object) |
37
+ | `pred_boxes` | `[batch, 100, 4]` | float32 | Normalized boxes in `(cx, cy, w, h)` format |
38
+
39
+ **Class labels (DocLayNet)**
40
+
41
+ | Index | Label |
42
+ |-------|-------|
43
+ | 0 | Caption |
44
+ | 1 | Footnote |
45
+ | 2 | Formula |
46
+ | 3 | List-item |
47
+ | 4 | Page-footer |
48
+ | 5 | Page-header |
49
+ | 6 | Picture |
50
+ | 7 | Section-header |
51
+ | 8 | Table |
52
+ | 9 | Text |
53
+ | 10 | Title |
54
+ | 11 | (no object) |
55
+
56
+ **Post-processing**
57
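+ 1. Apply softmax to `logits` over the last axis to get class probabilities
58
+ 2. For each of the 100 queries, take the best score among the 11 real classes (ignoring no-object, index 11) and keep detections above a confidence threshold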
59
+ 3. Convert boxes from `(cx, cy, w, h)` to `(x1, y1, x2, y2)`
60
+ 4. Scale boxes from `[0, 1]` to image dimensions
61
+
62
+ ---
63
+
64
+ ### `nemotron_table_structure.onnx` (~219 MB)
65
+
66
+ Table structure recognition. Detects cells, rows, columns, and headers within a detected table region.
67
+
68
+ - **Source:** [`nvidia/nemotron-table-structure-v1`](https://huggingface.co/nvidia/nemotron-table-structure-v1)
69
+ - **ONNX opset:** 18
70
+
71
+ **Inputs**
72
+
73
+ | Name | Shape | Type | Description |
74
+ |------|-------|------|-------------|
75
+ | `input` | `[1, 3, 1024, 1024]` | float32 | RGB image, normalized |
76
+ | `orig_sizes` | `[1, 2]` | int64 | Original image `[height, width]` |
77
+
78
+ **Outputs**
79
+
80
+ | Name | Shape | Type | Description |
81
+ |------|-------|------|-------------|
82
+ | `labels` | `[N]` | float32 | Class label per detection |
83
+ | `boxes` | `[N, 4]` | float32 | Normalized boxes `[x1, y1, x2, y2]` |
84
+ | `scores` | `[N]` | float32 | Confidence score per detection |
85
+
86
+ **Class labels**
87
+
88
+ | Index | Label |
89
+ |-------|-------|
90
+ | 1 | cell |
91
+ | 2 | row |
92
+ | 3 | column |
93
+ | 4 | header |
94
+
95
+ ---
96
+
97
+ ## Installation
98
+
99
+ These models are installed automatically by the [faria-install](https://github.com/exto360-inc/faria-install) script:
100
+
101
+ ```bash
102
+ curl -fsSL https://raw.githubusercontent.com/exto360-inc/faria-install/main/install.sh | bash -s -- --features idp
103
+ ```
104
+ Or download directly:
105
+ ```bash
106
+ # Layout detection
107
+ curl -fsSL https://huggingface.co/pavan-synkrato360/faria-models/resolve/main/detr_layout_detection.onnx \
108
+   -o detr_layout_detection.onnx
109
+
110
+ # Table structure
111
+ curl -fsSL https://huggingface.co/pavan-synkrato360/faria-models/resolve/main/nemotron_table_structure.onnx \
112
+   -o nemotron_table_structure.onnx
113
+ ```
114
+ ## Export
115
+
116
+ These are custom ONNX exports from their respective source models. The export scripts are in the [faria-install](https://github.com/exto360-inc/faria-install) repository under
117
+ `models/` if you need to re-export.
118
+
119
+ ---
120
+
121
+ **`config.json`**
122
+
123
+ ```json
124
+ {
125
+   "models": {
126
+     "detr_layout_detection": {
127
+       "filename": "detr_layout_detection.onnx",
128
+       "task": "document-layout-detection",
129
+       "source": "cmarkea/detr-layout-detection",
130
+       "onnx_opset": 14,
131
+       "input": {
132
+         "pixel_values": [1, 3, 800, 800]
133
+       },
134
+       "outputs": {
135
+         "logits": [1, 100, 12],
136
+         "pred_boxes": [1, 100, 4]
137
+       },
138
+       "classes": [
139
+         "Caption", "Footnote", "Formula", "List-item",
140
+         "Page-footer", "Page-header", "Picture", "Section-header",
141
+         "Table", "Text", "Title"
142
+       ]
143
+     },
144
+     "nemotron_table_structure": {
145
+       "filename": "nemotron_table_structure.onnx",
146
+       "task": "table-structure-recognition",
147
+       "source": "nvidia/nemotron-table-structure-v1",
148
+       "onnx_opset": 18,
149
+       "inputs": {
150
+         "input": [1, 3, 1024, 1024],
151
+         "orig_sizes": [1, 2]
152
+       },
153
+       "outputs": {
154
+         "labels": ["N"],
155
+         "boxes": ["N", 4],
156
+         "scores": ["N"]
157
+       },
158
+       "classes": {
159
+         "1": "cell",
160
+         "2": "row",
161
+         "3": "column",
162
+         "4": "header"
163
+       }
164
+     }
165
+   }
166
+ }
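
Review note: the two model entries name their input block differently (`"input"` for DETR, `"inputs"` for the table model), so any reader of this config has to tolerate both keys. The sketch below shows one way to do that and to tell fixed shapes from symbolic ones such as `["N", 4]`. It uses an abridged copy of the config, and the helper names `input_shapes` and `element_count` are hypothetical.

```python
import json
from math import prod

# Abridged copy of the config shown above (only the shape fields).
CONFIG_TEXT = """
{
  "models": {
    "detr_layout_detection": {
      "input": {"pixel_values": [1, 3, 800, 800]},
      "outputs": {"logits": [1, 100, 12], "pred_boxes": [1, 100, 4]}
    },
    "nemotron_table_structure": {
      "inputs": {"input": [1, 3, 1024, 1024], "orig_sizes": [1, 2]},
      "outputs": {"labels": ["N"], "boxes": ["N", 4], "scores": ["N"]}
    }
  }
}
"""


def input_shapes(model_cfg):
    """Return the input shape table, accepting either "inputs" or "input"."""
    return model_cfg.get("inputs") or model_cfg.get("input") or {}


def element_count(shape):
    """Number of elements for a fully fixed shape, or None if any dim is symbolic."""
    if not all(isinstance(dim, int) for dim in shape):
        return None
    return prod(shape)


config = json.loads(CONFIG_TEXT)
for model_name, model_cfg in config["models"].items():
    for tensor_name, shape in input_shapes(model_cfg).items():
        print(model_name, tensor_name, shape, element_count(shape))
```

Using one key name for both models would remove the need for the fallback in `input_shapes`.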