Update README.md
README.md
CHANGED
|
@@ -240,7 +240,7 @@ Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.
|
|
| 240 |
| Intended Task/Domain: | Document Understanding |
|
| 241 |
| Model Type: | YOLOX Object Detection for Charts, Tables, Infographics, Header/footers, Texts, and Titles |
|
| 242 |
| Intended User: | Enterprise developers, data scientists, and other technical users who need to extract structural elements from documents. |
|
| 243 |
-
| Output: |
|
| 244 |
| Describe how the model works: | The model identifies objects in an image by first dividing the image into a grid. For each grid cell, it extracts visual features and simultaneously predicts which objects are present (for example, 'chart' or 'table') and where they are located in that cell, all in a single pass through the image. |
|
| 245 |
| Name the adversely impacted groups this has been tested to deliver comparable outcomes regardless of: | Not Applicable |
|
| 246 |
| Technical Limitations & Mitigation: | The model may not generalize to unknown document types/formats not commonly found on the web. Further fine-tuning might be required for such documents. |
|
|
|
|
| 240 |
| Intended Task/Domain: | Document Understanding |
|
| 241 |
| Model Type: | YOLOX Object Detection for Charts, Tables, Infographics, Header/footers, Texts, and Titles |
|
| 242 |
| Intended User: | Enterprise developers, data scientists, and other technical users who need to extract structural elements from documents. |
|
| 243 |
+
| Output: | After post-processing, the output is three NumPy arrays containing the detections: `boxes [N x 4]` (normalized `(x_min, y_min, x_max, y_max)` coordinates), the associated classes `labels [N]`, and the confidence scores `scores [N]`. |
|
| 244 |
| Describe how the model works: | The model identifies objects in an image by first dividing the image into a grid. For each grid cell, it extracts visual features and simultaneously predicts which objects are present (for example, 'chart' or 'table') and where they are located in that cell, all in a single pass through the image. |
|
| 245 |
| Name the adversely impacted groups this has been tested to deliver comparable outcomes regardless of: | Not Applicable |
|
| 246 |
| Technical Limitations & Mitigation: | The model may not generalize to unknown document types/formats not commonly found on the web. Further fine-tuning might be required for such documents. |
|
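
The `Output` row added in this change describes three parallel arrays. A minimal sketch of how a caller might consume them is shown below. It assumes that "normalized" means each coordinate is a fraction of the image width or height in `[0, 1]`; the function name `to_pixel_detections` and the `min_score` threshold are hypothetical and are not part of the model's API.

```python
from typing import List, Sequence, Tuple

# (label, score, (x_min, y_min, x_max, y_max) in pixels)
Detection = Tuple[int, float, Tuple[int, int, int, int]]


def to_pixel_detections(
    boxes: Sequence[Sequence[float]],
    labels: Sequence[int],
    scores: Sequence[float],
    width: int,
    height: int,
    min_score: float = 0.5,  # hypothetical threshold, tune per use case
) -> List[Detection]:
    """Convert normalized boxes to pixel coordinates and drop low scores.

    Works with plain lists or NumPy arrays, since it only iterates over rows.
    Assumes coordinates are fractions of the image size in [0, 1].
    """
    detections: List[Detection] = []
    for box, label, score in zip(boxes, labels, scores):
        if float(score) < min_score:
            continue  # skip low-confidence detections
        x_min, y_min, x_max, y_max = (float(v) for v in box)
        pixel_box = (
            int(round(x_min * width)),
            int(round(y_min * height)),
            int(round(x_max * width)),
            int(round(y_max * height)),
        )
        detections.append((int(label), float(score), pixel_box))
    return detections
```

Calling `to_pixel_detections(boxes, labels, scores, width=page_width, height=page_height)` on the post-processed output yields one tuple per kept detection. The integer label is passed through unchanged because the mapping from label index to class name is not stated in this diff.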