TheoViel committed
Commit 3eef38a · verified · 1 Parent(s): d45b8b3

Update README.md

Files changed (1):
  1. README.md +1 -1

README.md
@@ -240,7 +240,7 @@ Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.
 | Intended Task/Domain: | Document Understanding |
 | Model Type: | YOLOX Object Detection for Charts, Tables, Infographics, Header/footers, Texts, and Titles |
 | Intended User: | Enterprise developers, data scientists, and other technical users who need to extract structural elements from documents. |
-| Output: | A List of dictionaries containing lists of dictionaries of floating point numbers (representing bounding box information). <br> **Example**: `{"data": [{"index": 0,"bounding_boxes": {"table": [{"x_min": 0.6503,"y_min": 0.2161,"x_max": 0.7835,"y_max": 0.3236,"confidence": 0.9306}]}}]}` |
+| Output: | After post-processing, the output consists of three NumPy arrays containing the detections: `boxes [N x 4]` (in normalized `(x_min, y_min, x_max, y_max)` format), the associated classes `labels [N]`, and confidence scores `scores [N]`. |
 | Describe how the model works: | The model identifies objects in an image by first dividing the image into a grid. For each grid cell, it extracts visual features and simultaneously predicts which objects are present (for example, 'chart' or 'table') and where they are located in that cell, all in a single pass through the image. |
 | Name the adversely impacted groups this has been tested to deliver comparable outcomes regardless of: | Not Applicable |
 | Technical Limitations & Mitigation: | The model may not generalize to unknown document types/formats not commonly found on the web. Further fine-tuning might be required for such documents. |
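The updated Output row can be illustrated with a minimal sketch of consuming the three post-processed arrays. The array values, the class-index mapping, and the `to_pixels` helper below are illustrative assumptions, not part of the model's actual API:

```python
import numpy as np

# Hypothetical post-processed detections for one page image (shapes as in the
# model card): boxes [N x 4] normalized (x_min, y_min, x_max, y_max),
# labels [N], scores [N].
boxes = np.array([[0.6503, 0.2161, 0.7835, 0.3236]])
labels = np.array([1])
scores = np.array([0.9306])

# Illustrative class-index mapping (the real mapping may differ).
CLASS_NAMES = {0: "chart", 1: "table", 2: "infographic"}

def to_pixels(boxes: np.ndarray, width: int, height: int) -> np.ndarray:
    """Scale normalized (x_min, y_min, x_max, y_max) boxes to pixel coords."""
    scale = np.array([width, height, width, height], dtype=float)
    return boxes * scale

pixel_boxes = to_pixels(boxes, width=1000, height=800)
for box, label, score in zip(pixel_boxes, labels, scores):
    print(CLASS_NAMES[int(label)], box.round(1), round(float(score), 4))
```

Because the boxes are normalized, the same detections can be projected onto the page at any render resolution by scaling with the target width and height.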