shixuanleong/visualheist-base

Image-Text-to-Text

yifeihu committed on Jul 10, 2024

Commit 26b1a96 · verified · 1 Parent(s): 26b1b3e

Update README.md

Files changed (1)

README.md +20 -0

@@ -26,9 +26,29 @@ TF-ID models take an image of a single paper page as the input, and return bound
 TF-ID-base and TF-ID-large draw bounding boxes around tables/figures and their caption text.
 TF-ID-base-no-caption and TF-ID-large-no-caption draw bounding boxes around tables/figures without their caption text.
 Object Detection results format:
 {'\<OD>': {'bboxes': [[x1, y1, x2, y2], ...],
 'labels': ['label1', 'label2', ...]} }
 ## How to Get Started with the Model

 TF-ID-base and TF-ID-large draw bounding boxes around tables/figures and their caption text.
 TF-ID-base-no-caption and TF-ID-large-no-caption draw bounding boxes around tables/figures without their caption text.
+![image/png](https://huggingface.co/yifeihu/TF-ID-base/resolve/main/td-id-caption.png)
 Object Detection results format:
 {'\<OD>': {'bboxes': [[x1, y1, x2, y2], ...],
 'labels': ['label1', 'label2', ...]} }
+## Benchmarks
+We tested the models on paper pages outside the training dataset. The papers are a subset of Hugging Face Daily Papers.
+Correct output: the model draws correct bounding boxes for every table/figure on the given page.
+| Model                                                         | Total Images | Correct Output | Success Rate |
+|---------------------------------------------------------------|--------------|----------------|--------------|
+| TF-ID-base[[HF]](https://huggingface.co/yifeihu/TF-ID-base)   | 258          | 251            | 97.29%       |
+| TF-ID-large[[HF]](https://huggingface.co/yifeihu/TF-ID-large) | 258          | 253            | 98.06%       |
+| Model                                                         | Total Images | Correct Output | Success Rate |
+|---------------------------------------------------------------|--------------|----------------|--------------|
+| TF-ID-base-no-caption[[HF]](https://huggingface.co/yifeihu/TF-ID-base-no-caption)   | 261          | 253            | 96.93%       |
+| TF-ID-large-no-caption[[HF]](https://huggingface.co/yifeihu/TF-ID-large-no-caption) | 261          | 254            | 97.32%       |
+Depending on the use case, some "incorrect" outputs may still be usable. For example, the model may draw two bounding boxes for a single figure that has two child components.
 ## How to Get Started with the Model
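
For readers consuming the Object Detection results format quoted in the diff, here is a minimal sketch of one way to post-process it. It assumes the dictionary key is the literal string `<OD>` (the backslash in the README looks like markdown escaping), and the label names `figure` and `table` and all coordinates are hypothetical placeholders, not values taken from the model. The `union_box` helper illustrates the "two bounding boxes for one figure" case from the Benchmarks note by merging the pieces into one enclosing box; it is not part of the TF-ID code.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def boxes_by_label(result: dict, task: str = "<OD>") -> Dict[str, List[Box]]:
    """Group bounding boxes by label from a result dict in the README's format."""
    detections = result[task]
    grouped: Dict[str, List[Box]] = {}
    # bboxes and labels are parallel lists: the i-th label describes the i-th box
    for bbox, label in zip(detections["bboxes"], detections["labels"]):
        grouped.setdefault(label, []).append(tuple(bbox))
    return grouped


def union_box(boxes: List[Box]) -> Box:
    """Return the smallest box that encloses every box in the list."""
    x1s, y1s, x2s, y2s = zip(*boxes)
    return (min(x1s), min(y1s), max(x2s), max(y2s))


# Hypothetical output: one figure detected as two side-by-side parts, plus a table.
example = {
    "<OD>": {
        "bboxes": [[10, 20, 110, 220], [120, 20, 230, 220], [15, 300, 225, 380]],
        "labels": ["figure", "figure", "table"],
    }
}

grouped = boxes_by_label(example)
merged_figure = union_box(grouped["figure"])
```

Whether merging is appropriate depends on the page: taking the union of every box with the same label would also fuse two unrelated figures, so a real pipeline would likely merge only boxes that are adjacent or overlapping.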