Update README.md
README.md
@@ -14,7 +14,7 @@ license: apache-2.0

 <div align="center">

-<h1>DocExplainer:
+<h1>DocExplainer: Document VQA with Bounding Box Localization</h1>

 [](https://creativecommons.org/licenses/by/4.0/)
 <!-- []() -->
@@ -24,7 +24,7 @@ license: apache-2.0

 ## Model description

-DocExplainer is a an approach to Visual
+DocExplainer is a an approach to Document Visual Question Answering (Document VQA) with bounding box localization.
 Unlike standard VLMs that only provide text-based answers, DocExplainer adds **visual evidence through bounding boxes**, making model predictions more interpretable.
 It is designed as a **plug-and-play module** to be combined with existing Vision-Language Models (VLMs), decoupling answer generation from spatial grounding.

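The model description in this change mentions decoupling answer generation from spatial grounding. The sketch below illustrates that two-stage idea only. Every name in it (`GroundedAnswer`, `answer_with_evidence`, `dummy_vlm`, `dummy_grounder`) is hypothetical and not taken from the DocExplainer code, and the assumption that the grounding stage receives the image, the question, and the answer text is mine, not the README's.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple

# Hypothetical types for illustration; not part of DocExplainer.
# Assumed box layout: (x_min, y_min, x_max, y_max), normalized to [0, 1].
BBox = Tuple[float, float, float, float]


@dataclass
class GroundedAnswer:
    answer: str  # text answer produced by the VLM stage
    bbox: BBox   # box produced by the separate grounding stage


def answer_with_evidence(
    image: Any,
    question: str,
    answer_fn: Callable[[Any, str], str],
    ground_fn: Callable[[Any, str, str], BBox],
) -> GroundedAnswer:
    """Run the two stages independently and combine their outputs."""
    # Stage 1: any VLM can be plugged in here to produce the text answer.
    answer = answer_fn(image, question)
    # Stage 2: a separate module localizes the answer on the page
    # (its inputs are an assumption of this sketch).
    bbox = ground_fn(image, question, answer)
    return GroundedAnswer(answer=answer, bbox=bbox)


# Stand-in components so the sketch runs without any model weights.
def dummy_vlm(image: Any, question: str) -> str:
    return "42.00 EUR"


def dummy_grounder(image: Any, question: str, answer: str) -> BBox:
    return (0.1, 0.2, 0.4, 0.25)


result = answer_with_evidence(None, "What is the total?", dummy_vlm, dummy_grounder)
print(result)
```

Because the two stages only meet in `answer_with_evidence`, either stand-in could be swapped for a real component without touching the other, which is the sense of "plug-and-play" this sketch assumes.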