letxbe/DocExplainer

Visual Question Answering


AlessioChenn committed on Aug 29, 2025

Commit f24e585 (verified) · 1 Parent(s): bd7b907

Update README.md

Files changed (1)

README.md +2 -2


@@ -14,7 +14,7 @@ license: apache-2.0
 <div align="center">
-<h1>DocExplainer: Visual Document QA with Bounding Box Localization</h1>
+<h1>DocExplainer: Document VQA with Bounding Box Localization</h1>
 [![License: CC BY 4.0](https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg)](https://creativecommons.org/licenses/by/4.0/)
 <!-- [![arXiv](https://img.shields.io/badge/arXiv-2501.03403-b31b1b.svg)]() -->
@@ -24,7 +24,7 @@ license: apache-2.0
 ## Model description
-DocExplainer is a an approach to Visual Document Question Answering (Document VQA) with bounding box localization.
+DocExplainer is a an approach to Document Visual Question Answering (Document VQA) with bounding box localization.
 Unlike standard VLMs that only provide text-based answers, DocExplainer adds **visual evidence through bounding boxes**, making model predictions more interpretable.
 It is designed as a **plug-and-play module** to be combined with existing Vision-Language Models (VLMs), decoupling answer generation from spatial grounding.