letxbe/DocExplainer

Visual Question Answering

AlessioChenn committed on Aug 29, 2025

Commit 62e7389 (verified) · 1 Parent(s): a0fa032

Update README.md

Files changed (1)

README.md +1 -1

@@ -24,7 +24,7 @@ license: apache-2.0
 ## Model description
-DocExplainerV0 is a **first-step approach** to Visual Document Question Answering (VQA) with bounding box localization.
+DocExplainerV0 is a **first-step approach** to Visual Document Question Answering with bounding box localization.
 Unlike standard VLMs that only provide text-based answers, DocExplainerV0 adds **visual evidence through bounding boxes**, making model predictions more interpretable.
 It is designed as a **plug-and-play module** to be combined with existing Vision-Language Models (VLMs), decoupling answer generation from spatial grounding.