XiaoboX committed on
Commit
a2249ea
·
verified ·
1 Parent(s): 32f9d5b

Update README

Files changed (1)
  1. README.md +106 -3
README.md CHANGED
@@ -1,3 +1,106 @@
1
- ---
2
- license: mit
3
- ---
1
+ ---
2
+ license: mit
3
+ language:
4
+ - en
5
+ metrics:
6
+ - accuracy
7
+ - bleu
8
+ base_model:
9
+ - AIDC-AI/Ovis2-8B
10
+ - tablegpt/TableGPT2-7B
11
+ pipeline_tag: table-question-answering
12
+ tags:
13
+ - code
14
+ ---
15
+ # TableDART Gating Network Checkpoint
16
+
17
+ This repository provides the trained gating network checkpoint for **TableDART: Dynamic Adaptive Multi-Modal Routing for Table Understanding**.
18
+
19
+ TableDART is a training-efficient framework that dynamically routes each table-query pair through the most appropriate reasoning path (Text-only, Image-only, or Fusion) while keeping all pretrained expert models **frozen**.
20
+
21
+ ---
22
+
23
+ ## 🔍 Overview
24
+
25
+ Modeling semantic and structural information from tabular data remains a core challenge for effective table understanding.
26
+ Existing LLM-based approaches face several limitations:
27
+
28
+ - Table-as-Text methods flatten tables into text sequences, losing structural cues.
29
+ - Table-as-Image methods preserve layout but struggle with precise semantics.
30
+ - Static multimodal methods process all modalities for every query, introducing redundancy and potential cross-modal conflicts.
31
+ - Most approaches require expensive fine-tuning of large LLMs or multimodal models.
32
+
33
+ **Our Solution: TableDART** addresses these limitations through:
34
+
35
+ - Reusing pretrained single-modality expert models (kept frozen, plug-and-play)
36
+ - Learning only a lightweight 2.59M-parameter MLP gating network
37
+ - Dynamically selecting the optimal path for each table-query pair (instance-level)
38
+ - Introducing an LLM agent that mediates cross-modal knowledge integration when needed
39
+
40
+ This design avoids full LLM/MLLM fine-tuning, reduces computational redundancy, and achieves a strong efficiency-performance trade-off.
41
+
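The instance-level routing described above can be illustrated with a minimal sketch. It is not the released implementation: the layer sizes, the input feature vector, the ReLU activation, and the argmax selection rule are all assumptions made for illustration, and the tiny MLP here is far smaller than the 2.59M-parameter gating network in this checkpoint. The actual architecture and inference pipeline are in the GitHub repository.

```python
import math
import random

# The three reasoning paths named in this README.
PATHS = ("text_only", "image_only", "fusion")


def mlp_param_count(layer_sizes):
    """Count weights and biases of a fully connected MLP."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def init_mlp(layer_sizes, seed=0):
    """Create small random weights (hypothetical; a real gate is trained)."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gate(layers, features):
    """Forward pass: ReLU on hidden layers, softmax over the path logits."""
    h = list(features)
    for i, (weights, biases) in enumerate(layers):
        h = [sum(w * x for w, x in zip(row, h)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            h = [max(0.0, v) for v in h]
    return softmax(h)


def route(layers, features):
    """Pick the highest-probability path for one table-query pair."""
    probs = gate(layers, features)
    best = max(range(len(PATHS)), key=lambda i: probs[i])
    return PATHS[best], probs


# Hypothetical 8-dimensional feature vector for one table-query pair.
layers = init_mlp([8, 16, len(PATHS)])
path, probs = route(layers, [0.5] * 8)
print(path, [round(p, 3) for p in probs])
```

Because only the gate's parameters are trained, `mlp_param_count` shows why the approach is cheap: the trainable budget depends on the gate's layer sizes alone, not on the size of the frozen expert models.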
42
+ ---
43
+
44
+ ## 🚀 Performance
45
+
46
+ Across 7 benchmarks, TableDART:
47
+
48
+ - Achieves state-of-the-art results on 4/7 benchmarks among open-source models
49
+ - Outperforms the strongest baseline by +4.02% accuracy on average
50
+ - Maintains significant computational efficiency gains
51
+
52
+
53
+ ## 📦 What This Checkpoint Contains
54
+
55
+ This Hugging Face model repository includes:
56
+
57
+ - The trained MLP gating network checkpoint
58
+
59
+ โš ๏ธ Note: This checkpoint does not include the pretrained text or image expert models. Please load those separately according to the official repository instructions.
60
+
61
+ ---
62
+
63
+ ## 🛠 Code and Usage
64
+
65
+ Full training scripts, inference pipelines, and reproduction details are available in our GitHub repository: https://github.com/xiaobo-xing/TableDART
66
+
67
+ ---
68
+
69
+ ## 📄 Paper
70
+
71
+ ICLR 2026 OpenReview Version:
72
+ https://openreview.net/forum?id=4aZTiLH3fm
73
+
74
+ arXiv Version:
75
+ https://arxiv.org/abs/2509.14671
76
+
77
+ ---
78
+
79
+ ## 📚 Citation
80
+
81
+ If you find TableDART helpful, please cite our paper and consider starring the repository.
82
+
83
+ ### ICLR 2026 Version
84
+
85
+ ```bibtex
86
+ @inproceedings{xing2026tabledart,
87
+ title={Table{DART}: Dynamic Adaptive Multi-Modal Routing for Table Understanding},
88
+ author={Xiaobo Xing and Wei Yuan and Tong Chen and Quoc Viet Hung Nguyen and Xiangliang Zhang and Hongzhi Yin},
89
+ booktitle={The Fourteenth International Conference on Learning Representations},
90
+ year={2026},
91
+ url={https://openreview.net/forum?id=4aZTiLH3fm}
92
+ }
93
+ ```
94
+
95
+ ### arXiv Version
96
+ ```bibtex
97
+ @misc{xing2025tabledartdynamicadaptivemultimodal,
98
+ title={TableDART: Dynamic Adaptive Multi-Modal Routing for Table Understanding},
99
+ author={Xiaobo Xing and Wei Yuan and Tong Chen and Quoc Viet Hung Nguyen and Xiangliang Zhang and Hongzhi Yin},
100
+ year={2025},
101
+ eprint={2509.14671},
102
+ archivePrefix={arXiv},
103
+ primaryClass={cs.CL},
104
+ url={https://arxiv.org/abs/2509.14671}
105
+ }
106
+ ```