---
license: mit
language:
- en
metrics:
- accuracy
- bleu
pipeline_tag: table-question-answering
tags:
- code
---

# TableDART Gating Network Checkpoint

This repository provides the trained gating network checkpoint for **TableDART: Dynamic Adaptive Multi-Modal Routing for Table Understanding**. TableDART is a training-efficient framework that dynamically routes each table-query pair through the most appropriate reasoning path (Text-only, Image-only, or Fusion) while keeping all pretrained expert models **frozen**.

---

## 🔍 Overview

Modeling semantic and structural information from tabular data remains a core challenge for effective table understanding. Existing LLM-based approaches face several limitations:

- Table-as-Text methods flatten tables into text sequences, losing structural cues.
- Table-as-Image methods preserve layout but struggle with precise semantics.
- Static multimodal methods process all modalities for every query, introducing redundancy and potential cross-modal conflicts.
- Most approaches require expensive fine-tuning of large LLMs or multimodal models.

**Our Solution: TableDART** addresses these limitations by:

- Reusing pretrained single-modality expert models (kept frozen, plug-and-play)
- Learning only a lightweight 2.59M-parameter MLP gating network
- Dynamically selecting the optimal path for each table-query pair (instance-level routing)
- Introducing an LLM agent that mediates cross-modal knowledge integration when needed

This design avoids full LLM/MLLM fine-tuning, reduces computational redundancy, and maintains a strong efficiency-performance trade-off.
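The instance-level routing described above can be sketched roughly as follows. This is a minimal illustrative toy, not TableDART's actual gating network: the feature dimensions, weights, and function names are all assumptions, and the real model is a learned 2.59M-parameter MLP rather than a single linear layer.

```python
import math

# Candidate reasoning paths, each backed by a frozen expert model.
PATHS = ["text", "image", "fusion"]

def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def gate(features, weights, bias):
    """One linear layer + softmax: per-path routing probabilities
    for a single table-query feature vector (illustrative only)."""
    logits = [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)

def route(features, weights, bias):
    """Pick the reasoning path with the highest gating probability."""
    probs = gate(features, weights, bias)
    return PATHS[max(range(len(PATHS)), key=lambda i: probs[i])]

# Toy parameters: 3 paths x 4-dimensional feature vector.
W = [[0.2, -0.1, 0.0, 0.3],
     [-0.3, 0.4, 0.1, 0.0],
     [0.1, 0.1, 0.1, 0.1]]
b = [0.0, 0.0, 0.0]

choice = route([1.0, 0.0, 0.5, 0.2], W, b)  # → "text" for these toy weights
```

Because only the gating parameters are trained, the expert models stay frozen, which is what keeps the approach training-efficient.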
---

## 🚀 Performance

Across 7 benchmarks, TableDART:

- Achieves state-of-the-art results on 4 of 7 benchmarks among open-source models
- Outperforms the strongest baseline by +4.02% accuracy on average
- Maintains significant computational efficiency gains

## 📦 What This Checkpoint Contains

This Hugging Face model includes:

- The trained MLP gating network checkpoint

⚠️ Note: This checkpoint does not include the pretrained text or image expert models. Please load those separately according to the official repository instructions.

---

## 🛠 Code and Usage

Full training scripts, inference pipelines, and reproduction details are available in our GitHub repository: https://github.com/xiaobo-xing/TableDART

---

## 📄 Paper

- ICLR 2026 OpenReview version: https://openreview.net/forum?id=4aZTiLH3fm
- arXiv version: https://arxiv.org/abs/2509.14671

---

## 📚 Citation

If you find TableDART helpful, please cite our paper and consider starring the repository.

### ICLR 2026 Version

```bibtex
@inproceedings{xing2026tabledart,
  title={Table{DART}: Dynamic Adaptive Multi-Modal Routing for Table Understanding},
  author={Xiaobo Xing and Wei Yuan and Tong Chen and Quoc Viet Hung Nguyen and Xiangliang Zhang and Hongzhi Yin},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=4aZTiLH3fm}
}
```

### ArXiv Version

```bibtex
@misc{xing2025tabledartdynamicadaptivemultimodal,
  title={TableDART: Dynamic Adaptive Multi-Modal Routing for Table Understanding},
  author={Xiaobo Xing and Wei Yuan and Tong Chen and Quoc Viet Hung Nguyen and Xiangliang Zhang and Hongzhi Yin},
  year={2025},
  eprint={2509.14671},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2509.14671}
}
```