---
base_model:
- llava-hf/llama3-llava-next-8b-hf
- openbmb/MiniCPM-V-2_6
- microsoft/Phi-3-vision-128k-instruct
- Qwen/Qwen2.5-VL-7B-Instruct
license: mit
metrics:
- accuracy
pipeline_tag: image-text-to-text
library_name: transformers
---

**The following models are obtained via supervised fine-tuning (SFT) on the [ECD-10k-Images dataset](https://huggingface.co/datasets/ChartFoundation/ECD-10k-Images), proposed in our ICCV 2025 paper, "[Effective Training Data Synthesis for Improving MLLM Chart Understanding](https://huggingface.co/papers/2508.06492)" ([Code](https://github.com/yuweiyang-anu/ECD)).**

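For readers who want to inspect the training data, the following is a minimal sketch of fetching the dataset repository linked above. It assumes the `huggingface_hub` package is installed; the helper name `download_ecd_dataset` and the default local directory are illustrative choices, not part of the released code.

```python
# Sketch: fetch the ECD-10k-Images dataset repository from the Hugging Face Hub.
# Assumptions: `huggingface_hub` is installed; the helper name and the local
# directory are hypothetical and not taken from the paper's code base.

DATASET_REPO_ID = "ChartFoundation/ECD-10k-Images"  # repo id from the dataset link


def download_ecd_dataset(local_dir: str = "ECD-10k-Images") -> str:
    """Download a snapshot of the dataset repo and return its local path."""
    # Imported lazily so this module can be loaded without the dependency.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=DATASET_REPO_ID,
        repo_type="dataset",  # the repo lives under huggingface.co/datasets/
        local_dir=local_dir,
    )
```

Calling `download_ecd_dataset()` requires network access and returns the directory that holds the downloaded files. The layout of those files is defined by the dataset repository itself and is not assumed here.
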
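**ECD Dataset Overview**:

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6666a432b78b8b6b34a816e9/LIwXSgnmfF4Q1LwOxZ0zV.png)
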
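**Comparison of four MLLMs on six test sets (CharXiv, ChartQA, ReachQA, ChartBench, ChartX, ECDBench)**:

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6666a432b78b8b6b34a816e9/bQ9rVNzTx2JvCUNpeE3tT.png)
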
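The card metadata lists accuracy as the metric. As a rough illustration of how per-benchmark accuracy can be tallied, here is a minimal sketch that assumes exact-match scoring after light normalization. Each benchmark defines its own scoring protocol, which this sketch does not reproduce; the function names and the toy records are hypothetical and are not results from the paper.

```python
from collections import defaultdict


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(answer.strip().lower().split())


def accuracy_by_benchmark(records):
    """Return {benchmark: accuracy} for an iterable of (benchmark, prediction, reference)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for benchmark, prediction, reference in records:
        total[benchmark] += 1
        # Assumption: exact match after normalization counts as correct.
        if normalize(prediction) == normalize(reference):
            correct[benchmark] += 1
    return {name: correct[name] / total[name] for name in total}


# Hypothetical toy records for illustration only, NOT results from the paper.
toy_records = [
    ("ChartQA", "42", "42"),
    ("ChartQA", "Blue", " blue "),
    ("ChartQA", "7", "8"),
    ("ECDBench", "2019", "2019"),
]
scores = accuracy_by_benchmark(toy_records)
```

With the toy records, two of the three ChartQA answers match and the single ECDBench answer matches, so `scores` maps each benchmark name to the corresponding fraction.
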
**Citation**:

If this work is helpful to your research, please cite our paper as follows:

```
@inproceedings{yang2025effective,
  title={Effective Training Data Synthesis for Improving MLLM Chart Understanding},
  author={Yang, Yuwei and Zhang, Zeyu and Hou, Yunzhong and Li, Zhuowan and Liu, Gaowen and Payani, Ali and Ting, Yuan-Sen and Zheng, Liang},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2025}
}
```