# CLIP Model for Chart Understanding

This repository contains the CLIP model implementation from our paper "[On the Perception Bottleneck of VLMs for Chart Understanding](https://arxiv.org/abs/2503.18435)".

## Overview

This CLIP model is specifically trained to address the perception bottleneck that Vision Language Models (VLMs) face when processing and understanding charts and visualizations. Our work studies how the CLIP vision encoder affects the LVLMs built on top of it, and how that effect can be improved.

## Model Details

- Model architecture: fine-tuned from openai/clip-vit-large-patch14-336
- Training data: our collected chart data with synthetic hard negatives (the [Vision4Chart Dataset](https://huggingface.co/datasets/Junteng/Vision4Chart))
- Training method: NegCLIP training (see the sketch after this list)
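
To make the training method concrete: NegCLIP-style training augments CLIP's contrastive objective with hard-negative captions, so the image-to-text loss must rank a chart's true caption above near-miss distractors. The PyTorch sketch below is a minimal illustration of that idea, not the authors' training code; the function name, tensor shapes, and temperature value are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def negclip_style_loss(image_emb, text_emb, neg_text_emb, temperature=0.07):
    """image_emb: (B, D) chart embeddings; text_emb: (B, D) true captions;
    neg_text_emb: (B, K, D) synthetic hard-negative captions per chart."""
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    neg_text_emb = F.normalize(neg_text_emb, dim=-1)

    B, K, D = neg_text_emb.shape
    # Texts each image is scored against: the B in-batch captions
    # plus all B*K hard negatives appended to the candidate set.
    all_texts = torch.cat([text_emb, neg_text_emb.reshape(B * K, D)], dim=0)

    logits_i2t = image_emb @ all_texts.t() / temperature  # (B, B + B*K)
    logits_t2i = text_emb @ image_emb.t() / temperature   # (B, B)
    targets = torch.arange(B, device=image_emb.device)

    # Image->text must identify the true caption among in-batch captions
    # and hard negatives; text->image is the usual in-batch direction.
    return 0.5 * (F.cross_entropy(logits_i2t, targets)
                  + F.cross_entropy(logits_t2i, targets))
```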
## Citation

If you find this model useful in your research, please consider citing our paper:

```bibtex
@misc{liu2025perceptionbottleneckvlmschart,
      title={On the Perception Bottleneck of VLMs for Chart Understanding},
      author={Junteng Liu and Weihao Zeng and Xiwen Zhang and Yijun Wang and Zifei Shan and Junxian He},
      year={2025},
      eprint={2503.18435},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2503.18435},
}
```