How to use Junteng/Chart_CLIP with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-feature-extraction", model="Junteng/Chart_CLIP")
```

```python
# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("Junteng/Chart_CLIP", dtype="auto")
```
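Once image features have been extracted with the snippets above, a common next step for CLIP-style models is to compare them against text embeddings by cosine similarity. The following is a minimal sketch of that comparison in plain Python. The helper names (`l2_normalize`, `cosine_similarity`, `rank_captions`) and the toy vectors are hypothetical and are not part of the Chart_CLIP repository; real embeddings would come from the model and have many more dimensions.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (hypothetical helper, not from the repo)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def rank_captions(image_embedding, caption_embeddings):
    """Sort captions by similarity to one image embedding, best match first.

    caption_embeddings maps caption text to its embedding vector.
    """
    scored = [
        (caption, cosine_similarity(image_embedding, embedding))
        for caption, embedding in caption_embeddings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy 2-dimensional vectors standing in for real model outputs (assumption).
    image_vec = [1.0, 0.0]
    captions = {"bar chart": [2.0, 0.0], "line chart": [0.0, 3.0]}
    for caption, score in rank_captions(image_vec, captions):
        print(f"{caption}: {score:.3f}")
```

Normalizing before the dot product keeps the score independent of embedding magnitude, which is the usual convention when comparing CLIP image and text features.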
Improve model card: Add metadata and GitHub link (#2)
- Improve model card: Add metadata and GitHub link (02c50a777bbe412ddd6216e40fd5f5031c429a02)
Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>
README.md
CHANGED
@@ -1,7 +1,16 @@
+---
+pipeline_tag: image-feature-extraction
+library_name: transformers
+datasets:
+- Junteng/Vision4Chart
+---
+
 # CLIP Model for Chart Understanding

 This repository contains the CLIP model implementation from our paper "[On the Perception Bottleneck of VLMs for Chart Understanding](https://arxiv.org/abs/2503.18435)".

+**Code**: https://github.com/hkust-nlp/Vision4Chart
+
 ## Overview

 This CLIP model is specifically trained to address the perception bottleneck in Vision Language Models (VLMs) when processing and understanding charts and visualizations. Our work explores and aims to improve how CLIP affects its LVLMs.
@@ -29,5 +38,4 @@ If you find this model useful in your research, please consider citing our paper
 primaryClass={cs.CV},
 url={https://arxiv.org/abs/2503.18435},
 }
-```
-
+```