---
library_name: peft
license: apache-2.0
base_model: ibm-granite/granite-vision-4.1-4b
pipeline_tag: image-text-to-text
tags:
- pytorch
- lora
- chart-understanding
---
<a id="top"></a>
<div align="center">
<h1>🚀 ChartLens @ CVPR 2026 DataMFM Chart Understanding Challenge</h1>
<p>
<b>Hao Liu</b><sup>1</sup>&nbsp;
<b>Ruping Cao</b><sup>1</sup>&nbsp;
<b>Kun Wang</b><sup>1</sup>&nbsp;
<b>Zhiran Li</b><sup>1</sup>&nbsp;
<b>Fan Liu</b><sup>2</sup>&nbsp;
<b>Yupeng Hu</b><sup>1</sup>&nbsp;
<b>Liqiang Nie</b><sup>3</sup>
</p>
<p>
<sup>1</sup>Shandong University<br>
<sup>2</sup>Southeast University<br>
<sup>3</sup>Harbin Institute of Technology (Shenzhen)
</p>
</div>
This repository hosts the official implementation resources, model weights, and prediction files for **ChartLens**, the champion solution for **DataMFM Challenge Track 2: Chart Understanding** at CVPR 2026.
🔗 **Paper:** [ChartLens: A Dual-Branch Framework for Chart Data Correction and Factual Summary Refinement](https://huggingface.co/papers/2606.10640)
🔗 **GitHub Repository:** [iLearnLab/CVPRW26-ChartLens](https://github.com/iLearnLab/CVPRW26-ChartLens)
🔗 **Challenge Page:** [DataMFM Challenge](https://datamfm.github.io/challenge.html)
---
## 📌 Model Information
### 1. Model Name
**ChartLens: A Dual-Branch Framework for Chart Data Correction and Factual Summary Refinement**
### 2. Task Type & Applicable Tasks
- **Task Type:** Chart Understanding / Multimodal Document Understanding
- **Applicable Tasks:** Chart-to-CSV extraction and chart-to-summary generation from chart images.
### 3. Project Introduction
Chart understanding requires models to recover structured chart data and generate faithful natural-language summaries from chart images. **ChartLens** addresses these complementary goals with a dual-branch, verification-guided correction framework.
> 💡 **Method Highlight:** ChartLens combines Granite-Vision-4.1-4B LoRA adaptation with two correction branches: **Structure-Aware CSV Verification and Correction (SAVC)** for reliable table recovery, and **Text-Retention-Guided Summary Refinement (TRSR)** for OCR-assisted factual summary repair. SAVC checks structure, completeness, and numerical accuracy, while TRSR preserves visible chart text such as titles, legends, annotations, sources, and numerical evidence.
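
To make the two kinds of checks concrete, here is a minimal sketch of a structural CSV check and a numeric text-retention check. It is an illustration only, not the released SAVC or TRSR code: the function names `check_csv_structure` and `missing_numbers` and the specific rules (non-blank header, consistent row width, numbers present in OCR text but absent from the summary) are assumptions chosen for the example.

```python
import csv
import io
import re


def check_csv_structure(csv_text: str) -> list[str]:
    """Return structural problems found in a predicted CSV table (hypothetical helper)."""
    rows = [r for r in csv.reader(io.StringIO(csv_text.strip())) if r]
    if not rows:
        return ["empty table"]
    problems = []
    header, body = rows[0], rows[1:]
    if any(not cell.strip() for cell in header):
        problems.append("blank header cell")
    if not body:
        problems.append("no data rows")
    # Every data row should have exactly as many cells as the header.
    for i, row in enumerate(body, start=1):
        if len(row) != len(header):
            problems.append(f"row {i}: expected {len(header)} cells, got {len(row)}")
    return problems


# Matches integers and decimals, with an optional leading minus sign.
NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def missing_numbers(ocr_text: str, summary: str) -> set[str]:
    """Numbers visible in the chart's OCR text that the summary never mentions."""
    return set(NUMBER.findall(ocr_text)) - set(NUMBER.findall(summary))


if __name__ == "__main__":
    print(check_csv_structure("Year,Sales\n2020,10\n2021\n"))
    print(missing_numbers("Revenue 2021: 12.5 and 2020: 10", "Revenue rose to 12.5 in 2021."))
```

A correction stage could use the returned problems and missing numbers to decide whether a prediction needs another pass; how ChartLens actually makes that decision is described in the paper.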
### 4. Training Data Source
- Released ChartNet-based training data for LoRA adaptation.
- DataMFM Challenge chart understanding splits, including `real` and `synthetic` chart images.
### 5. Challenge Results
| Method | CSV Numeric F1 | CSV Structural Score | Summary ROUGE-L | Summary Numeric Fact F1 | Overall |
|--------|---------------:|---------------------:|----------------:|------------------------:|--------:|
| **ChartLens (Ours)** | **80.62** | **75.66** | **45.57** | **74.55** | **69.10** |
ChartLens ranked **1st** in DataMFM Challenge Track 2.
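
The table does not state how the Overall column is computed. The reported value is consistent with an unweighted mean of the four metric columns, which the short check below reproduces. Treat this as an observation about the numbers rather than the official scoring formula; the dictionary keys are illustrative names.

```python
# Scores copied from the results table above.
scores = {
    "csv_numeric_f1": 80.62,
    "csv_structural_score": 75.66,
    "summary_rouge_l": 45.57,
    "summary_numeric_fact_f1": 74.55,
}

# Unweighted mean of the four metrics, rounded to two decimals.
overall = round(sum(scores.values()) / len(scores), 2)
print(f"{overall:.2f}")  # 69.10, matching the Overall column
```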
---
## 🚀 Usage & Basic Inference
### Step 1: Prepare the Environment
Clone the GitHub repository and set up the Conda environment:
```bash
git clone https://github.com/iLearnLab/CVPRW26-ChartLens.git
cd CVPRW26-ChartLens
```
```bash
conda create -n chartlens python=3.10 -y
conda activate chartlens
pip install -r requirements.txt
```
### Step 2: Data & Weights Preparation
1. **Challenge Data:** Use the datasets and splits released by the [DataMFM Challenge](https://datamfm.github.io/challenge.html). The chart understanding track contains `real` and `synthetic` splits.
2. **ChartLens Checkpoints:** Download the model weights from this Hugging Face repository.
3. **Granite Vision Backbone:** Download the Granite-Vision-4.1-4B backbone and pass its local path via the `--model_path` argument when running inference.
To prepare ChartNet SFT data for LoRA training:
```bash
# Load 500 ChartNet samples into the raw dataset directory.
python code/load_chartnet_500.py \
  --out_dir Fine-tuning/Dataset/raw \
  --num_samples 500

# Build the SFT dataset from the raw ground truth and images.
python code/build_chartnet_sft.py \
  --gt_path Fine-tuning/Dataset/raw/gt.jsonl \
  --image_dir Fine-tuning/Dataset/raw/images \
  --out_dir Fine-tuning/Dataset/sft \
  --csv_repeat 2 \
  --summary_repeat 1
```
### Step 3: Run Granite Vision + LoRA Inference
```bash
python code/infer_granite_with_lora.py \
  --image_root /path/to/data \
  --out_root /path/to/output \
  --model_path /path/to/granite-vision-4.1-4b \
  --lora_path /path/to/chartlens_lora \
  --gpu_id 0 \
  --splits real synthetic
```
Use `code/infer_chartnet_granite.py` for base Granite Vision inference without a LoRA adapter.
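
If you prefer to drive the LoRA inference script from Python (for example, to launch one run per GPU), the command can be assembled as an argument list. This is a convenience sketch: `build_infer_cmd` is a hypothetical helper that is not part of the repository, and it only reuses the flags shown in the Step 3 command. Run the resulting command from the repository root, for instance with `subprocess.run(cmd, check=True)`.

```python
import shlex
import sys


def build_infer_cmd(image_root, out_root, model_path, lora_path,
                    gpu_id=0, splits=("real", "synthetic")):
    """Assemble the Step 3 command as an argument list (hypothetical helper)."""
    return [
        sys.executable, "code/infer_granite_with_lora.py",
        "--image_root", str(image_root),
        "--out_root", str(out_root),
        "--model_path", str(model_path),
        "--lora_path", str(lora_path),
        "--gpu_id", str(gpu_id),
        "--splits", *splits,
    ]


cmd = build_infer_cmd(
    image_root="/path/to/data",
    out_root="/path/to/output",
    model_path="/path/to/granite-vision-4.1-4b",
    lora_path="/path/to/chartlens_lora",
)
# Print a shell-quoted version for inspection before launching anything.
print(shlex.join(cmd))
```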
---
## 📝⭐️ Citation
If you find this project useful for your research, please consider citing:
```bibtex
@article{liu2026chartlens,
title={ChartLens: A Dual-Branch Framework for Chart Data Correction and Factual Summary Refinement},
author={Liu, Hao and Cao, Ruping and Wang, Kun and Li, Zhiran and Liu, Fan and Hu, Yupeng and Nie, Liqiang},
journal={arXiv preprint arXiv:2606.10640},
year={2026}
}
```