---
library_name: peft
license: apache-2.0
base_model: ibm-granite/granite-vision-4.1-4b
pipeline_tag: image-text-to-text
tags:
- pytorch
- lora
- chart-understanding
---

<a id="top"></a>
<div align="center">
  <h1>🚀 ChartLens @ CVPR 2026 DataMFM Chart Understanding Challenge</h1>

  <p>
    <b>Hao Liu</b><sup>1</sup>&nbsp;
    <b>Ruping Cao</b><sup>1</sup>&nbsp;
    <b>Kun Wang</b><sup>1</sup>&nbsp;
    <b>Zhiran Li</b><sup>1</sup>&nbsp;
    <b>Fan Liu</b><sup>2</sup>&nbsp;
    <b>Yupeng Hu</b><sup>1</sup>&nbsp;
    <b>Liqiang Nie</b><sup>3</sup>
  </p>

  <p>
    <sup>1</sup>Shandong University<br>
    <sup>2</sup>Southeast University<br>
    <sup>3</sup>Harbin Institute of Technology (Shenzhen)
  </p>
</div>

This repository hosts the official implementation resources, model weights, and prediction files for **ChartLens**, the winning solution for **DataMFM Challenge Track 2: Chart Understanding** at CVPR 2026.

🔗 **Paper:** [ChartLens: A Dual-Branch Framework for Chart Data Correction and Factual Summary Refinement](https://huggingface.co/papers/2606.10640)  
🔗 **GitHub Repository:** [iLearnLab/CVPRW26-ChartLens](https://github.com/iLearnLab/CVPRW26-ChartLens)  
🔗 **Challenge Page:** [DataMFM Challenge](https://datamfm.github.io/challenge.html)

---

## 📌 Model Information

### 1. Model Name
**ChartLens: A Dual-Branch Framework for Chart Data Correction and Factual Summary Refinement**

### 2. Task Type & Applicable Tasks
- **Task Type:** Chart Understanding / Multimodal Document Understanding
- **Applicable Tasks:** Chart-to-CSV extraction and chart-to-summary generation from chart images.

### 3. Project Introduction
Chart understanding requires models to recover structured chart data and generate faithful natural-language summaries from chart images. **ChartLens** addresses these complementary goals with a dual-branch, verification-guided correction framework.

> 💡 **Method Highlight:** ChartLens combines Granite-Vision-4.1-4B LoRA adaptation with two correction branches: **Structure-Aware CSV Verification and Correction (SAVC)** for reliable table recovery, and **Text-Retention-Guided Summary Refinement (TRSR)** for OCR-assisted factual summary repair. SAVC checks structure, completeness, and numerical accuracy, while TRSR preserves visible chart text such as titles, legends, annotations, sources, and numerical evidence.
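
To make the two kinds of checks concrete, here is a minimal Python sketch. It is not the SAVC or TRSR implementation from the repository; the function names `check_csv` and `missing_text` are hypothetical, and the rule that column 0 holds labels while all other columns hold numbers is an assumption made for illustration. Checking numerical accuracy against the chart itself is out of scope here; the sketch only tests that values parse as numbers.

```python
import csv
import io


def check_csv(text: str) -> list[str]:
    """Return a list of problems found in a predicted chart CSV (empty if none)."""
    rows = list(csv.reader(io.StringIO(text.strip())))
    if len(rows) < 2:
        return ["expected a header row plus at least one data row"]
    problems = []
    width = len(rows[0])
    for i, row in enumerate(rows[1:], start=2):
        # Structure: every data row should have as many cells as the header.
        if len(row) != width:
            problems.append(f"line {i}: {len(row)} cells, header has {width}")
            continue
        # Completeness: no cell should be empty.
        if any(not cell.strip() for cell in row):
            problems.append(f"line {i}: empty cell")
        # Numeric sanity (assumption: column 0 is a label, the rest are numbers).
        for cell in row[1:]:
            if not cell.strip():
                continue
            try:
                float(cell.replace(",", "").rstrip("%"))
            except ValueError:
                problems.append(f"line {i}: non-numeric value {cell!r}")
    return problems


def missing_text(summary: str, visible_text: list[str]) -> list[str]:
    """Return visible chart strings (e.g. OCR output) that the summary omits."""
    lowered = summary.lower()
    return [t for t in visible_text if t.lower() not in lowered]
```

A non-empty result from either function would mark a prediction as a candidate for correction, which is the role the verification step plays in a verification-guided pipeline.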

### 4. Training Data Source
- Released ChartNet-based training data for LoRA adaptation.
- DataMFM Challenge chart understanding splits, including `real` and `synthetic` chart images.

### 5. Challenge Results

| Method | CSV Numeric F1 | CSV Structural Score | Summary ROUGE-L | Summary Numeric Fact F1 | Overall |
|--------|---------------:|---------------------:|----------------:|------------------------:|--------:|
| **ChartLens (Ours)** | **80.62** | **75.66** | **45.57** | **74.55** | **69.10** |

ChartLens ranked **1st** in DataMFM Challenge Track 2.
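
The Overall value in the table equals the unweighted mean of the four metric columns. Whether the challenge officially defines Overall this way is an assumption; the short check below only confirms the arithmetic, and the dictionary keys are illustrative names.

```python
from statistics import mean

# Metric values copied from the results table above.
scores = {
    "csv_numeric_f1": 80.62,
    "csv_structural_score": 75.66,
    "summary_rouge_l": 45.57,
    "summary_numeric_fact_f1": 74.55,
}

# Unweighted mean of the four metrics, rounded to two decimals.
overall = round(mean(scores.values()), 2)
```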

---

## 🚀 Usage & Basic Inference

### Step 1: Prepare the Environment

Clone the GitHub repository and set up the Conda environment:

```bash
git clone https://github.com/iLearnLab/CVPRW26-ChartLens.git
cd CVPRW26-ChartLens
```

```bash
conda create -n chartlens python=3.10 -y
conda activate chartlens
pip install -r requirements.txt
```

### Step 2: Data & Weights Preparation

1. **Challenge Data:** Use the datasets and splits released by the [DataMFM Challenge](https://datamfm.github.io/challenge.html). The chart understanding track contains `real` and `synthetic` splits.
2. **ChartLens Checkpoints:** Download the model weights from this Hugging Face repository.
3. **Granite Vision Backbone:** Download the Granite-Vision-4.1-4B backbone (`ibm-granite/granite-vision-4.1-4b`) and pass its local path via `--model_path` when running inference.
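
Before running inference, it can help to confirm that the downloaded adapter directory is complete. The sketch below assumes the checkpoint follows the standard PEFT layout (`adapter_config.json` plus `adapter_model.safetensors` or `adapter_model.bin`). The actual file layout of this repository is not stated here, and `check_lora_dir` is a hypothetical helper.

```python
from pathlib import Path

# Weight file names written by PEFT's save_pretrained (either one is enough).
WEIGHT_FILES = ("adapter_model.safetensors", "adapter_model.bin")


def check_lora_dir(lora_dir: str) -> list[str]:
    """Return a list of problems with a local LoRA adapter directory."""
    root = Path(lora_dir)
    problems = []
    if not (root / "adapter_config.json").is_file():
        problems.append("missing adapter_config.json")
    if not any((root / name).is_file() for name in WEIGHT_FILES):
        problems.append("missing adapter weights (" + " or ".join(WEIGHT_FILES) + ")")
    return problems
```

An empty list means the directory looks usable as the `--lora_path` argument.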

To prepare ChartNet SFT data for LoRA training:

```bash
python code/load_chartnet_500.py \
  --out_dir Fine-tuning/Dataset/raw \
  --num_samples 500

python code/build_chartnet_sft.py \
  --gt_path Fine-tuning/Dataset/raw/gt.jsonl \
  --image_dir Fine-tuning/Dataset/raw/images \
  --out_dir Fine-tuning/Dataset/sft \
  --csv_repeat 2 \
  --summary_repeat 1
```
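
The flags `--csv_repeat 2` and `--summary_repeat 1` suggest that CSV-extraction examples are weighted twice as heavily as summary examples in the SFT mix. As a rough sketch, assuming each sampled chart yields one CSV example and one summary example and each flag acts as a duplication factor (an assumption; the script in the repository is the authority), the resulting dataset size would be:

```python
def expected_sft_records(num_samples: int, csv_repeat: int, summary_repeat: int) -> int:
    """Estimate the SFT record count under the duplication-factor assumption."""
    # Assumption: one CSV example and one summary example per chart,
    # each copied `*_repeat` times.
    return num_samples * (csv_repeat + summary_repeat)


total = expected_sft_records(num_samples=500, csv_repeat=2, summary_repeat=1)
```

Under that reading, the 500 sampled charts would produce 1,500 training records, two thirds of them CSV-extraction examples.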

### Step 3: Run Granite Vision + LoRA Inference

```bash
python code/infer_granite_with_lora.py \
  --image_root /path/to/data \
  --out_root /path/to/output \
  --model_path /path/to/granite-vision-4.1-4b \
  --lora_path /path/to/chartlens_lora \
  --gpu_id 0 \
  --splits real synthetic
```

Use `code/infer_chartnet_granite.py` for base Granite Vision inference without a LoRA adapter.
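
When the LoRA inference command from Step 3 needs to be launched from Python, for example inside a larger evaluation script, it can be assembled as an argument list. This is a minimal sketch: `build_infer_command` is a hypothetical helper, and it uses only the script name and flags shown above.

```python
import sys


def build_infer_command(image_root, out_root, model_path, lora_path,
                        gpu_id=0, splits=("real", "synthetic")):
    """Build the argv list for code/infer_granite_with_lora.py."""
    return [
        sys.executable, "code/infer_granite_with_lora.py",
        "--image_root", str(image_root),
        "--out_root", str(out_root),
        "--model_path", str(model_path),
        "--lora_path", str(lora_path),
        "--gpu_id", str(gpu_id),
        "--splits", *splits,  # one argument per split name
    ]
```

Passing the returned list to `subprocess.run(cmd, check=True)` from the repository root should be equivalent to the shell command above.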

---

## 📝⭐️ Citation

If you find this project useful for your research, please consider citing:

```bibtex
@article{liu2026chartlens,
  title={ChartLens: A Dual-Branch Framework for Chart Data Correction and Factual Summary Refinement},
  author={Liu, Hao and Cao, Ruping and Wang, Kun and Li, Zhiran and Liu, Fan and Hu, Yupeng and Nie, Liqiang},
  journal={arXiv preprint arXiv:2606.10640},
  year={2026}
}
```