	---
	library_name: peft
	license: apache-2.0
	base_model: ibm-granite/granite-vision-4.1-4b
	pipeline_tag: image-text-to-text
	tags:
	- pytorch
	- lora
	- chart-understanding
	---

	<a id="top"></a>
	<div align="center">
	<h1>🚀 ChartLens @ CVPR 2026 DataMFM Chart Understanding Challenge</h1>

	<p>
	<b>Hao Liu</b><sup>1</sup>
	<b>Ruping Cao</b><sup>1</sup>
	<b>Kun Wang</b><sup>1</sup>
	<b>Zhiran Li</b><sup>1</sup>
	<b>Fan Liu</b><sup>2</sup>
	<b>Yupeng Hu</b><sup>1</sup>
	<b>Liqiang Nie</b><sup>3</sup>
	</p>

	<p>
	<sup>1</sup>Shandong University<br>
	<sup>2</sup>Southeast University<br>
	<sup>3</sup>Harbin Institute of Technology (Shenzhen)
	</p>
	</div>

	This repository provides the official implementation resources, model weights, and prediction files for ChartLens, the winning solution for Track 2 (Chart Understanding) of the DataMFM Challenge at CVPR 2026.

	- 🔗 Paper: [ChartLens: A Dual-Branch Framework for Chart Data Correction and Factual Summary Refinement](https://huggingface.co/papers/2606.10640)
	- 🔗 GitHub Repository: [iLearnLab/CVPRW26-ChartLens](https://github.com/iLearnLab/CVPRW26-ChartLens)
	- 🔗 Challenge Page: [DataMFM Challenge](https://datamfm.github.io/challenge.html)

	---

	## 📌 Model Information

	### 1. Model Name
	ChartLens: A Dual-Branch Framework for Chart Data Correction and Factual Summary Refinement

	### 2. Task Type & Applicable Tasks
	- Task Type: Chart Understanding / Multimodal Document Understanding
	- Applicable Tasks: Chart-to-CSV extraction and chart-to-summary generation from chart images.

	### 3. Project Introduction
	Chart understanding requires models to recover structured chart data and generate faithful natural-language summaries from chart images. ChartLens addresses these complementary goals with a dual-branch, verification-guided correction framework.

	> 💡 Method Highlight: ChartLens combines Granite-Vision-4.1-4B LoRA adaptation with two correction branches: Structure-Aware CSV Verification and Correction (SAVC) for reliable table recovery, and Text-Retention-Guided Summary Refinement (TRSR) for OCR-assisted factual summary repair. SAVC checks structure, completeness, and numerical accuracy, while TRSR preserves visible chart text such as titles, legends, annotations, sources, and numerical evidence.
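
To make the kind of structural check described above more concrete, here is a minimal sketch using only Python's standard library. It is not the ChartLens SAVC implementation: the function name `check_csv_structure` is hypothetical, and the rules it applies (non-blank header cells, a consistent column count, and numeric values in every column after the first) are assumptions chosen for illustration.

```python
import csv
import io


def check_csv_structure(csv_text: str) -> list[str]:
    """Return a list of human-readable problems found in a predicted CSV.

    Hypothetical helper for illustration; not part of the ChartLens codebase.
    """
    # Parse the text and drop completely empty lines.
    rows = [r for r in csv.reader(io.StringIO(csv_text.strip())) if r]
    if not rows:
        return ["empty table"]

    problems = []
    header, body = rows[0], rows[1:]
    if any(not cell.strip() for cell in header):
        problems.append("blank header cell")
    if not body:
        problems.append("no data rows")

    for i, row in enumerate(body, start=1):
        # Structure: every data row should have as many cells as the header.
        if len(row) != len(header):
            problems.append(f"row {i}: expected {len(header)} cells, got {len(row)}")
            continue
        # Assumed layout: the first column is a label, the rest are numeric values.
        for name, cell in zip(header[1:], row[1:]):
            try:
                float(cell.replace(",", "").replace("%", ""))
            except ValueError:
                problems.append(f"row {i}: non-numeric value {cell!r} in column {name!r}")
    return problems
```

For example, `check_csv_structure("Year,Sales\n2020,10\n2021\n")` reports the short second data row, while a well-formed table yields an empty list. A real verifier would also need the chart image to judge completeness and numerical accuracy, which this sketch does not attempt.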

	### 4. Training Data Source
	- Released ChartNet-based training data for LoRA adaptation.
	- DataMFM Challenge chart understanding splits, including `real` and `synthetic` chart images.

	### 5. Challenge Results

	| Method | CSV Numeric F1 | CSV Structural Score | Summary ROUGE-L | Summary Numeric Fact F1 | Overall |
	|--------|---------------:|---------------------:|----------------:|------------------------:|--------:|
	| ChartLens (Ours) | 80.62 | 75.66 | 45.57 | 74.55 | 69.10 |

	ChartLens ranked 1st on DataMFM Challenge Track 2.
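
As a quick sanity check on the reported numbers, the Overall value matches the unweighted mean of the four metric scores. The short calculation below shows this. Equal weighting is an assumption inferred from the numbers in the table; the official scoring formula is defined by the challenge organizers.

```python
# Scores reported in the results table above (percent).
scores = {
    "CSV Numeric F1": 80.62,
    "CSV Structural Score": 75.66,
    "Summary ROUGE-L": 45.57,
    "Summary Numeric Fact F1": 74.55,
}
reported_overall = 69.10

# Assumption: each metric carries equal weight.
mean_score = sum(scores.values()) / len(scores)
print(f"{mean_score:.2f}")  # prints 69.10
```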

	---

	## 🚀 Usage & Basic Inference

	### Step 1: Prepare the Environment

	Clone the GitHub repository and set up the Conda environment:

	```bash
	git clone https://github.com/iLearnLab/CVPRW26-ChartLens.git
	cd CVPRW26-ChartLens
	```

	```bash
	conda create -n chartlens python=3.10 -y
	conda activate chartlens
	pip install -r requirements.txt
	```

	### Step 2: Data & Weights Preparation

	1. Challenge Data: Use the datasets and splits released by the [DataMFM Challenge](https://datamfm.github.io/challenge.html). The chart understanding track contains `real` and `synthetic` splits.
	2. ChartLens Checkpoints: Download the model weights from this Hugging Face repository.
	3. Granite Vision Backbone: Download the Granite-Vision-4.1-4B backbone (`ibm-granite/granite-vision-4.1-4b`) and pass its local path via `--model_path` when running inference.
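
Before launching inference, it can help to confirm that the downloaded adapter directory is complete. The sketch below assumes the checkpoint follows PEFT's default save layout (`adapter_config.json` plus `adapter_model.safetensors`, or `adapter_model.bin` for older checkpoints); this card does not list the files explicitly, so adjust the names if the repository differs. The helper `missing_adapter_files` is hypothetical and not part of the ChartLens code.

```python
from pathlib import Path


def missing_adapter_files(lora_dir: str) -> list[str]:
    """List expected PEFT adapter files that are absent from lora_dir.

    Hypothetical helper; file names follow PEFT's default save layout.
    """
    root = Path(lora_dir)
    missing = []
    if not (root / "adapter_config.json").is_file():
        missing.append("adapter_config.json")
    # PEFT writes safetensors by default; older checkpoints may use a .bin file.
    weight_names = ("adapter_model.safetensors", "adapter_model.bin")
    if not any((root / name).is_file() for name in weight_names):
        missing.append("adapter_model.safetensors (or adapter_model.bin)")
    return missing
```

An empty return value means both the config and a weight file were found, so the directory can be passed to `--lora_path`.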

	To prepare ChartNet SFT data for LoRA training:

	```bash
	python code/load_chartnet_500.py \
	    --out_dir Fine-tuning/Dataset/raw \
	    --num_samples 500

	python code/build_chartnet_sft.py \
	    --gt_path Fine-tuning/Dataset/raw/gt.jsonl \
	    --image_dir Fine-tuning/Dataset/raw/images \
	    --out_dir Fine-tuning/Dataset/sft \
	    --csv_repeat 2 \
	    --summary_repeat 1
	```

	### Step 3: Run Granite Vision + LoRA Inference

	```bash
	python code/infer_granite_with_lora.py \
	    --image_root /path/to/data \
	    --out_root /path/to/output \
	    --model_path /path/to/granite-vision-4.1-4b \
	    --lora_path /path/to/chartlens_lora \
	    --gpu_id 0 \
	    --splits real synthetic
	```
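
If you prefer to drive the script from Python, for instance to keep all paths in one place, the same command can be assembled as an argument list. This is a minimal sketch: the wrapper `build_inference_command` is hypothetical, it uses only the flags shown above, and it assumes the command is run from the repository root.

```python
import shlex


def build_inference_command(image_root, out_root, model_path, lora_path,
                            gpu_id=0, splits=("real", "synthetic")):
    """Assemble the argument list for code/infer_granite_with_lora.py.

    Hypothetical wrapper; it only mirrors the flags documented in this card.
    """
    return [
        "python", "code/infer_granite_with_lora.py",
        "--image_root", str(image_root),
        "--out_root", str(out_root),
        "--model_path", str(model_path),
        "--lora_path", str(lora_path),
        "--gpu_id", str(gpu_id),
        "--splits", *splits,
    ]


cmd = build_inference_command(
    "/path/to/data",
    "/path/to/output",
    "/path/to/granite-vision-4.1-4b",
    "/path/to/chartlens_lora",
)
# Show the shell-quoted command; launch it with subprocess.run(cmd, check=True).
print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell-quoting problems when paths contain spaces.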

	Use `code/infer_chartnet_granite.py` for base Granite Vision inference without a LoRA adapter.

	---

	## 📝⭐️ Citation

	If you find this project useful for your research, please consider citing:

	```bibtex
	@article{liu2026chartlens,
	title={ChartLens: A Dual-Branch Framework for Chart Data Correction and Factual Summary Refinement},
	author={Liu, Hao and Cao, Ruping and Wang, Kun and Li, Zhiran and Liu, Fan and Hu, Yupeng and Nie, Liqiang},
	journal={arXiv preprint arXiv:2606.10640},
	year={2026}
	}
	```