# Qwen3-VL-2B-ChartQA-SFT
Fine-tuned Qwen3-VL-2B-Thinking on the ChartQA dataset for advanced chart understanding, data extraction, and visual reasoning.
## Model Description
This model is a specialized version of Qwen3-VL-2B-Thinking, fine-tuned using LoRA (Low-Rank Adaptation). It is designed to perform "System 2" thinking, generating step-by-step reasoning traces before providing the final answer, which makes its chart-interpretation process transparent and verifiable.
## Key Features
- 🧠 Explicit Reasoning: Uses `<think></think>` tags to articulate the visual analysis process (e.g., identifying axes, comparing bars) before concluding.
- 📊 Chart Specialization: Optimized for the bar charts, line graphs, and pie charts found in the ChartQA dataset.
- ⚡ Efficient: Fine-tuned with LoRA (~2% trainable parameters) while retaining the base model's general capabilities.
- 🎯 Structured Output: Trained to return answers in a strict JSON format (`{"answer": "value"}`) for easy programmatic parsing.
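Because the model emits a reasoning trace followed by a JSON object, downstream code needs to separate the two. A minimal sketch of such a parser (the helper name and the sample response string are illustrative, not part of the model card):

```python
import json
import re

def parse_model_output(text: str) -> tuple[str, str]:
    """Split a response into its <think> reasoning trace and the final JSON answer."""
    # Pull out the chain-of-thought between <think> tags, if present.
    match = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    reasoning = match.group(1).strip() if match else ""
    # The final answer follows the closing tag as a JSON object: {"answer": "value"}.
    remainder = text[match.end():] if match else text
    answer = json.loads(remainder.strip())["answer"]
    return reasoning, answer

# Example response in the format described above (illustrative):
raw = '<think>The tallest bar corresponds to 2021, at 42 units.</think>{"answer": "42"}'
reasoning, answer = parse_model_output(raw)
```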
## Training Details
### Base Model
- Model: Qwen/Qwen3-VL-2B-Thinking
- Parameters: 2B
- Architecture: Vision-Language Model with built-in Chain-of-Thought (CoT) capability.
### Fine-tuning Configuration
- Method: LoRA (Low-Rank Adaptation)
- Rank (r): 16
- Alpha: 32
- Target Modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`
- Trainable Parameters: ~2%
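The configuration above maps directly onto a `LoraConfig` from the `peft` library. A sketch, assuming `peft` is installed; the dropout value is an assumption, as the card does not state it:

```python
from peft import LoraConfig

# Mirrors the fine-tuning configuration above; the target module names
# follow the attention projection layout of the Qwen model family.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,  # assumption: not stated in the model card
    task_type="CAUSAL_LM",
)
```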
### Training Hyperparameters
- Epochs: 1
- Batch Size: 1 (with Gradient Accumulation = 8)
- Learning Rate: 2e-5
- Optimizer: AdamW
- Scheduler: Cosine
- Precision: FP16
- Max Sequence Length: 2048
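These hyperparameters can be expressed as a `transformers.TrainingArguments` fragment. A sketch; the output directory is a placeholder, not a path from the card:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen3-vl-chartqa-sft",   # assumption: output path not given in the card
    num_train_epochs=1,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,       # effective batch size of 8
    learning_rate=2e-5,
    optim="adamw_torch",                 # AdamW optimizer
    lr_scheduler_type="cosine",
    fp16=True,
)
```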
## Usage
### Installation
```shell
pip install git+https://github.com/huggingface/transformers  # latest for Qwen3-VL support
pip install torch pillow
```
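A minimal inference sketch. It assumes the repository id `Nhaass/Qwen3-VL-2B-ChartQA` and uses the generic `AutoModelForImageTextToText` / `AutoProcessor` classes with the processor's chat template; check the model card files for the exact class the checkpoint expects:

```python
MODEL_ID = "Nhaass/Qwen3-VL-2B-ChartQA"

def build_messages(image_path: str, question: str) -> list[dict]:
    """Build a chat-template message list pairing a chart image with a question."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image_path},
                {"type": "text", "text": question},
            ],
        }
    ]

def ask(image_path: str, question: str) -> str:
    """Run one chart question through the model and return the decoded response."""
    # Heavy imports deferred so the helper above stays dependency-free.
    from transformers import AutoModelForImageTextToText, AutoProcessor

    processor = AutoProcessor.from_pretrained(MODEL_ID)
    model = AutoModelForImageTextToText.from_pretrained(MODEL_ID, device_map="auto")
    inputs = processor.apply_chat_template(
        build_messages(image_path, question),
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=512)
    # Decode only the newly generated tokens, skipping the prompt.
    return processor.batch_decode(
        output_ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )[0]
```

The response will contain a `<think>...</think>` trace followed by the `{"answer": "value"}` JSON described under Key Features.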