# Qwen3-VL-2B-ChartQA-SFT
Fine-tuned Qwen3-VL-2B-Thinking on the ChartQA dataset for advanced chart understanding, data extraction, and visual reasoning.
## Model Description
This model is a specialized version of Qwen3-VL-2B-Thinking, fine-tuned using LoRA (Low-Rank Adaptation). It is designed to perform "System 2" thinking, generating step-by-step reasoning traces before providing the final answer, which makes its chart-interpretation process transparent and verifiable.
## Key Features
- 🧠 Explicit Reasoning: Uses `<think></think>` tags to articulate the visual analysis process (e.g., identifying axes, comparing bars) before concluding.
- 📊 Chart Specialization: Optimized for the bar charts, line graphs, and pie charts found in the ChartQA dataset.
- ⚡ Efficient: Fine-tuned with LoRA (~2% trainable parameters) while retaining the base model's general capabilities.
- 🎯 Structured Output: Trained to return answers in a strict JSON format (`{"answer": "value"}`) for easy programmatic parsing.
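Because the model emits a reasoning trace followed by a JSON object, downstream code needs to separate the two. A minimal sketch of such a parser (the helper name and the sample response string are illustrative, not part of the model card):

```python
import json
import re

def parse_model_output(text: str) -> tuple[str, str]:
    """Split a response into its <think> reasoning trace and the final JSON answer."""
    # Pull out the chain-of-thought between <think> tags, if present.
    match = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    reasoning = match.group(1).strip() if match else ""
    # The final answer follows the closing tag as a JSON object: {"answer": "value"}.
    remainder = text[match.end():] if match else text
    answer = json.loads(remainder.strip())["answer"]
    return reasoning, answer

# Example response in the format described above (illustrative):
raw = '<think>The tallest bar corresponds to 2021, at 42 units.</think>{"answer": "42"}'
reasoning, answer = parse_model_output(raw)
```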
## Training Details
### Base Model
- Model: Qwen/Qwen3-VL-2B-Thinking
- Parameters: 2B
- Architecture: Vision-Language Model with built-in Chain-of-Thought (CoT) capability.
### Fine-tuning Configuration
- Method: LoRA (Low-Rank Adaptation)
- Rank (r): 16
- Alpha: 32
- Target Modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`
- Trainable Parameters: ~2%
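The configuration above maps directly onto a `LoraConfig` from the `peft` library. A sketch, assuming `peft` is installed; the dropout value is an assumption, as the card does not state it:

```python
from peft import LoraConfig

# Mirrors the fine-tuning configuration above; the target module names
# follow the attention projection layout of the Qwen model family.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,  # assumption: not stated in the model card
    task_type="CAUSAL_LM",
)
```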
### Training Hyperparameters
- Epochs: 1
- Batch Size: 1 (with Gradient Accumulation = 8)
- Learning Rate: 2e-5
- Optimizer: AdamW
- Scheduler: Cosine
- Precision: FP16
- Max Sequence Length: 2048
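These hyperparameters can be expressed as a `transformers.TrainingArguments` fragment. A sketch; the output directory is a placeholder, not a path from the card:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen3-vl-chartqa-sft",   # assumption: output path not given in the card
    num_train_epochs=1,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,       # effective batch size of 8
    learning_rate=2e-5,
    optim="adamw_torch",                 # AdamW optimizer
    lr_scheduler_type="cosine",
    fp16=True,
)
```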
## Usage
### Installation
```shell
pip install git+https://github.com/huggingface/transformers  # latest for Qwen3-VL support
pip install torch pillow
```
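A minimal inference sketch. It assumes the repository id `Nhaass/Qwen3-VL-2B-ChartQA` and uses the generic `AutoModelForImageTextToText` / `AutoProcessor` classes with the processor's chat template; check the model card files for the exact class the checkpoint expects:

```python
MODEL_ID = "Nhaass/Qwen3-VL-2B-ChartQA"

def build_messages(image_path: str, question: str) -> list[dict]:
    """Build a chat-template message list pairing a chart image with a question."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image_path},
                {"type": "text", "text": question},
            ],
        }
    ]

def ask(image_path: str, question: str) -> str:
    """Run one chart question through the model and return the decoded response."""
    # Heavy imports deferred so the helper above stays dependency-free.
    from transformers import AutoModelForImageTextToText, AutoProcessor

    processor = AutoProcessor.from_pretrained(MODEL_ID)
    model = AutoModelForImageTextToText.from_pretrained(MODEL_ID, device_map="auto")
    inputs = processor.apply_chat_template(
        build_messages(image_path, question),
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=512)
    # Decode only the newly generated tokens, skipping the prompt.
    return processor.batch_decode(
        output_ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )[0]
```

The response will contain a `<think>...</think>` trace followed by the `{"answer": "value"}` JSON described under Key Features.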