Qwen2.5-VL-3B-ScienceQA-Math-CoT

A LoRA fine-tuned version of Qwen2.5-VL-3B-Instruct for mathematical reasoning and multimodal question answering using the ScienceQA-Math-CoT dataset.

The model is designed to solve image-based and text-based mathematics problems through step-by-step reasoning while improving instruction-following capabilities in both Turkish and English.

Model Description

This model was fine-tuned from Qwen2.5-VL-3B-Instruct using Parameter-Efficient Fine-Tuning (LoRA).

The training objective focused on:

  • Mathematical reasoning
  • Chain-of-thought style explanations
  • Visual question answering
  • ScienceQA mathematical problems
  • Multimodal image-text understanding
  • Turkish and English instruction following

Base Model

  • Qwen2.5-VL-3B-Instruct

Fine-Tuning Method

  • QLoRA (4-bit)
  • LoRA adapters
  • PEFT
  • Hugging Face Transformers

Training Dataset

ScienceQA-Math-CoT

The dataset contains:

  • ScienceQA mathematical questions
  • Associated images
  • Step-by-step solutions
  • Final answers

The model was trained to generate reasoning traces before producing final answers.

Intended Uses

Suitable Uses

  • Educational assistants
  • Mathematical tutoring
  • Visual mathematical reasoning
  • STEM learning applications
  • Homework support
  • Science question answering
  • Turkish and English multimodal assistants

Out-of-Scope Uses

This model is not intended for:

  • Medical diagnosis
  • Legal advice
  • Financial decision-making
  • Safety-critical systems
  • Autonomous decision-making

Training Details

Hardware

  • Kaggle Dual NVIDIA T4 GPUs

Training Configuration

Parameter Value
Base Model Qwen2.5-VL-3B-Instruct
Fine-Tuning LoRA
Quantization 4-bit NF4
Precision FP16
LoRA Rank 16
LoRA Alpha 32
Learning Rate 2e-4
Batch Size 1
Gradient Accumulation 4
Epochs 1

Example Usage

from transformers import Qwen2_5_VLForConditionalGeneration
from transformers import AutoProcessor
from peft import PeftModel

base_model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2.5-VL-3B-Instruct",
    device_map="auto"
)

model = PeftModel.from_pretrained(
    base_model,
    "salihfurkaan/Qwen2.5-VL-3B-ScienceQA-Math-CoT-Adapter"
)

processor = AutoProcessor.from_pretrained(
    "Qwen/Qwen2.5-VL-3B-Instruct"
)

Limitations

  • The model may still produce incorrect mathematical reasoning.
  • Chain-of-thought outputs do not guarantee correctness.
  • Performance depends on image quality and clarity.
  • The model has not been evaluated on all mathematical domains.
  • The model may hallucinate intermediate reasoning steps.

Ethical Considerations

  • This model is intended for educational and research purposes.
  • Users should independently verify mathematical solutions before relying on them in academic, professional, or real-world settings.

Citation

@misc{qwen2vl_scienceqa_math_cot, title={Qwen2.5-VL-3B-ScienceQA-Math-CoT}, author={Salih Furkan Erik, Kerem Berke Başak}, year={2026}, publisher={Hugging Face}, howpublished={\url{https://huggingface.co/}} }

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for salihfurkaan/Qwen2.5-VL-ScienceQA-Math-CoT-Adapter

Adapter
(197)
this model