fremko
/

Qwen3-VL-8B-Thinking-norm

Model card Files Files and versions

Qwen3-VL-8B-Thinking - Gemini 3 Distill scale 6

Proper grad norm and alpha. Fixed template

Fine-tuned on 1k dataset distilled from Gemini 3.

Base Model

unsloth/Qwen3-VL-8B-Thinking

Training

Dataset: 1k Gemini 3 distillation samples

Downloads last month: 2

Safetensors

Model size

9B params

Tensor type

BF16

·

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for fremko/Qwen3-VL-8B-Thinking-norm

Base model

Qwen/Qwen3-VL-8B-Thinking

Finetuned

unsloth/Qwen3-VL-8B-Thinking

Finetuned

(7)

this model

Quantizations