# Qwen3-4B Instruct No-Think V2.1

Finetuned from Unsloth/Qwen3-4B-Instruct-2507 using QLoRA + Unsloth. Trained to respond directly, without chain-of-thought (`<think>` blocks were stripped from all training data).
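Stripping `<think>` blocks comes down to a regex pass over each assistant turn. A minimal sketch, assuming the blocks are well-formed and non-nested (the function name is illustrative, not taken from the actual training script):

```python
import re

# Matches one <think>...</think> block plus any whitespace that follows it.
THINK_RE = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)

def strip_think(text: str) -> str:
    """Remove chain-of-thought blocks, leaving only the direct answer."""
    return THINK_RE.sub("", text).strip()
```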

## What's New in V2.1: The Best I Can Afford on a Free Kaggle GPU

V2.1 expands the training data to 7 datasets covering Gemini-, Claude-, and GPT-sourced reasoning traces. Training ran in Kaggle's dual-T4 GPU environment, though only one GPU was used. After MinHash-based similarity filtering at a 0.4 Jaccard threshold, the deduplicated dataset contains ~4,752 examples.

## Training Data

| Dataset | Source Model |
|---|---|
| TeichAI/gemini-3-pro-preview-high-reasoning-250x | Gemini |
| TeichAI/gemini-3-pro-preview-high-reasoning-1000x | Gemini |
| Roman1111111/gemini-3.1-pro-hard-high-reasoning | Gemini |
| crownelius/Opus-4.6-Reasoning-3300x | Claude |
| TeichAI/claude-4.5-opus-high-reasoning-250x | Claude |
| TeichAI/claude-sonnet-4.5-high-reasoning-250x | Claude |
| TeichAI/gpt-5.2-high-reasoning-250x | GPT |

~7,322 rows loaded; ~4,752 remain after MinHash deduplication (Jaccard threshold 0.4). `<think>` blocks were stripped from all assistant turns before training.
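The deduplication step can be approximated with plain MinHash signatures over word shingles. A self-contained sketch, where the 0.4 threshold matches the card and everything else (shingle size, number of permutations, the greedy keep-first policy) is an illustrative assumption:

```python
import hashlib

def shingles(text: str, n: int = 3) -> set:
    """Word n-gram shingles of a document."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(max(1, len(words) - n + 1))}

def minhash(sh: set, num_perm: int = 64) -> list:
    """One minimum per seeded hash function forms the signature."""
    return [
        min(int(hashlib.md5(f"{seed}:{s}".encode()).hexdigest(), 16) for s in sh)
        for seed in range(num_perm)
    ]

def est_jaccard(a: list, b: list) -> float:
    """Fraction of matching signature slots estimates Jaccard similarity."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def dedup(docs: list, threshold: float = 0.4) -> list:
    """Greedy filter: keep a doc only if it is below the similarity
    threshold against every already-kept doc."""
    kept, sigs = [], []
    for d in docs:
        sig = minhash(shingles(d))
        if all(est_jaccard(sig, s) < threshold for s in sigs):
            kept.append(d)
            sigs.append(sig)
    return kept
```

At scale you would bucket signatures with locality-sensitive hashing instead of the all-pairs comparison shown here.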

## Training Details

| Parameter | Value |
|---|---|
| Method | QLoRA (4-bit NF4) + Unsloth |
| LoRA rank | 16 |
| LoRA alpha | 16 |
| Epochs | 1 |
| Steps | 594 |
| Batch size | 2 per device × 4 gradient accumulation = 8 effective |
| Learning rate | 1e-4 (cosine schedule) |
| Max seq length | 2048 |
| Optimizer | AdamW 8-bit |
| Hardware | Kaggle Tesla T4 (14.56 GB VRAM) |
| Training time | ~7 h 23 min |
| Trainable params | 33M / 4.05B (0.81%) |
| Peak VRAM | ~10.2 GB |

Training used `train_on_responses_only` — loss was computed on assistant completions only.
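Response-only training works by masking the labels of everything outside the assistant spans, so prompt tokens contribute no loss. Unsloth's `train_on_responses_only` does this at the token level using the chat template's role markers; the pure-Python sketch below shows just the masking idea, using -100, the ignore index of PyTorch's cross-entropy loss:

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips

def mask_non_responses(input_ids, assistant_mask):
    """Copy input_ids to labels, but blank out every token that is not
    part of an assistant completion so it contributes no loss."""
    return [tok if keep else IGNORE_INDEX
            for tok, keep in zip(input_ids, assistant_mask)]
```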

## Files

- `*.gguf` — IQ4_XS quantized, ready for LM Studio / Ollama / llama.cpp
- `lora-adapter/` — raw LoRA weights for merging with the base model
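Merging the adapter folds the low-rank update back into the base weights: W' = W + (α/r)·B·A. A toy numpy sketch of the arithmetic, where rank and alpha mirror the table above but the layer shapes and random weights are purely illustrative (in practice you would use PEFT's `merge_and_unload` on the real checkpoint):

```python
import numpy as np

rank, alpha = 16, 16            # LoRA rank and alpha from the training table
d_out, d_in = 64, 32            # toy layer dimensions

rng = np.random.default_rng(0)
W = rng.normal(size=(d_out, d_in))   # frozen base weight
A = rng.normal(size=(rank, d_in))    # LoRA "down" projection
B = rng.normal(size=(d_out, rank))   # LoRA "up" projection (nonzero after training)

# The merge is a single matrix addition; the adapter disappears afterwards.
W_merged = W + (alpha / rank) * (B @ A)
```

With α = r the scaling factor is 1, so here the merge reduces to W + B·A.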

## Usage (Ollama)

```bash
ollama run VladHong/Qwen3-4B-Instruct-NoThink-V2.1
```

## License Note

The base model is Apache 2.0. The training data includes AI-generated content — review upstream dataset terms before commercial use.
