Tags: Safetensors · GGUF · English · qwen3 · qlora · reasoning · unsloth · conversational

Qwen3-4B Instruct No-Think

DEPRECATED: Please use VladHong/Qwen3-4B-Instruct-NoThink-V2.1

Finetuned from Qwen/Qwen3-4B-Instruct-2507 using QLoRA with Unsloth. Trained to respond directly, without chain-of-thought: `<think>` blocks were stripped from the training data.

Training Data

  • TeichAI/gemini-3-pro-preview-high-reasoning-250x
  • TeichAI/claude-haiku-4.5-high-reasoning-1700x
  • TeichAI/gemini-3-pro-preview-high-reasoning-1000x

About 18k rows remain after deduplication; `<think>` blocks were stripped before training.
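The preprocessing described above can be sketched as follows. This is a minimal illustration, not the actual training script; the function names and the exact-match deduplication strategy are assumptions:

```python
import re

# Matches a <think>...</think> reasoning block plus any trailing whitespace.
THINK_RE = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)

def strip_think(text: str) -> str:
    """Remove <think>...</think> blocks, keeping only the final answer."""
    return THINK_RE.sub("", text).strip()

def dedup(rows: list[str]) -> list[str]:
    """Drop exact-duplicate rows while preserving order."""
    seen: set[str] = set()
    out: list[str] = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            out.append(row)
    return out

rows = [
    "<think>Let me reason...</think>The answer is 4.",
    "<think>Let me reason...</think>The answer is 4.",
    "Plain reply with no reasoning.",
]
cleaned = dedup([strip_think(r) for r in rows])
# cleaned == ["The answer is 4.", "Plain reply with no reasoning."]
```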

Training Details

  • Method: QLoRA (4-bit NF4) + Unsloth
  • LoRA rank: 16
  • Epochs: 1
  • Hardware: RTX 4050 Laptop 6GB

Files

  • *.gguf — IQ4_XS quantized, use in LM Studio / Ollama / llama.cpp
  • lora-adapter/ — Raw LoRA weights for merging with base model

Usage (Ollama)

ollama run VladHong/Qwen3-4B-Instruct-NoThink
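Beyond the CLI, Ollama also serves a local HTTP API. A minimal sketch of calling the model programmatically, assuming a default local install on port 11434 (the helper function is hypothetical, not part of this repo):

```python
import json
import urllib.request

def build_chat_request(model: str, prompt: str) -> dict:
    """Build a request body for Ollama's /api/chat endpoint (non-streaming)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

payload = build_chat_request("VladHong/Qwen3-4B-Instruct-NoThink", "What is 2+2?")

# With a local Ollama server running, send it like so:
# req = urllib.request.Request(
#     "http://localhost:11434/api/chat",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# reply = json.loads(urllib.request.urlopen(req).read())["message"]["content"]
```

Since the model was trained without `<think>` blocks, the reply should contain the answer directly, with no reasoning preamble to strip.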

License Note

Base model is Apache 2.0. Training data includes AI-generated content — review upstream dataset terms before commercial use.
