Qwen3-1.7B-alpaca-cleaned - GGUF Format

GGUF-format quantizations of Qwen3-1.7B fine-tuned on yahma/alpaca-cleaned, for use with llama.cpp and Ollama.

Model Details

  • Architecture: qwen3
  • Parameters: 1.7B

Related Models

Training Details

  • LoRA Rank: 16
  • Training Steps: 599
  • Training Loss: 1.3403
  • Max Seq Length: 4096
  • Training Mode: Full training

For complete training configuration, see the LoRA adapters repository/directory.

Usage

Available Quantizations

Quantization  File                Size     Quality
F16           model.F16.gguf      3.21 GB  Full precision (largest)
Q4_K_M        model.Q4_K_M.gguf   1.03 GB  Good balance (recommended)
Q6_K          model.Q6_K.gguf     1.32 GB  High quality
Q8_0          model.Q8_0.gguf     1.71 GB  Very high quality, near original
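
To fetch a single quantization instead of cloning the whole repository, the Hugging Face CLI can download one file at a time. A minimal sketch, assuming the repository id from this card and that the `huggingface_hub` package is installed:

```shell
# Sketch: download only the Q4_K_M quantization
# (assumes: pip install -U huggingface_hub)
huggingface-cli download fs90/Qwen3-1.7B-alpaca-cleaned-GGUF \
  model.Q4_K_M.gguf --local-dir ./gguf
```

The `--local-dir` flag places the file in a plain directory rather than the shared cache, which is convenient when pointing Ollama or llama.cpp at it.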

With Ollama

# Create Modelfile with proper chat template (using F16 as example)
cat > Modelfile <<'EOF'
FROM ./outputs/Qwen3-1.7B-alpaca-cleaned/gguf/model.F16.gguf

TEMPLATE """<|im_start|>system
You are a helpful AI assistant.<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""

PARAMETER stop "<|im_start|>"
PARAMETER stop "<|im_end|>"
PARAMETER temperature 0.7
PARAMETER top_p 0.9
EOF

# Create and run model
ollama create qwen3-1.7b-alpaca-cleaned -f Modelfile
ollama run qwen3-1.7b-alpaca-cleaned "What is machine learning?"
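
Once created, the model can also be queried over Ollama's local REST API (port 11434 by default). A sketch, assuming `ollama serve` is running and the model above was created:

```shell
# Sketch: one-shot generation via Ollama's HTTP API
curl http://localhost:11434/api/generate -d '{
  "model": "qwen3-1.7b-alpaca-cleaned",
  "prompt": "What is machine learning?",
  "stream": false
}'
```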

With llama.cpp

# Run directly (using F16 as example)
llama-cli -m ./outputs/Qwen3-1.7B-alpaca-cleaned/gguf/model.F16.gguf -p "Hello!"
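
llama.cpp also ships an OpenAI-compatible HTTP server. A hedged sketch of serving the same file and querying it (the port is illustrative):

```shell
# Sketch: serve the GGUF file with llama.cpp's built-in server
llama-server -m ./outputs/Qwen3-1.7B-alpaca-cleaned/gguf/model.F16.gguf --port 8080

# In another shell: query the OpenAI-compatible chat endpoint
curl http://localhost:8080/v1/chat/completions -d '{
  "messages": [{"role": "user", "content": "Hello!"}]
}'
```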

License

Based on unsloth/Qwen3-1.7B-unsloth-bnb-4bit and trained on yahma/alpaca-cleaned. Please refer to the original model and dataset licenses.

Framework Versions

  • Unsloth: 2025.11.3
  • Transformers: 4.57.1
  • PyTorch: 2.9.0+cu128

Generated: 2025-11-22 10:35:01
