---
license: apache-2.0
base_model: Qwen/Qwen3-0.6B
tags:
- quantization
- neural-compressor
- qat
- quantization-aware-training
- qwen3
library_name: transformers
pipeline_tag: text-generation
---

# Qwen3-0.6B Quantized with QAT

This model is a quantized version of `Qwen/Qwen3-0.6B` produced with **Quantization-Aware Training (QAT)** using Intel Neural Compressor.

## 🚀 Model Details

- **Base Model**: Qwen/Qwen3-0.6B
- **Quantization Method**: Quantization-Aware Training (QAT)
- **Framework**: Intel Neural Compressor
- **Model Size**: Significantly smaller than the FP32 original
- **Performance**: Retains most of the base model's quality while improving efficiency

## 📊 Benefits

✅ **Smaller model size** - Reduced storage requirements
✅ **Faster inference** - Optimized for deployment
✅ **Lower memory usage** - More efficient resource utilization
✅ **Maintained quality** - QAT preserves model performance

## 💻 Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the quantized model
model = AutoModelForCausalLM.from_pretrained("Thomaschtl/qwen3-0.6b-qat-test")
tokenizer = AutoTokenizer.from_pretrained("Thomaschtl/qwen3-0.6b-qat-test")

# Generate text
prompt = "The future of AI is"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## ⚙️ Quantization Details

- **Training Method**: Quantization-Aware Training
- **Optimizer**: AdamW
- **Learning Rate**: 5e-5
- **Batch Size**: 2
- **Epochs**: 1 (demo configuration)

## 🔧 Technical Info

This model was quantized using Intel Neural Compressor's QAT approach, which:

1. Simulates quantization during training
2. Allows model weights to adapt to quantization
3. Maintains better accuracy than post-training quantization

## 📝 Citation

If you use this model, please cite:

```
@misc{qwen3-qat,
  title={Qwen3-0.6B Quantized with QAT},
  author={Thomaschtl},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/Thomaschtl/qwen3-0.6b-qat-test}
}
```

## ⚖️ License

This model is released under the same license as the base model (Apache 2.0).
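## 📐 Appendix: Why Quantization Shrinks the Model

The storage saving behind the "smaller model size" claim above is simple arithmetic: INT8 weights take one byte per parameter instead of FP32's four. The sketch below is a back-of-the-envelope estimate only; the actual on-disk size depends on which tensors are quantized and on the checkpoint format, and the helper name `weight_storage_gb` is hypothetical, not part of any library.

```python
def weight_storage_gb(num_params: float, bits_per_param: int) -> float:
    # Bytes needed to store the weights alone, expressed in gigabytes.
    return num_params * bits_per_param / 8 / 1e9


fp32_gb = weight_storage_gb(0.6e9, 32)  # ~2.4 GB for the 0.6B-parameter FP32 base model
int8_gb = weight_storage_gb(0.6e9, 8)   # ~0.6 GB if all weights are stored as INT8
print(f"FP32: {fp32_gb:.1f} GB, INT8: {int8_gb:.1f} GB ({fp32_gb / int8_gb:.0f}x smaller)")
```

In practice some tensors (embeddings, norms) are often kept in higher precision, so the realized saving is somewhat less than the ideal 4x.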
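## 🧪 Appendix: What "Simulating Quantization" Means

The QAT mechanism described in the Technical Info section — simulating quantization in the forward pass so the weights can adapt to the rounding error — can be illustrated in plain Python. This is a minimal sketch of symmetric per-tensor "fake quantization", not Intel Neural Compressor's actual implementation; the function name `fake_quantize` is hypothetical.

```python
def fake_quantize(values, num_bits=8):
    """Round values to a signed integer grid, then map back to floats.

    The output stays in floating point, but it carries the same rounding
    error real INT8 inference would introduce, so a training loop that
    runs its forward pass through this function lets the weights adapt.
    """
    qmax = 2 ** (num_bits - 1) - 1                  # 127 for 8 bits
    scale = max(abs(v) for v in values) / qmax or 1.0  # avoid divide-by-zero on all-zero input
    return [round(v / scale) * scale for v in values]
```

During backpropagation, QAT frameworks typically treat the non-differentiable `round` as an identity (the straight-through estimator) so gradients still flow to the underlying weights.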