language: - en license: apache-2.0 library_name: transformers tags: - quantization - optimum-quanto - qwen - int8 - cpu pipeline_tag: text-generation base_model: Qwen/Qwen2.5-0.5B-Instruct

Model Quantisation (Qwen2.5-0.5B, CPU Only)

This technique is based on Post-Training Quantization (PTQ) using Qwen2.5-0.5B-Instruct...


language: - en license: apache-2.0 library_name: transformers pipeline_tag: text-generation base_model: Qwen/Qwen2.5-0.5B-Instruct tags: - quantization - int8 - cpu - optimum-quanto - transformers model_type: qwen2

Model Quantisation (Qwen2.5-0.5B, CPU Only)

This Technique based on the PQT Post Quantization Training A model quantisation using Qwen2.5-0.5B-Instruct. The project shows how to quantise a model with Optimum Quanto and run it locally on a CPU. where a 32 bit Modal into 8 bit

Files

File Description
01_model_quantisation_guide.ipynb Notebook explaining quantisation concepts and examples
python quant_qwen.py Downloads and quantises the model, then saves it locally
run_quantized.py Loads the quantised model and generates responses
requirements.txt Project dependencies

Setup

Install dependencies:

pip install -r requirements.txt

Usage

1. Quantise the model

HuggingFace Token Need
$env:HF_TOKEN = "Your TOKEN"; python quant_qwen.py 
python quant_qwen.py 

This creates:

qwen-int8/

2. Run the quantised model

 python .\v1_quant_qwen.py      

Please enter your question: hi

INPUT : hi OUTPUT: Hello! How can I assist you today? Please let me know if there's anything specific you'd like to talk about or any questions you have. I'm here to help answer your queries.

3. Open the notebook

jupyter notebook 01_model_quantisation_guide.ipynb

What is Quantisation?

Quantisation reduces the precision of model weights (for example, FP32 → INT8).

Benefits:

  • Smaller model size
  • Lower memory usage
  • Faster inference
  • CPU-friendly deployment

Requirements

  • Python 3.10+
  • CPU-only machine supported
  • No NVIDIA GPU required
Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for Helllbos/Qwen_Quantised3.50.5b

Finetuned
(856)
this model