|
|
--- |
|
|
library_name: transformers
license: apache-2.0
base_model: Qwen/Qwen3-0.6B-Base
|
|
tags: [quantization, qwen3, qlora, causal-lm, low-rank-adapters, 4bit, bitsandbytes, peft, efficient-finetuning] |
|
|
--- |
|
|
|
|
|
# Qwen3-0.6B Quantized with QLoRA for Reasoning Tasks |
|
|
|
|
|
This is a 4-bit quantized version of `Qwen/Qwen3-0.6B-Base`, fine-tuned with LoRA adapters on multiple MCQA-style reasoning datasets. Training used QLoRA, a parameter-efficient fine-tuning method that combines 4-bit quantization with low-rank adapters for a small memory footprint and minimal accuracy loss.
|
|
|
|
|
## Model Details |
|
|
|
|
|
### Model Description |
|
|
|
|
|
This model is: |
|
|
- A quantized version of `Qwen/Qwen3-0.6B-Base` using `bitsandbytes` 4-bit NormalFloat (nf4) |
|
|
- Fine-tuned using Low-Rank Adaptation (LoRA) with rank 8 |
|
|
- Adapted to multiple-choice reasoning datasets such as AQuA-RAT and TheoremQA
|
|
- Fully compatible with Hugging Face Transformers |
|
|
|
|
|
- **Developed by:** Ahmed Abdelmalek (EPFL CS-552 Project) |
|
|
- **Model type:** Causal Language Model |
|
|
- **Language(s):** English |
|
|
- **License:** Apache 2.0 |
|
|
- **Fine-tuned from model:** `Qwen/Qwen3-0.6B-Base` |
|
|
|
|
|
### Model Sources |
|
|
|
|
|
- [Base model repository](https://huggingface.co/Qwen/Qwen3-0.6B-Base)
|
|
|
|
|
## Uses |
|
|
|
|
|
### Direct Use |
|
|
|
|
|
You can use this model directly for MCQA-style question answering via text generation; see the example under "How to Get Started with the Model" below.
|
|
|
|
|
### Out-of-Scope Use |
|
|
|
|
|
- Not intended for open-ended generation or safety-critical applications |
|
|
- Not intended for real-time or commercial deployment without evaluation |
|
|
|
|
|
## Bias, Risks, and Limitations |
|
|
|
|
|
- Inherits biases from its base model and from the MCQA reasoning datasets used for fine-tuning
|
|
- May fail on adversarial or out-of-distribution logic tasks |
|
|
|
|
|
### Recommendations |
|
|
|
|
|
Evaluate the model against your specific reasoning task before production use. |
|
|
|
|
|
## How to Get Started with the Model |
|
|
|
|
|
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "your-username/MNLP_M2_quantized_model"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Multi-line MCQA prompt (triple-quoted so the newlines are part of the string)
prompt = """Question: What is 3 + 5?
Options:
A) 6
B) 8
C) 9
D) 10
Answer:"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
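Continuing from the snippet above, a hypothetical post-processing step to extract the predicted choice letter (the regex and variable names are illustrative, not part of the released code):

```python
import re

# Decode only the newly generated tokens, then take the first standalone A–D
completion = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
match = re.search(r"\b([ABCD])\b", completion)
predicted = match.group(1) if match else None
print(predicted)
```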
|
|
|
|
|
## Training Details |
|
|
|
|
|
### Training Data |
|
|
|
|
|
- Processed versions of AQuA-RAT, TheoremQA, and custom MCQA datasets |
|
|
- Unified into a single format with rationale-enhanced prompts |
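The exact unified template is not reproduced in this card; an illustrative rationale-enhanced MCQA prompt might look like the following:

```
Question: A train travels 60 km in 1.5 hours. What is its average speed?
Options:
A) 30 km/h
B) 40 km/h
C) 45 km/h
D) 60 km/h
Rationale: speed = distance / time = 60 km / 1.5 h = 40 km/h
Answer: B
```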
|
|
|
|
|
### Training Procedure |
|
|
|
|
|
- **Precision:** fp16 |
|
|
- **Quantization:** 4-bit NormalFloat (nf4) with double quantization and float16 compute dtype (see the configuration sketch after this list)
|
|
- **Adapter Type:** LoRA (r=8, α=16, dropout=0.05) |
|
|
- **Base model:** frozen; only the LoRA adapter weights are updated
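The quantization and adapter settings above correspond to a setup along the following lines. This is a minimal sketch, assuming `bitsandbytes` and `peft` are installed; the LoRA target modules are not specified in this card, so the ones below are illustrative defaults:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NormalFloat with double quantization and float16 compute dtype
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-0.6B-Base",
    quantization_config=bnb_config,
    device_map="auto",
)
base_model = prepare_model_for_kbit_training(base_model)

# LoRA adapters: r=8, alpha=16, dropout=0.05; base weights stay frozen
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumption: not stated in the card
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
```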
|
|
|
|
|
#### Training Hyperparameters |
|
|
|
|
|
- **Epochs:** 3 |
|
|
- **Batch size:** 4 |
|
|
- **Gradient accumulation steps:** 2
|
|
- **Optimizer:** paged_adamw_8bit |
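The hyperparameters above map onto `transformers.TrainingArguments` roughly as follows; this is a sketch only, since the exact training script is not published:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen3-qlora-mcqa",   # illustrative output path
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,   # effective batch size of 8
    optim="paged_adamw_8bit",
    fp16=True,
)
```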
|
|
|
|
|
## Evaluation |
|
|
|
|
|
### Testing Data |
|
|
|
|
|
A held-out validation set of 1,000 samples from the unified dataset.
|
|
|
|
|
### Metrics |
|
|
|
|
|
- Accuracy and F1 (to be reported in the evaluation phase); a computation sketch follows
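Once predictions are collected (e.g., with the letter-extraction step shown earlier), accuracy and macro F1 could be computed along these lines; the labels below are placeholders, and scikit-learn is assumed:

```python
from sklearn.metrics import accuracy_score, f1_score

gold = ["B", "A", "D", "C"]  # placeholder gold answers
pred = ["B", "A", "C", "C"]  # placeholder model predictions

print("Accuracy:", accuracy_score(gold, pred))            # 0.75
print("Macro F1:", f1_score(gold, pred, average="macro"))
```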
|
|
|
|
|
## Environmental Impact |
|
|
|
|
|
- **Hardware:** Google Colab Pro (NVIDIA A100 GPU)
|
|
- **Hours used:** ~6–7
|
|
- **Carbon emitted:** no figure reported; it can be estimated with the [MLCO2 Impact calculator](https://mlco2.github.io/impact#compute)
|
|
|
|
|
## Technical Specifications |
|
|
|
|
|
### Architecture |
|
|
|
|
|
- Qwen3-0.6B base |
|
|
- 28-layer transformer with rotary positional embeddings (RoPE) and 16 attention heads
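A quick way to check these numbers against the published base-model config:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Qwen/Qwen3-0.6B-Base")
print(config.num_hidden_layers)    # expected: 28
print(config.num_attention_heads)  # expected: 16
```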
|
|
|
|
|
### Compute Infrastructure |
|
|
|
|
|
- **Hardware:** Google Colab A100 GPU (high-RAM runtime)
|
|
- **Software:** Python 3.10, PyTorch 2.2.2, Transformers 4.51.3 |
|
|
|
|
|
## Contact |
|
|
|
|
|
- **Author:** Ahmed Abdelmalek |
|
|
- **Email:** ahmed.abdelmalek@epfl.ch |