---
library_name: peft
license: apache-2.0
base_model: unsloth/SmolLM2-360M-Instruct
tags:
- unsloth
- trl
- sft
- generated_from_trainer
model-index:
- name: SmolLM2-360M-Instruct-TaiwanChat
  results: []
---

[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/pesi/SmolLM2-360M-Instruct-TaiwanChat_CLOUD/runs/9fnxruem)

# SmolLM2-360M-Instruct-TaiwanChat

This model is a fine-tuned version of [unsloth/SmolLM2-360M-Instruct](https://huggingface.co/unsloth/SmolLM2-360M-Instruct) on the TaiwanChat dataset, using Unsloth’s 4-bit quantization and LoRA adapters for efficient instruction following in Traditional Chinese.

## Installation

```bash
pip install -r requirements.txt
```

## Requirements

* **Python**: 3.8 or higher
* **CUDA**: 11.0 or higher (for GPU support)
* All other dependencies and exact versions are listed in [requirements.txt](requirements.txt).

## Model description

* **Base**: SmolLM2-360M-Instruct (360M parameters)
* **Quantization**: 4-bit weight quantization (activations in full precision)
* **Adapters**: LoRA with rank `r=16`, alpha `α=16`, dropout `0.0`, applied to the projection layers (`q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`); see the configuration sketch below
* **Dataset**: TaiwanChat (`yentinglin/TaiwanChat`) — 600k filtered examples, max length 512, streamed and deduplicated, then split 90% train / 10% validation
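
For illustration, the adapter settings above expressed as a plain `peft` configuration (the actual run used Unsloth’s `get_peft_model` wrapper; see the training procedure):

```python
from peft import LoraConfig

# Adapter hyperparameters as listed in this card
lora_config = LoraConfig(
    r=16,              # LoRA rank
    lora_alpha=16,     # scaling factor α
    lora_dropout=0.0,  # no adapter dropout
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    bias="none",
    task_type="CAUSAL_LM",
)
```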

## Intended uses & limitations

**Intended uses:**

* Conversational AI and chatbots handling Traditional Chinese queries (e.g., weather, FAQs).
* Instruction following in a dialogue format.

**Limitations:**

* The model's limited capacity (360M parameters) may cause occasional hallucinations or vague answers.
* Performance was measured on a 10% hold-out split; distribution shift in real-world data may reduce quality.
* Quantization and adapter-based tuning trade some accuracy for efficiency.

## Training procedure

1. **Data preparation**

   * Streamed 600k examples from the Hugging Face dataset, filtered to `max_len=512`, cleaned assistant markers via regex, then shuffled and split with `Dataset.train_test_split(test_size=0.1)`; a sketch follows this step
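
   A minimal sketch of this pipeline, assuming the dataset exposes a `messages` column and that length filtering is done on token counts (the exact cleanup regex is not documented here):

   ```python
   import re
   from datasets import Dataset, load_dataset
   from transformers import AutoTokenizer

   tokenizer = AutoTokenizer.from_pretrained("unsloth/SmolLM2-360M-Instruct")
   stream = load_dataset("yentinglin/TaiwanChat", split="train", streaming=True)

   records, seen = [], set()
   for example in stream:
       text = tokenizer.apply_chat_template(example["messages"], tokenize=False)
       text = re.sub(r"<\|assistant\|>\s*", "", text)  # assistant-marker cleanup (regex assumed)
       if text in seen or len(tokenizer(text).input_ids) > 512:
           continue  # deduplicate and enforce max_len=512
       seen.add(text)
       records.append({"text": text})
       if len(records) >= 600_000:
           break

   dataset = Dataset.from_list(records).shuffle(seed=3407)
   splits = dataset.train_test_split(test_size=0.1)  # 90% train / 10% validation
   ```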

2. **Model & training setup**

   * Loaded the base model with `FastLanguageModel.from_pretrained(..., load_in_4bit=True, full_finetuning=False)`
   * Applied LoRA adapters via `FastLanguageModel.get_peft_model(...)`
   * Used a `LoggingSFTTrainer` subclass to catch empty-label and NaN-loss cases during evaluation; a sketch follows this step
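
   A sketch of this setup. The `LoggingSFTTrainer` body below is a hypothetical reconstruction; only its purpose (flagging empty labels and NaN losses) is documented in this card:

   ```python
   import torch
   from trl import SFTTrainer
   from unsloth import FastLanguageModel

   # 4-bit base model plus LoRA adapters, as described above
   model, tokenizer = FastLanguageModel.from_pretrained(
       model_name="unsloth/SmolLM2-360M-Instruct",
       max_seq_length=512,
       load_in_4bit=True,
       full_finetuning=False,
   )
   model = FastLanguageModel.get_peft_model(
       model,
       r=16,
       lora_alpha=16,
       lora_dropout=0.0,
       target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                       "gate_proj", "up_proj", "down_proj"],
   )

   class LoggingSFTTrainer(SFTTrainer):
       """Hypothetical reconstruction: warn on degenerate batches."""
       def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
           labels = inputs.get("labels")
           if labels is not None and (labels != -100).sum() == 0:
               print("warning: batch contains no supervised tokens")
           result = super().compute_loss(model, inputs, return_outputs=return_outputs, **kwargs)
           loss = result[0] if return_outputs else result
           if torch.isnan(loss).any():
               print("warning: NaN loss encountered")
           return result
   ```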

3. **Hyperparameters** (a configuration sketch follows the table)

   | Parameter | Value |
   | -------------------------------- | -----------------: |
   | `num_train_epochs` | 3 |
   | `per_device_train_batch_size` | 40 |
   | `gradient_accumulation_steps` | 1 |
   | `per_device_eval_batch_size` | 1 |
   | `learning_rate` | 2e-4 |
   | `weight_decay` | 0.01 |
   | `warmup_steps` | 500 |
   | `max_seq_length` | 512 |
   | `evaluation_strategy` | steps (every 100) |
   | `eval_steps` | 100 |
   | `save_strategy` | steps (every 1000) |
   | `logging_steps` | 50 |
   | `optimizer` | adamw_8bit |
   | `gradient_checkpointing` | false |
   | `seed` | 3407 |
   | `EarlyStoppingCallback patience` | 4 evals |
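
   The table expressed as a `trl` `SFTConfig` (argument names follow current transformers/trl conventions and are assumptions, not the original script):

   ```python
   from trl import SFTConfig

   args = SFTConfig(
       output_dir="outputs",  # output path assumed
       num_train_epochs=3,
       per_device_train_batch_size=40,
       gradient_accumulation_steps=1,
       per_device_eval_batch_size=1,
       learning_rate=2e-4,
       weight_decay=0.01,
       warmup_steps=500,
       max_seq_length=512,
       eval_strategy="steps",
       eval_steps=100,
       save_strategy="steps",
       save_steps=1000,
       logging_steps=50,
       optim="adamw_8bit",
       gradient_checkpointing=False,
       seed=3407,
   )
   # Early stopping: EarlyStoppingCallback(early_stopping_patience=4) passed to
   # the trainer's callbacks (requires load_best_model_at_end=True).
   ```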

4. **Training & push**

   * Ran `trainer.train()`, merged the LoRA weights, then pushed the merged 16-bit model to `Luigi/SmolLM2-360M-Instruct-TaiwanChat` on Hugging Face via `model.push_to_hub_merged()`; a sketch follows this step
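
   A sketch of the final push, using Unsloth’s merged-upload helper (repo id from this card; token handling omitted):

   ```python
   trainer.train()
   model.push_to_hub_merged(
       "Luigi/SmolLM2-360M-Instruct-TaiwanChat",
       tokenizer,
       save_method="merged_16bit",  # fuse LoRA adapters into fp16 base weights
   )
   ```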

## Example inference

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the merged model (LoRA weights are already fused, so no PEFT wrapper is needed)
tokenizer = AutoTokenizer.from_pretrained("Luigi/SmolLM2-360M-Instruct-TaiwanChat")
model = AutoModelForCausalLM.from_pretrained(
    "Luigi/SmolLM2-360M-Instruct-TaiwanChat",
    torch_dtype=torch.float16,
).eval().to("cuda")

# Query: "What is the weather like in Taipei today?"
test_prompt = "請問台北今天的天氣如何?"
inputs = tokenizer(test_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.8,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
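
Since this is an instruction-tuned chat model, wrapping the query in the tokenizer’s chat template typically yields better-formatted answers; a small variant of the call above:

```python
# Build the prompt with the model's chat template before generating
messages = [{"role": "user", "content": "請問台北今天的天氣如何?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=100, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```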

## Framework versions

```text
bitsandbytes==0.45.5
datasets==3.2.0
hatchet==1.4.0
importlib_metadata==8.6.1
lit==18.1.8
matplotlib
numpy
packaging
pandas
psutil==6.1.1
pybind11==2.13.6
pytest==8.1.1
redis==6.0.0
scipy
setuptools==70.3.0
Sphinx
sphinx_gallery
sphinx_rtd_theme
tabulate==0.9.0
torch==2.7.0
transformers==4.47.1
trl==0.15.2
unsloth==2025.4.1
unsloth_zoo==2025.4.2
cut_cross_entropy
wandb
wheel==0.45.1
```