---
library_name: peft
license: apache-2.0
base_model: unsloth/SmolLM2-360M-Instruct
tags:
- unsloth
- trl
- sft
- generated_from_trainer
model-index:
- name: SmolLM2-360M-Instruct-TaiwanChat
  results: []
---

[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/pesi/SmolLM2-360M-Instruct-TaiwanChat_CLOUD/runs/9fnxruem)

# SmolLM2-360M-Instruct-TaiwanChat

This model is a fine-tuned version of [unsloth/SmolLM2-360M-Instruct](https://huggingface.co/unsloth/SmolLM2-360M-Instruct) on the TaiwanChat dataset, using Unsloth’s 4-bit quantization and LoRA adapters for efficient instruction following in Traditional Chinese.

## Installation

```bash
pip install -r requirements.txt
```

## Requirements

* **Python**: 3.8 or higher
* **CUDA**: 11.0 or higher (for GPU support)
* All other dependencies and exact versions are listed in [requirements.txt](requirements.txt).

## Model description

* **Base**: SmolLM2-360M-Instruct (360M parameters)
* **Quantization**: 4-bit weight quantization (activations in full precision)
* **Adapters**: LoRA with rank `r=16`, alpha `α=16`, dropout `0.0`, applied to the projection layers (`q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`); see the configuration sketch below
* **Dataset**: TaiwanChat (`yentinglin/TaiwanChat`) — 600k filtered examples, max length 512, streamed and deduplicated, then split 90% train / 10% validation
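
For illustration, the adapter settings above expressed as a plain `peft` configuration (the actual run used Unsloth’s `get_peft_model` wrapper; see the training procedure):

```python
from peft import LoraConfig

# Adapter hyperparameters as listed in this card
lora_config = LoraConfig(
    r=16,              # LoRA rank
    lora_alpha=16,     # scaling factor α
    lora_dropout=0.0,  # no adapter dropout
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    bias="none",
    task_type="CAUSAL_LM",
)
```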

## Intended uses & limitations

**Intended uses:**

* Conversational AI and chatbots handling Traditional Chinese queries (e.g., weather, FAQs).
* Instruction following in a dialogue format.

**Limitations:**

* The model's limited capacity (360M parameters) may cause occasional hallucinations or vague answers.
* Performance was measured on a 10% hold-out split; distribution shift in real-world data may reduce quality.
* Quantization and adapter-based tuning trade some accuracy for efficiency.

## Training procedure

1. **Data preparation**

   * Streamed 600k examples from the Hugging Face dataset, filtered to `max_len=512`, cleaned assistant markers via regex, then shuffled and split with `Dataset.train_test_split(test_size=0.1)`; a sketch follows this step
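
   A minimal sketch of this pipeline, assuming the dataset exposes a `messages` column and that length filtering is done on token counts (the exact cleanup regex is not documented here):

   ```python
   import re
   from datasets import Dataset, load_dataset
   from transformers import AutoTokenizer

   tokenizer = AutoTokenizer.from_pretrained("unsloth/SmolLM2-360M-Instruct")
   stream = load_dataset("yentinglin/TaiwanChat", split="train", streaming=True)

   records, seen = [], set()
   for example in stream:
       text = tokenizer.apply_chat_template(example["messages"], tokenize=False)
       text = re.sub(r"<\|assistant\|>\s*", "", text)  # assistant-marker cleanup (regex assumed)
       if text in seen or len(tokenizer(text).input_ids) > 512:
           continue  # deduplicate and enforce max_len=512
       seen.add(text)
       records.append({"text": text})
       if len(records) >= 600_000:
           break

   dataset = Dataset.from_list(records).shuffle(seed=3407)
   splits = dataset.train_test_split(test_size=0.1)  # 90% train / 10% validation
   ```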

2. **Model & training setup**

   * Loaded the base model with `FastLanguageModel.from_pretrained(..., load_in_4bit=True, full_finetuning=False)`
   * Applied LoRA adapters via `FastLanguageModel.get_peft_model(...)`
   * Used a `LoggingSFTTrainer` subclass to catch empty-label and NaN-loss cases during evaluation; a sketch follows this step
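
   A sketch of this setup. The `LoggingSFTTrainer` body below is a hypothetical reconstruction; only its purpose (flagging empty labels and NaN losses) is documented in this card:

   ```python
   import torch
   from trl import SFTTrainer
   from unsloth import FastLanguageModel

   # 4-bit base model plus LoRA adapters, as described above
   model, tokenizer = FastLanguageModel.from_pretrained(
       model_name="unsloth/SmolLM2-360M-Instruct",
       max_seq_length=512,
       load_in_4bit=True,
       full_finetuning=False,
   )
   model = FastLanguageModel.get_peft_model(
       model,
       r=16,
       lora_alpha=16,
       lora_dropout=0.0,
       target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                       "gate_proj", "up_proj", "down_proj"],
   )

   class LoggingSFTTrainer(SFTTrainer):
       """Hypothetical reconstruction: warn on degenerate batches."""
       def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
           labels = inputs.get("labels")
           if labels is not None and (labels != -100).sum() == 0:
               print("warning: batch contains no supervised tokens")
           result = super().compute_loss(model, inputs, return_outputs=return_outputs, **kwargs)
           loss = result[0] if return_outputs else result
           if torch.isnan(loss).any():
               print("warning: NaN loss encountered")
           return result
   ```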

3. **Hyperparameters** (a configuration sketch follows the table)

   | Parameter | Value |
   | -------------------------------- | -----------------: |
   | `num_train_epochs` | 3 |
   | `per_device_train_batch_size` | 40 |
   | `gradient_accumulation_steps` | 1 |
   | `per_device_eval_batch_size` | 1 |
   | `learning_rate` | 2e-4 |
   | `weight_decay` | 0.01 |
   | `warmup_steps` | 500 |
   | `max_seq_length` | 512 |
   | `evaluation_strategy` | steps (every 100) |
   | `eval_steps` | 100 |
   | `save_strategy` | steps (every 1000) |
   | `logging_steps` | 50 |
   | `optimizer` | adamw_8bit |
   | `gradient_checkpointing` | false |
   | `seed` | 3407 |
   | `EarlyStoppingCallback patience` | 4 evals |
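
   The table expressed as a `trl` `SFTConfig` (argument names follow current transformers/trl conventions and are assumptions, not the original script):

   ```python
   from trl import SFTConfig

   args = SFTConfig(
       output_dir="outputs",  # output path assumed
       num_train_epochs=3,
       per_device_train_batch_size=40,
       gradient_accumulation_steps=1,
       per_device_eval_batch_size=1,
       learning_rate=2e-4,
       weight_decay=0.01,
       warmup_steps=500,
       max_seq_length=512,
       eval_strategy="steps",
       eval_steps=100,
       save_strategy="steps",
       save_steps=1000,
       logging_steps=50,
       optim="adamw_8bit",
       gradient_checkpointing=False,
       seed=3407,
   )
   # Early stopping: EarlyStoppingCallback(early_stopping_patience=4) passed to
   # the trainer's callbacks (requires load_best_model_at_end=True).
   ```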

4. **Training & push**

   * Ran `trainer.train()`, merged the LoRA weights, then pushed the merged 16-bit model to `Luigi/SmolLM2-360M-Instruct-TaiwanChat` on Hugging Face via `model.push_to_hub_merged()`; a sketch follows this step
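
   A sketch of the final push, using Unsloth’s merged-upload helper (repo id from this card; token handling omitted):

   ```python
   trainer.train()
   model.push_to_hub_merged(
       "Luigi/SmolLM2-360M-Instruct-TaiwanChat",
       tokenizer,
       save_method="merged_16bit",  # fuse LoRA adapters into fp16 base weights
   )
   ```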

## Example inference

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the merged model (LoRA weights are already fused, so no PEFT wrapper is needed)
tokenizer = AutoTokenizer.from_pretrained("Luigi/SmolLM2-360M-Instruct-TaiwanChat")
model = AutoModelForCausalLM.from_pretrained(
    "Luigi/SmolLM2-360M-Instruct-TaiwanChat",
    torch_dtype=torch.float16,
).eval().to("cuda")

# Query: "What is the weather like in Taipei today?"
test_prompt = "請問台北今天的天氣如何?"
inputs = tokenizer(test_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.8,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
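
Since this is an instruction-tuned chat model, wrapping the query in the tokenizer’s chat template typically yields better-formatted answers; a small variant of the call above:

```python
# Build the prompt with the model's chat template before generating
messages = [{"role": "user", "content": "請問台北今天的天氣如何?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=100, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```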

## Framework versions

```text
bitsandbytes==0.45.5
datasets==3.2.0
hatchet==1.4.0
importlib_metadata==8.6.1
lit==18.1.8
matplotlib
numpy
packaging
pandas
psutil==6.1.1
pybind11==2.13.6
pytest==8.1.1
redis==6.0.0
scipy
setuptools==70.3.0
Sphinx
sphinx_gallery
sphinx_rtd_theme
tabulate==0.9.0
torch==2.7.0
transformers==4.47.1
trl==0.15.2
unsloth==2025.4.1
unsloth_zoo==2025.4.2
cut_cross_entropy
wandb
wheel==0.45.1
```