bahree
/

ModelAdaptationBook

knowledge-distillation

Model card Files Files and versions

ModelAdaptationBook / README.md

bahree's picture

Model card

077e790 verified 13 days ago

|

History Blame Contribute Delete

1.97 kB

	---
	license: apache-2.0
	base_model: Qwen/Qwen3-4B-Instruct-2507
	library_name: peft
	tags:
	- lora
	- sft
	- dpo
	- knowledge-distillation
	- fine-tuning
	- it-support
	---

	# Model Adaptation Book — companion models

	Trained artifacts for the book *LLM Customization and Fine-Tuning: Adaptation,
	Distillation, and Alignment* (Manning). Code:
	https://github.com/bahree/ModelAdaptationBook

	All are adaptations of `Qwen/Qwen3-4B-Instruct-2507` on a real IT-support
	dataset: Stack Exchange IT Q&A (Super User, Ask Ubuntu, Server Fault;
	CC-BY-SA-4.0) plus a small Databricks Dolly slice (CC-BY-SA-3.0) for
	general-capability retention. Each chapter's artifact is a subfolder, so you
	can follow along on any machine (including Apple Silicon) by pulling a trained
	model and running inference/eval, without training it yourself.

	\| Subfolder \| Chapter \| What \| Base \|
	\|---\|---\|---\|---\|
	\| `ch5-lora` \| 5 \| LoRA adapter \| Qwen3-4B-Instruct-2507 \|
	\| `ch6-sft` \| 6 \| full SFT model (standalone) \| (full fine-tune) \|
	\| `ch7-distilled` \| 7 \| distilled student (LoRA) \| Qwen3-4B-Instruct-2507 \|
	\| `ch8-dpo` \| 8 \| full DPO model (standalone) \| (full fine-tune) \|
	\| `ch8-dpo-lora` \| 8 \| LoRA-DPO adapter (single-card path) \| `ch6-sft` \|

	Load a full model:

	```python
	from transformers import AutoModelForCausalLM
	m = AutoModelForCausalLM.from_pretrained("bahree/ModelAdaptationBook", subfolder="ch6-sft")
	```

	Load an adapter (on its base):

	```python
	from transformers import AutoModelForCausalLM
	from peft import PeftModel
	base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-4B-Instruct-2507")
	m = PeftModel.from_pretrained(base, "bahree/ModelAdaptationBook", subfolder="ch5-lora")
	```

	Training these needs a CUDA 24 GB+ GPU (and the Ch8 full DPO uses multiple
	GPUs; the `ch8-dpo-lora` adapter is the single-card alternative). **Inference
	and evaluation** fit a single smaller GPU or Apple Silicon (MPS). See the book
	repo for exact commands, datasets, and full attribution.