ModelAdaptationBook / README.md
bahree's picture
Model card
077e790 verified
|
Raw
History Blame Contribute Delete
1.97 kB
metadata
license: apache-2.0
base_model: Qwen/Qwen3-4B-Instruct-2507
library_name: peft
tags:
  - lora
  - sft
  - dpo
  - knowledge-distillation
  - fine-tuning
  - it-support

Model Adaptation Book — companion models

Trained artifacts for the book LLM Customization and Fine-Tuning: Adaptation, Distillation, and Alignment (Manning). Code: https://github.com/bahree/ModelAdaptationBook

All are adaptations of Qwen/Qwen3-4B-Instruct-2507 on a real IT-support dataset: Stack Exchange IT Q&A (Super User, Ask Ubuntu, Server Fault; CC-BY-SA-4.0) plus a small Databricks Dolly slice (CC-BY-SA-3.0) for general-capability retention. Each chapter's artifact is a subfolder, so you can follow along on any machine (including Apple Silicon) by pulling a trained model and running inference/eval, without training it yourself.

Subfolder Chapter What Base
ch5-lora 5 LoRA adapter Qwen3-4B-Instruct-2507
ch6-sft 6 full SFT model (standalone) (full fine-tune)
ch7-distilled 7 distilled student (LoRA) Qwen3-4B-Instruct-2507
ch8-dpo 8 full DPO model (standalone) (full fine-tune)
ch8-dpo-lora 8 LoRA-DPO adapter (single-card path) ch6-sft

Load a full model:

from transformers import AutoModelForCausalLM
m = AutoModelForCausalLM.from_pretrained("bahree/ModelAdaptationBook", subfolder="ch6-sft")

Load an adapter (on its base):

from transformers import AutoModelForCausalLM
from peft import PeftModel
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-4B-Instruct-2507")
m = PeftModel.from_pretrained(base, "bahree/ModelAdaptationBook", subfolder="ch5-lora")

Training these needs a CUDA 24 GB+ GPU (and the Ch8 full DPO uses multiple GPUs; the ch8-dpo-lora adapter is the single-card alternative). Inference and evaluation fit a single smaller GPU or Apple Silicon (MPS). See the book repo for exact commands, datasets, and full attribution.