How to use dball/zephyr-7b-sft-qlora with PEFT:
from peft import PeftModel
from transformers import AutoModelForCausalLM
base_model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
model = PeftModel.from_pretrained(base_model, "dball/zephyr-7b-sft-qlora")

This model is a fine-tuned version of mistralai/Mistral-7B-v0.1 on the HuggingFaceH4/ultrachat_200k dataset. It is the first step (Step 1, SFT; see below) of building Zephyr, i.e. the stage before DPO. On the evaluation set it reaches a validation loss of 0.9523 after one epoch (see the training results table below).
Training used QLoRA SFT, launched via:
# Step 1 - SFT
ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/multi_gpu.yaml --num_processes=1 scripts/run_sft.py recipes/zephyr-7b-beta/sft/config_qlora.yaml --load_in_4bit=true
See https://github.com/huggingface/alignment-handbook/blob/main/recipes/zephyr-7b-beta/README.md for the full recipe.
chat_template: "{% for message in messages %}\n{% if message['role'] == 'user' %}\n{{ '<|user|>\n' + message['content'] + eos_token }}\n{% elif message['role'] == 'system' %}\n{{ '<|system|>\n' + message['content'] + eos_token }}\n{% elif message['role'] == 'assistant' %}\n{{ '<|assistant|>\n' + message['content'] + eos_token }}\n{% endif %}\n{% if loop.last and add_generation_prompt %}\n{{ '<|assistant|>' }}\n{% endif %}\n{% endfor %}"
dataset_mixer:
  HuggingFaceH4/ultrachat_200k: 1.0
dataset_splits:
- train_sft
- test_sft
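
To make the prompt format defined by chat_template concrete, here is a minimal plain-Python sketch of the string it should produce. It is not the library's rendering code. It assumes the template is rendered with Jinja block trimming (as transformers does) and that eos_token is "</s>", the Mistral tokenizer default. The function name format_zephyr_prompt is hypothetical and used only for illustration.

```python
# Illustrative re-implementation of the chat_template above in plain Python.
# Assumption: eos_token is "</s>" (the Mistral tokenizer default).
EOS_TOKEN = "</s>"
KNOWN_ROLES = ("user", "system", "assistant")


def format_zephyr_prompt(messages, add_generation_prompt=False, eos_token=EOS_TOKEN):
    """Render a list of {"role", "content"} dicts into a single prompt string."""
    parts = []
    for message in messages:
        role = message["role"]
        # The template only emits text for these three roles; others are skipped.
        if role in KNOWN_ROLES:
            parts.append(f"<|{role}|>\n{message['content']}{eos_token}\n")
    # The template appends the assistant header after the last message only,
    # so an empty conversation produces an empty prompt.
    if messages and add_generation_prompt:
        parts.append("<|assistant|>\n")
    return "".join(parts)


if __name__ == "__main__":
    chat = [
        {"role": "system", "content": "Be brief."},
        {"role": "user", "content": "Hi"},
    ]
    print(format_zephyr_prompt(chat, add_generation_prompt=True))
```

In practice the tokenizer's apply_chat_template method should be preferred, since it renders the stored template directly; the sketch is only meant to show the expected layout of role tags and end-of-sequence tokens.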
The following hyperparameters were used during training:
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 0.913 | 1.0 | 17428 | 0.9523 |
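
For intuition, the validation loss can be converted to a perplexity. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated explicitly in this card.

```python
import math

# Final validation loss from the table above.
# Assumption: mean per-token cross-entropy, measured in nats.
validation_loss = 0.9523

# Perplexity is the exponential of the mean cross-entropy.
perplexity = math.exp(validation_loss)
print(f"validation perplexity ~ {perplexity:.2f}")
```

Under that assumption the perplexity comes out at roughly 2.6, i.e. on average the model is about as uncertain as a uniform choice among two to three tokens.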
Base model
mistralai/Mistral-7B-v0.1