dball/zephyr-7b-sft-qlora

alignment-handbook

Generated from Trainer

4-bit precision

Adding Evaluation Results

#2 opened about 2 years ago by leaderboard-pr-bot

Is the drop in many metrics expected? Why do SFT first if it makes the model worse? Why not do DPO directly on the Mistral model?

#1 opened over 2 years ago by