license: mit
datasets:
  - trl-lib/ultrafeedback_binarized
base_model:
  - alignment-handbook/zephyr-7b-sft-full

DPO model for the Mistral-Base setting (alignment-handbook/zephyr-7b-sft-full), fine-tuned on trl-lib/ultrafeedback_binarized with the noisy preference pairs excluded.
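
This card does not say how the noisy pairs were identified, so the following is only a minimal sketch of the exclusion step itself. It assumes a hypothetical set of flagged indices (`noisy_indices`) and uses toy records whose field names are illustrative, not the dataset's actual schema.

```python
# Toy preference pairs; field names are illustrative, not the dataset's schema.
pairs = [
    {"prompt": "2+2?", "chosen": "4", "rejected": "5"},
    {"prompt": "Capital of France?", "chosen": "Lyon", "rejected": "Paris"},  # mislabeled pair
    {"prompt": "Color of the sky?", "chosen": "Blue", "rejected": "Green"},
]

# Hypothetical: positions flagged as noisy by some external procedure.
noisy_indices = {1}


def exclude_noisy(pairs, noisy_indices):
    """Return the pairs whose position is not in the flagged set."""
    return [p for i, p in enumerate(pairs) if i not in noisy_indices]


clean_pairs = exclude_noisy(pairs, noisy_indices)
print(len(clean_pairs))
```

The remaining pairs would then be the training set passed to a DPO trainer in place of the full dataset.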