| license: apache-2.0 | |
| language: | |
| - en | |
| # 4season/model_eval_test | |
| # **Introduction** | |
| This is a test version of an alignment-tuned model. | |
| We use state-of-the-art instruction fine-tuning methods, including direct preference optimization (DPO). | |
| After DPO training, we linearly merged models to boost performance. |
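
The two steps named above can be illustrated with a minimal sketch. This is not the training or merging code used for this model, which is not published here. The function names `dpo_loss` and `linear_merge`, the `beta` value, the merge weights, and the use of plain Python lists as stand-ins for weight tensors are all assumptions made for illustration.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair (hypothetical helper).

    Inputs are summed token log-probabilities of the chosen and rejected
    responses under the policy being trained and a frozen reference model.
    beta=0.1 is an assumed value, not one reported for this model.
    """
    # How much more the policy favors the chosen response over the
    # rejected one, measured relative to the reference model.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    x = beta * margin
    # -log(sigmoid(x)), computed in a numerically stable form.
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))


def linear_merge(state_dicts, weights):
    """Weighted average of same-shaped parameters (hypothetical helper).

    Each state dict maps a parameter name to a flat list of floats,
    a toy stand-in for real weight tensors.
    """
    if not state_dicts or len(state_dicts) != len(weights):
        raise ValueError("need one weight per model")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    names = state_dicts[0].keys()
    if any(sd.keys() != names for sd in state_dicts[1:]):
        raise ValueError("all models must share the same parameter names")
    merged = {}
    for name in names:
        size = len(state_dicts[0][name])
        # Normalize by the weight sum so the result is a true average.
        merged[name] = [
            sum(w * sd[name][i] for sd, w in zip(state_dicts, weights)) / total
            for i in range(size)
        ]
    return merged


# Toy usage with made-up numbers.
model_a = {"layer.weight": [1.0, 2.0]}
model_b = {"layer.weight": [3.0, 4.0]}
merged = linear_merge([model_a, model_b], [0.5, 0.5])
pair_loss = dpo_loss(-10.0, -12.0, -11.0, -11.0)
```

With equal weights the merge reduces to a plain parameter average. In practice the merge weights would be chosen by evaluating candidate merges on held-out benchmarks.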