---
library_name: transformers
tags: []
---

# MNLP M3 DPO Model — Qwen3-0.6B-Base Fine-Tuned with Direct Preference Optimization

This repository contains a Direct Preference Optimization (DPO) model built on top of the base model [`Qwen/Qwen3-0.6B-Base`](https://huggingface.co/Qwen/Qwen3-0.6B-Base), as part of the MNLP M3 project. The model is fine-tuned using a high-quality preference dataset to better align responses with human preferences.
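Under the hood, DPO optimizes a logistic loss over the log-probability margins that the policy and a frozen reference model assign to a preferred (chosen) versus a dispreferred (rejected) response. A minimal sketch of that per-example loss is below; the function name and the `beta` value are illustrative, not taken from this repository's training configuration:

```python
import math


def dpo_loss(policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-example DPO loss: -log sigmoid(beta * (margin_chosen - margin_rejected)).

    Each *_logp argument is the summed log-probability of a full response
    under the policy or the frozen reference model.
    """
    # Implicit reward margins: how much the policy has moved away from
    # the reference on each response.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp

    # The loss shrinks as the policy favors the chosen response more
    # strongly (relative to the reference) than the rejected one.
    logits = beta * (chosen_margin - rejected_margin)
    return -math.log(1.0 / (1.0 + math.exp(-logits)))
```

When the policy matches the reference exactly, both margins are zero and the loss is `-log(0.5)`; any shift toward the chosen response lowers it, which is the gradient signal DPO training follows.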