Training in progress, step 200

Files changed (3) hide show

README.md CHANGED Viewed

@@ -28,7 +28,7 @@ print(output["generated_text"])
 ## Training procedure
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/maymn0535-none/huggingface/runs/2xqx8xq3)
 This model was trained with DPO, a method introduced in [Direct Preference Optimization: Your Language Model is Secretly a Reward Model](https://huggingface.co/papers/2305.18290).

 ## Training procedure
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/maymn0535-none/huggingface/runs/rnwpzpnd)
 This model was trained with DPO, a method introduced in [Direct Preference Optimization: Your Language Model is Secretly a Reward Model](https://huggingface.co/papers/2305.18290).

adapter_model.safetensors CHANGED Viewed

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:268aa3d2814a792a1ce12fc0ee5a43e0bc3f4dfbe66bca24ad57492c892f8b91
 size 204500912

 version https://git-lfs.github.com/spec/v1
+oid sha256:065322e97e075055ae2c6bcbf10fdfffbac7dd29ef45906fca7a9bacc7abec43
 size 204500912

training_args.bin CHANGED Viewed

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:07d1084fbcea73eed4529408d2dd186b09d81c71318b95b1f0d3c71ddb884015
 size 6289

 version https://git-lfs.github.com/spec/v1
+oid sha256:fa5979d784b3be5f03398730b0db9a0aaad24ae1fdea10accf8ecc4f7c831b44
 size 6289