Training in progress, step 100

Files changed (3)

README.md CHANGED

@@ -28,7 +28,7 @@ print(output["generated_text"])
 ## Training procedure
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/maymn0535-none/huggingface/runs/rnwpzpnd)
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/maymn0535-none/huggingface/runs/ic3pxgaa)
 This model was trained with DPO, a method introduced in [Direct Preference Optimization: Your Language Model is Secretly a Reward Model](https://huggingface.co/papers/2305.18290).

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6578cecfe50e7d3a20d040c9b1cdda32c58f9cfcb54fdc5334ee95fe69c2eddf
+oid sha256:96db5154f98f00e2835e99ed5a8fbdf293d8f63243a6f707c81db39c6c06f0a2
 size 204500912

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:fa5979d784b3be5f03398730b0db9a0aaad24ae1fdea10accf8ecc4f7c831b44
+oid sha256:d0e9844d94b2ddeacda52d988f72cd6b4206ea325ad209511ab311437d2b42ef
 size 6289