Commit d511e3a by shirwu · verified · 1 Parent(s): 5390860

Model save

Files changed (1)
  1. README.md +3 -5
README.md CHANGED
@@ -1,7 +1,6 @@
 ---
 base_model: meta-llama/Llama-3.1-8B-Instruct
-datasets: snap-stanford/preference_iterative_hard-answer_generator-iter0
-library_name: peft
+library_name: transformers
 model_name: output
 tags:
 - generated_from_trainer
@@ -12,7 +11,7 @@ licence: license
 
 # Model Card for output
 
-This model is a fine-tuned version of [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) on the [snap-stanford/preference_iterative_hard-answer_generator-iter0](https://huggingface.co/datasets/snap-stanford/preference_iterative_hard-answer_generator-iter0) dataset.
+This model is a fine-tuned version of [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct).
 It has been trained using [TRL](https://github.com/huggingface/trl).
 
 ## Quick start
@@ -28,14 +27,13 @@ print(output["generated_text"])
 
 ## Training procedure
 
-
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/dsp-team/optimas/runs/y3oacxbo)
 
 
 This model was trained with Reward.
 
 ### Framework versions
 
-- PEFT 0.14.0
 - TRL: 0.14.0
 - Transformers: 4.48.2
 - Pytorch: 2.5.1