language:
- en
base_model: google/gemma-2-2b-it
---

## Overview
Gemma2-9B-IT-Simpo-Infinity-Preference is based on [gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it) and fine-tuned on Infinity-Preference with [SimPO](https://github.com/princeton-nlp/SimPO). It achieves a 73.4% length-controlled (LC) win rate on AlpacaEval 2.0 and a 58.1% win rate on Arena-Hard against GPT-4.

## Training hyperparameters

```yaml
beta: 10
gamma_beta_ratio: 1
learning_rate: 8.0e-7
log_level: info
logging_steps: 5
max_length: 2048
max_prompt_length: 1800
num_train_epochs: 1
batch_size: 128
```
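For readers unfamiliar with SimPO, here is a minimal sketch (not code from this repository) of the objective that `beta` and `gamma_beta_ratio` parameterize, assuming the length-normalized log-probabilities of the chosen and rejected responses have already been computed:

```python
import math

def simpo_loss(avg_logp_chosen, avg_logp_rejected,
               beta=10.0, gamma_beta_ratio=1.0):
    """SimPO loss for a single preference pair (illustrative sketch).

    avg_logp_*: length-normalized log-probability of the chosen /
    rejected response under the policy model (SimPO uses no
    reference model).
    """
    gamma = gamma_beta_ratio * beta  # target reward margin
    margin = beta * (avg_logp_chosen - avg_logp_rejected) - gamma
    # -log(sigmoid(margin)), computed stably as softplus(-margin)
    return math.log1p(math.exp(-margin)) if margin > -30 else -margin
```

With the values above (`beta=10`, `gamma_beta_ratio=1`), a pair where the chosen response is preferred by exactly the target margin yields a loss of ln 2.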

## How to Use

Gemma2-9B-IT-Simpo-Infinity-Preference adopts the same chat template as [gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it):

```text
<bos><start_of_turn>user
How are you?<end_of_turn>
<start_of_turn>model
Hi!<end_of_turn>
```
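Purely as an illustration of the format, the template can be mimicked in plain Python (in practice, always rely on `tokenizer.apply_chat_template`, which is the authoritative implementation; `format_gemma_chat` is a hypothetical helper introduced here):

```python
def format_gemma_chat(messages, add_generation_prompt=True):
    """Render a message list in the Gemma-2 turn format shown above.

    Illustrative sketch only. The "assistant" role is mapped to
    Gemma's "model" turn name.
    """
    text = "<bos>"
    for message in messages:
        role = "model" if message["role"] == "assistant" else message["role"]
        text += f"<start_of_turn>{role}\n{message['content']}<end_of_turn>\n"
    if add_generation_prompt:
        # Open a model turn so generation continues as the assistant
        text += "<start_of_turn>model\n"
    return text
```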

To use this model and template in conversation scenarios, you can refer to the following code:

```python
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    LogitsProcessorList,
    MinLengthLogitsProcessor,
    TemperatureLogitsWarper,
)

device = "cuda"  # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained(
    "BAAI/Gemma2-9B-IT-Simpo-Infinity-Preference",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("BAAI/Gemma2-9B-IT-Simpo-Infinity-Preference")

prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "user", "content": prompt}
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

logits_processor = LogitsProcessorList(
    [
        MinLengthLogitsProcessor(1, eos_token_id=tokenizer.eos_token_id),
        TemperatureLogitsWarper(0.8),
    ]
)

generated_ids = model.generate(
    model_inputs.input_ids,
    logits_processor=logits_processor,
    do_sample=True,  # sampling so the temperature warper takes effect
    max_new_tokens=512,
)

# Strip the prompt tokens, keeping only the newly generated ones
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```

## Disclaimer

The resources, including code, data, and model weights, associated with this project are restricted for academic research purposes only and cannot be used for commercial purposes. The content produced by any version of Infinity-Preference is influenced by uncontrollable variables such as randomness, and therefore, the accuracy of the output cannot be guaranteed by this project. This project does not accept any legal liability for the content of the model output, nor does it assume responsibility for any losses incurred due to the use of associated resources and output results.