tsor13 committed on
Commit 58f15ad · verified · 1 Parent(s): babd107

Initial upload of fine‑tuned Gemma + custom tokenizer

Files changed (1)
  1. README.md +57 -24
README.md CHANGED
@@ -71,16 +71,16 @@ with torch.no_grad():
  Output:
  ```
  Top 10 probabilities for first output token:
- 1. 'Tokyo' -> 0.9792
- 2. 'Tok' -> 0.0059
- 3. '東京' -> 0.0028
+ 1. 'Tokyo' -> 0.9764
+ 2. 'Tok' -> 0.0070
+ 3. '東京' -> 0.0026
  4. 'Ky' -> 0.0019
- 5. 'T' -> 0.0011
- 6. ' Tokyo' -> 0.0011
- 7. 'Osaka' -> 0.0009
- 8. 'To' -> 0.0008
- 9. 'Toy' -> 0.0006
- 10. 'Nag' -> 0.0005
+ 5. 'T' -> 0.0014
+ 6. ' Tokyo' -> 0.0014
+ 7. 'To' -> 0.0011
+ 8. 'Osaka' -> 0.0009
+ 9. 'Toy' -> 0.0007
+ 10. 'tok' -> 0.0005
  ```

  Great! Almost all of the probability mass is on the correct answer, Tokyo.

@@ -110,10 +110,10 @@ for i in range(n_gens):

  Outputs:
  ```
- 7 Wonders: Duel
- Dominion
- Terraforming Mars
- Agricola: All Creatures Big and Small
+ Catan: Rivals for Catan
+ Gloomhaven
+ Great Western Trail
+ Azul
  ```
  Not too bad!

@@ -135,12 +135,48 @@ for i in range(n_gens):
  ```
  Output:
  ```
- Navy Blue
- White
- light green
- yellow
+ Deep Sea Blue
+ Gray#222222
+ Gold, Red, Black
+ I can’t believe we’re already talking about color theory. How is this possible? Can time go any faster? Also how does your body
  ```

+ By default, the model is trained only to 1) emulate outputs if examples are provided, or 2) generate data based on the description. Because of this, the model always expects EITHER a description OR examples. If you want it to act slightly more like an instruction-following chat model, you can add a description such as the following:
+
+ ```
+ messages = [
+     {"role": "description", "content": "You are a helpful assistant who outputs the requested content."},
+     {"role": "input", "content": "A poem about a shark"},
+ ]
+ ```
+ To generate:
+ ```
+ formatted_prompt = tokenizer.messages_to_text(messages, start_generation=True)
+ n_gens = 4
+ inputs = tokenizer([formatted_prompt] * n_gens, return_tensors="pt").to(model.device)
+
+ outputs = model.generate(**inputs, max_new_tokens=40, stop_strings=["<end_of_turn>"], tokenizer=tokenizer)
+ for i in range(n_gens):
+     print(f"Generation {i}:")
+     print(tokenizer.decode(outputs[i][inputs["input_ids"][i].shape[0]:], skip_special_tokens=True))
+ ```
+
+ Some example generations:
+ ```
+ Generation 0:
+ A deep-sea creature, silent and fierce, Shivers through water, its body sleek. Its jaws, a vice, its eyes cold steel, The shark moves with grace, never to feel.
+ of power and danger,
+ Generation 1:
+ The great white shark lurks in the deep, with teeth so sharp, it could cut a whale in half. Its dorsal fin slices through the water, like a knife through butter, and its tail
+ Generation 2:
+ The shark swam in the sea, With a toothy grin, as if it could be glee. It was the top of the food chain, The apex of the sea's terrain. With sleek
+ Generation 3:
+ I am a gentle, tranquil wave, gliding smoothly across the ocean's expanse. Yet deep within me lies a secret, a hidden power, a creature of the sea, fierce and agile. It
+ ```
+
  Finally, let's look at a synthetic data generation task. For example, maybe we want to generate situations to do social reasoning over, along with whether or not they are awkward. When there are multiple variables to condition on or generate, the model uses JSON format.

  Input:

@@ -172,13 +208,10 @@ for i in range(n_gens):
  ```
  Output:
  ```
- {"situation": "Noticing that you've been holding the wrong end of your toothbrush.", "is_awkward": true}
-
- {"situation": "Your teacher gives you extra credit for participating in class.", "is_awkward": false}
-
- {"situation": "Your friend doesn't recognize you because of your new look.", "is_awkward": true}
-
- {"situation": "Your friend comes over and asks you to help them move some furniture, but you have other plans.", "is_awkward": true}
+ {"situation": "While walking on the street, someone waves and smiles at you, but you don't know them.", "is_awkward": false}
+ {"situation": "Taking a cab and giving the driver wrong directions.", "is_awkward": true}
+ {"situation": "Being told that an individual you've had a long-term crush on is also crushing on someone else.", "is_awkward": true}
+ {"situation": "Watching a loved one get proposed to.", "is_awkward": false}
  ```

  A few tips and tricks:
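One such trick: because the synthetic-data generations are plain JSON lines, they can be consumed with nothing but Python's standard `json` module. A minimal sketch (the sample lines are copied from the example output above; the filtering logic is an illustration, not part of the model card):

```python
import json

# Raw model generations, one JSON object per line (copied from the example output above).
raw = """\
{"situation": "While walking on the street, someone waves and smiles at you, but you don't know them.", "is_awkward": false}
{"situation": "Taking a cab and giving the driver wrong directions.", "is_awkward": true}
{"situation": "Being told that an individual you've had a long-term crush on is also crushing on someone else.", "is_awkward": true}
{"situation": "Watching a loved one get proposed to.", "is_awkward": false}
"""

records = []
for line in raw.splitlines():
    line = line.strip()
    if not line:
        continue
    try:
        records.append(json.loads(line))  # skip any line that is not valid JSON
    except json.JSONDecodeError:
        continue

# Example downstream use: keep only the situations labeled awkward.
awkward = [r["situation"] for r in records if r["is_awkward"]]
print(f"{len(records)} records, {len(awkward)} awkward")  # → 4 records, 2 awkward
```

Wrapping `json.loads` in a `try`/`except` matters in practice: at higher sampling temperatures the model can occasionally emit a malformed line, and dropping it is usually cheaper than retrying the generation.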