tsor13 committed (verified)
Commit adc27ee · Parent(s): 1e3c7c6

Initial upload of fine‑tuned Gemma + custom tokenizer

Files changed (1): README.md (+80 −2)

README.md CHANGED
@@ -5,6 +5,7 @@ The following is a model trained by [...suspense...] that is meant to:
  - be a really good, approximately Bayesian in-context learner;
  - fit a data-generation process;
  - be calibrated over distributions of possible outputs w.r.t. a population or epistemic uncertainty.

  **Description:** From gemma‑3‑12b‑it; keeps full chat format.
  **Pros:** drop‑in for chat template · works on original logs
@@ -56,7 +57,6 @@ There are three variants of the model for now:
  | **Example w/o inputs** | ```text\nDESCRIPTION\n<start_of_turn>OUTPUT1<end_of_turn>\n<start_of_turn>OUTPUT2<end_of_turn>``` | ```text\n<start_of_turn>description\nDESCRIPTION<end_of_turn>\n<start_of_turn>output\nOUTPUT1<end_of_turn>\n<start_of_turn>output\nOUTPUT2<end_of_turn>``` | ```text\n<start_of_turn>user\nGenerate …\nDescription: DESCRIPTION\n\nGenerate.<end_of_turn>\n<start_of_turn>model\nOUTPUT1<end_of_turn>\n<start_of_turn>user\nGenerate.<end_of_turn>\n<start_of_turn>model\nOUTPUT2<end_of_turn>``` |

-
  This model/repo is a work in progress - expect updates.

  Loading model example:
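The non-chat formats in the variants table can be sketched as plain prompt assembly. The helper names below are mine, not part of the released tokenizer; the token strings come from the table row above:

```python
# Sketch of the non-chat prompt formats from the variants table.
# Helper names are illustrative, not part of the released tokenizer.
def base_prompt(description: str, outputs: list[str]) -> str:
    # Base variant: description, then each output wrapped in turn tokens.
    turns = "\n".join(f"<start_of_turn>{o}<end_of_turn>" for o in outputs)
    return f"{description}\n{turns}"

def role_prompt(description: str, outputs: list[str]) -> str:
    # Role-tagged variant: explicit description/output turns.
    parts = [f"<start_of_turn>description\n{description}<end_of_turn>"]
    parts += [f"<start_of_turn>output\n{o}<end_of_turn>" for o in outputs]
    return "\n".join(parts)

print(base_prompt("DESCRIPTION", ["OUTPUT1", "OUTPUT2"]))
```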
@@ -281,7 +281,85 @@ Output:
  ```

  A few tips and tricks:
- - Do not expect the model to do multi-turn chats. It is designed to be stateless and to treat each data point as "exchangeable" (roughly i.i.d.).
  - If all you want is one reasonable answer, a chat model is likely a better fit. However, if you want to generate many reasonable answers / diverse examples, this model is a better fit.
  - The model is quite good at perspective taking / steering if you provide many examples.
  - The model is reasonably good at expressing epistemic uncertainty over unsure outputs by sampling several times.
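The last tip (estimating uncertainty by sampling several times) can be sketched generically; the stub sampler below stands in for decoding completions from the model:

```python
import random
from collections import Counter

def empirical_distribution(sample_fn, n=1000, seed=0):
    # Draw n samples and turn counts into empirical probabilities.
    rng = random.Random(seed)
    counts = Counter(sample_fn(rng) for _ in range(n))
    return {outcome: count / n for outcome, count in counts.items()}

# Stub standing in for sampling the model at temperature > 0;
# with the real model you would decode n completions instead.
probs = empirical_distribution(lambda rng: rng.choice(["rock", "paper", "scissors"]))
print(probs)
```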
+ - Additionally, the model can act as a chat model, with hopefully more diverse outputs!
+
+ ### Chat-specific
+
+ Additionally, the model can be used directly as a chat model, with some initial evidence that it behaves similarly to the original chat model but with slightly more diverse outputs. For example, here are two prompts, along with next-token probabilities for chat12b vs. google/gemma-3-12b-it:
+
+ User message: `Let's play rock paper scissors! I'll play at the same time — try to beat me. Return just rock, paper, or scissors`
+
+ Top 10 probabilities for google/gemma-3-12b-it:
+ ```
+ 1. 'paper' -> 0.8609
+ 2. 'scissors' -> 0.1098
+ 3. 'Scissors' -> 0.0164
+ 4. 'Paper' -> 0.0129
+ 5. 'Rock' -> 0.0000
+ 6. 'rock' -> 0.0000
+ 7. ' scissors' -> 0.0000
+ 8. ' paper' -> 0.0000
+ 9. '纸' -> 0.0000
+ 10. '纸' -> 0.0000
+ ```
+
+ Top 10 probabilities for the first tsor13/chat12b token:
+ ```
+ 1. 'scissors' -> 0.6375
+ 2. 'rock' -> 0.2188
+ 3. 'paper' -> 0.1354
+ 4. 'scissor' -> 0.0017
+ 5. 'Scissors' -> 0.0017
+ 6. 'Rock' -> 0.0015
+ 7. 'Paper' -> 0.0005
+ 8. 'sc' -> 0.0003
+ 9. 'stone' -> 0.0002
+ 10. 'I' -> 0.0001
+ ```
+
+ It's not perfect, but as you can see, the chat12b model puts at least 13% probability on each of rock, paper, and scissors, while the original model almost always chooses paper or scissors (rock gets essentially zero probability).
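Top-k lists like the ones above come from a softmax over the model's first-token logits. A generic sketch on a toy logit vector (real logits would come from a model forward pass, which is omitted here):

```python
import math

def topk_probs(logits, vocab, k=3):
    # Softmax over the logits, then return the k most probable tokens.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(zip(vocab, probs), key=lambda t: t[1], reverse=True)
    return ranked[:k]

# Toy logits, not real model outputs.
vocab = ["rock", "paper", "scissors", "stone"]
print(topk_probs([1.0, 0.5, 2.0, -1.0], vocab))
```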
+
+ User message: `What should I name my baby? Return just the name`
+
+ Top 10 probabilities for google/gemma-3-12b-it:
+ ```
+ 1. 'Ele' -> 0.5388
+ 2. 'Hazel' -> 0.1768
+ 3. 'Aurora' -> 0.1122
+ 4. 'El' -> 0.0687
+ 5. 'Olivia' -> 0.0380
+ 6. 'The' -> 0.0148
+ 7. 'E' -> 0.0123
+ 8. 'Am' -> 0.0109
+ 9. 'Willow' -> 0.0082
+ 10. 'Leo' -> 0.0033
+ ```
+
+ Top 10 probabilities for the first tsor13/chat12b token:
+ ```
+ 1. 'Leo' -> 0.0477
+ 2. 'Olivia' -> 0.0411
+ 3. 'Liam' -> 0.0347
+ 4. 'Oliver' -> 0.0280
+ 5. 'E' -> 0.0257
+ 6. 'James' -> 0.0239
+ 7. 'Alice' -> 0.0221
+ 8. 'A' -> 0.0214
+ 9. 'Henry' -> 0.0214
+ 10. 'Luna' -> 0.0206
+ ```
+
+ Again, not perfect, but the chat model spreads its probability mass over many more names, unlike the original instruct model, which puts over 50% of its probability on a name starting with "Ele".
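One way to quantify "spreads its probability mass" is the Shannon entropy of the two top-10 lists above (renormalized over the listed tokens; tail mass is ignored, so this is only a rough diversity measure):

```python
import math

def entropy(probs):
    # Shannon entropy in nats of a renormalized probability list.
    total = sum(probs)
    return -sum((p / total) * math.log(p / total) for p in probs if p > 0)

# Top-10 next-token probabilities from the comparison above.
gemma_top10 = [0.5388, 0.1768, 0.1122, 0.0687, 0.0380,
               0.0148, 0.0123, 0.0109, 0.0082, 0.0033]
chat_top10 = [0.0477, 0.0411, 0.0347, 0.0280, 0.0257,
              0.0239, 0.0221, 0.0214, 0.0214, 0.0206]

# The chat model's top-10 is far closer to uniform over the listed names.
print(entropy(gemma_top10), entropy(chat_top10))
```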
+
+ Finally, the custom tokenizer also has a function to convert from the description/input/output format to the system/user/assistant format, which can be used to chat with the model directly. For example:
+
+ ```
+ messages = [
+     {"role": "description", "content": "You are a helpful assistant who outputs the requested content."},
+     {"role": "input", "content": "A poem about a shark"},
+ ]
+ tokenizer.messages_to_chat_messages(messages)
+ ```
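The authoritative conversion is the shipped `tokenizer.messages_to_chat_messages`; as a rough illustration only, the mapping the method's name describes might look like this (hypothetical re-implementation; role names come from the text above):

```python
# Hypothetical sketch of the described conversion; the shipped
# tokenizer.messages_to_chat_messages is the authoritative version.
ROLE_MAP = {"description": "system", "input": "user", "output": "assistant"}

def messages_to_chat_messages(messages):
    # Rename each role while leaving the content untouched.
    return [{"role": ROLE_MAP[m["role"]], "content": m["content"]} for m in messages]

messages = [
    {"role": "description", "content": "You are a helpful assistant who outputs the requested content."},
    {"role": "input", "content": "A poem about a shark"},
]
print(messages_to_chat_messages(messages))
```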