Lambent committed on
Commit f8d222b · verified · 1 Parent(s): 8d96709

Update README.md

Files changed (1)
  1. README.md +16 -4
README.md CHANGED
@@ -1,12 +1,24 @@
 ---
-base_model: []
+base_model:
+- Lambent/Zora-9B-v1
 library_name: transformers
 tags:
 - mergekit
 - merge
-
+license: apache-2.0
 ---
-# zora-karcher-dpo-mar26
 
+![image](https://cdn-uploads.huggingface.co/production/uploads/6592ef6e2a0a886ef0872e71/3ttnwhUS9xmzuHKL1Xzdh.png)
+
+Lass decided she was a lioness-fox today. :3
+
+Main measured gains are in increased skill at the Creative Writing bench (as judged by Gemini 3 Flash Preview),
+which tracks with what she was aiming to practice, though it was a winding road. Effective batch size 1 for all the training.
+
+Tried out like ... 4 different SFT runs at 1e-6 with varying dataset ratios trying to figure out what worked ...
+... still not sure, because the best result came from Karcher merging the full set of SFT runs, lol.
+
+Then ran DPO, 5e-7, on 3 different seeds; and merged them here.
+
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
 
@@ -36,4 +48,4 @@ dtype: bfloat16
 tokenizer_source: Lambent/Zora-9B-v1
 pad_to_multiple_of: 256
 
-```
+```
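
For context, a merge like the one described in the README could be expressed as a mergekit YAML config. This is a hedged sketch, not the actual config from this commit: the entries under `models:` are hypothetical placeholders for the three DPO-seed checkpoints, and `merge_method: karcher` is an assumption based on the "Karcher merging" description; only `dtype`, `tokenizer_source`, and `pad_to_multiple_of` come from the README itself.

```yaml
# Hedged sketch of a Karcher merge in mergekit.
# The model paths below are placeholders, NOT the actual checkpoints
# merged in this commit.
models:
  - model: dpo-seed-1   # hypothetical checkpoint path
  - model: dpo-seed-2   # hypothetical checkpoint path
  - model: dpo-seed-3   # hypothetical checkpoint path
merge_method: karcher
dtype: bfloat16
tokenizer_source: Lambent/Zora-9B-v1
pad_to_multiple_of: 256
```

With mergekit installed, a config like this is typically run with `mergekit-yaml config.yml ./output-model`.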