---
base_model: []
library_name: transformers
tags:
- mergekit
- merge
license: apache-2.0
---

# Antler 7B Evolve

<img src="https://huggingface.co/Elizezen/Antler-7B/resolve/main/OIG3.UAjshTXCEJU.jpg" alt="drawing" style="width:512px;"/>

## Model Description

This is a merge of pre-trained language models created with [mergekit](https://github.com/cg123/mergekit) using **Evolutionary Model Merging**.

It is generally better than Antler-7B at writing novels, especially at maintaining context, though it can fall short of the original model in eroticism. It also tends to generate EOS tokens quite early, which I'm currently working on improving.

## Example

### Input

```
ใใฎๆ—ฅใฎๆ˜ผไธ‹ใŒใ‚Šใ€‚็งใจใ‚ใ‚„ใฏใŠๆƒใ„ใฎๆตด่กฃใ‚’่บซใซ็บใ„ใ€ๅ†ฌ็ฅญใ‚Šใ‚’ๆฅฝใ—ใ‚€ใŸใ‚ใซ็”บใธใจ็นฐใ‚Šๅ‡บใ—ใฆใ„ใŸใ€‚ใ‚€ใ‚ใ‚“ใ€ๅฟ่€…ใฎ็ด ๆ€งใ‚’้š ใ™ใŸใ‚ใซ็š†ๅค‰่ฃ…ใ—ใฆใ„ใ‚‹ใ€‚
ๆ™ฎๆฎต็€ๆ…ฃใ‚Œใชใ„ๆœ่ฃ…ใฎใŸใ‚ๅฐ‘ใ€…่ฝใก็€ใ‹ใชใ„ใ‚‚ใฎใฎใ€ๅธฏใŒ็ทฉใ‚“ใ ใ‚Š็€ๅดฉใ‚Œใ—ใชใ„ใ‚ˆใ†ๆ„่ญ˜ใ—ใชใŒใ‚‰ๆญฉใใ€‚ใ™ใ‚‹ใจใ‚ใ‚„ใ‚‚ๅŒใ˜ใ‚ˆใ†ใซใŽใ“ใกใชใๆญฉใ„ใฆใ„ใ‚‹ใฎใŒๅˆ†ใ‹ใฃใŸใ€‚
ใ‚„ใŒใฆ่ก—ไธฆใฟใฏๆดปๆฐ—ใซๆบ€ใกๅง‹ใ‚ใ€้“่กŒใไบบใ€…ใฎ่ณ‘ใ‚„ใ‹ใชๅฃฐใŒ่žใ“ใˆใฆใใ‚‹ใ€‚
ๅบƒๅ ดใซๅˆฐ็€ใ™ใ‚‹ใจใ€ใใ“ใฏๅคงๅ‹ขใฎไบบใ€…ใงใซใŽใ‚ใ„ใ€่‰ฒใจใ‚Šใฉใ‚Šใฎๆ็ฏใŒ่พบใ‚Šใ‚’็…งใ‚‰ใ—ใฆใ„ใŸใ€‚ๆง˜ใ€…ใชๅ‡บๅบ—ใŒไธฆใณใ€ๅคงๅ‹ขใฎๅญไพ›้”ใจใใฎ่ฆชๅพกใ•ใ‚“ใŒ้ง†ใ‘ๅ›žใฃใฆใ„ใ‚‹ใ€‚
ๅบƒๅ ดใฎไธญๅคฎไป˜่ฟ‘ใซใ‚ใ‚‹่ˆžๅฐใงใฏๅ‚ฌใ—็‰ฉใŒ้–‹ๅ‚ฌใ•ใ‚ŒใฆใŠใ‚Šใ€ๅคช้ผ“ใฎ้Ÿณใซๅˆใ‚ใ›ใฆๆญŒใฃใฆใ„ใ‚‹ๆผ”่€…ใŸใกใŒใ„ใŸใ€‚
ใ€Œใ‚ใ๏ฝžใ€ใใ‚Œใ„๏ผใ€
็›ฎใ‚’่ผใ‹ใ›ใฆ่พบใ‚Šใ‚’่ฆ‹ๅ›žใ™ใ‚ใ‚„ใ€‚ใ“ใ†ใ—ใฆใฟใ‚‹ใจๅนด็›ธๅฟœใฎๅญใฉใ‚‚ใซ่ฆ‹ใˆใ‚‹ใ€‚
ใ€Œใ“ใ‚‰ใ€ๅ‹ๆ‰‹ใซ่ตฐใ‚Šๅ›žใ‚‰ใชใ„ใ€
ใ€Œใˆใธใธ๏ฝžใ”ใ‚ใ‚“ใชใ•ใ„ใ€
ใŸใ—ใชใ‚ใ‚‰ใ‚ŒใชใŒใ‚‰ใ‚‚ใ€้ก”ใฏ็ถปใ‚“ใงใ„ใ‚‹ๆง˜ๅญใ‹ใ‚‰ใ‚‚ๅˆ†ใ‹ใ‚‹้€šใ‚Šใ€ๅฝผๅฅณใ‚‚ๆฅฝใ—ใฟใซใ—ใฆใ„ใ‚‹ใฎใฏๆ˜Žใ‚‰ใ‹ใ ใ‚ใ†ใ€‚
ใ‚ใ‚„ใŒๆฅฝใ—ใใ†ใ ใจใ€็งใ‚‚ๅฌ‰ใ—ใ„ใ€‚ไธๆ€่ญฐใชใ‚‚ใฎใ ใชใ€‚ไปŠใพใงใ“ใ‚“ใชๆฐ—ๆŒใกใซใชใฃใŸใ“ใจใฏใชใ‹ใฃใŸใ€‚
ๆ€ใ‚ใš็งใพใง็ฌ‘้ก”ใซใชใฃใฆใ—ใพใ†ใ€‚
ใ€Œใ“ใฎใกใ‚ƒใ‚“ใ€ใ“ใฎใกใ‚ƒใ‚“๏ผ่ฆ‹ใฆ๏ผใ“ใฎใกใ‚‡ใ†ใกใ‚“ๅฏๆ„›ใ„๏ฝž๏ผใ€
ใ€ŒใŠใŠใ€
ๅฝผๅฅณใซ่ข–ใ‚’ๅผ•ใ‹ใ‚Œใฆ้€ฃใ‚Œใฆ่กŒใ‹ใ‚ŒใŸใฎใฏใ€ๅฏๆ„›ใ‚‰ใ—ใ„ๅ‹•็‰ฉใฎ็ตตๆŸ„ใŒใ‚ใ—ใ‚‰ใ‚ใ‚ŒใŸ่กŒ็‡ˆใ€‚ใฉใ†ใ‚„ใ‚‰ๅฃฒๅบ—ใŒๅ‡บใ—ใฆใ„ใ‚‹ใ‚‰ใ—ใ„ใ€‚
ๅฑ‹ๅฐใงใฏ
```

### Output

```
ๅฑ‹ๅฐใงใฏ็„ผใใใฐใ‚„็„ผใ้ณฅใชใฉใฎๅฎš็•ชใจไธฆใ‚“ใงใ€ๅ†ฌ้™ๅฎšใฎ้‹ๅฐ‚้–€ๅบ—ใชใฉใ‚‚ๅ‡บใฆใ„ใŸใ€‚
ใ€Œใ‚ใ‚„ใฏใชใซใ‹้ฃŸในใŸใ„ใ‚‚ใฎใ‚ใ‚‹ใ‹๏ผŸใ€
ใ€Œใˆใฃใจใญ๏ฝžใ€ใŠๅฅฝใฟ็„ผใใจใ‹โ€ฆโ€ฆใ€
ใ€ŒใŠๅฅฝใฟ็„ผใ๏ผŸใ€
็งใฏๅˆใ‚ใฆ่žใๆ–™็†ๅใซ้ฆ–ใ‚’ๅ‚พใ’ใ‚‹ใ€‚ใ‚ใ‚„ใฏ็›ฎใ‚’ใ‚ญใƒฉใ‚ญใƒฉใ•ใ›ใชใŒใ‚‰ใ€ใใฎๆ–™็†ใซใคใ„ใฆ่ชฌๆ˜Žใ—ใฆใใ‚ŒใŸใ€‚
ๅฐ้บฆ็ฒ‰ใฎ็”Ÿๅœฐใซใ‚ญใƒฃใƒ™ใƒ„ใ‚„่ฑš่‚‰ใ€ๅคฉใ‹ใ™ใ€ใใ—ใฆใŠๅฅฝใฟ็„ผใใ‚ฝใƒผใ‚นใ‚’ใ‹ใ‘ใฆ็„ผใ„ใŸใ€ๅคง้˜ชๅ็‰ฉใฎๆ–™็†ใ‚‰ใ—ใ„ใ€‚
ใ€Œใใ‚ŒใฏใพใŸ้ข็™ฝใใ†ใชใ‚‚ใฎใ ใชใ€‚ใงใฏใใ“ใฎๅฑ‹ๅฐใซ่กŒใฃใฆใฟใ‚ˆใ†ใ€
็ง้”ใฏ็›ฎๆ˜Ÿใ‚’ใคใ‘ใŸๅฑ‹ๅฐใธๅ‘ใ‹ใ†ใ“ใจใซใ—ใŸใ€‚
ใŠๅฅฝใฟ็„ผใใฎๅฑ‹ๅฐใฏใ€ไบˆๆƒณไปฅไธŠใฎ็››ๆณใถใ‚Šใง่กŒๅˆ—ใŒใงใใฆใ„ใŸใ€‚ใ—ใ‹ใ—ใ€ไธฆใถใ“ใจ30ๅˆ†ใปใฉใง็งใŸใกใฎ็•ชใŒใ‚„ใฃใฆใใ‚‹ใ€‚
ใ€ŒใŠใฐใกใ‚ƒใ‚“ใ€ใ“ใ‚Œใใ ใ•ใ„ใ€
ใ€Œใ‚ใ„ใ‚ˆ๏ผใกใ‚‡ใฃใจๅพ…ใฃใฆใช๏ผใ€
ๅฑ‹ๅฐใฎใŠใฐใกใ‚ƒใ‚“ใฏๅจๅ‹ขใฎใ„ใ„ๅฃฐใง่ฟ”ไบ‹ใ‚’ใ™ใ‚‹ใจใ€ๆ‰‹ๆ…ฃใ‚ŒใŸๆง˜ๅญใง้‰„ๆฟใฎไธŠใงใŠๅฅฝใฟ็„ผใใ‚’็„ผใไธŠใ’ใ‚‹ใ€‚
ใ€Œใ“ใ‚ŒใŒใŠๅฅฝใฟ็„ผใใ ใ‚ˆใ€
ๅ‡บๆฅไธŠใŒใฃใŸใŠๅฅฝใฟ็„ผใใ‚’ๆ‰‹ใซใ—ใŸใ‚ใ‚„ใŒใ€ใ†ใฃใจใ‚Šใจใ—ใŸๆง˜ๅญใงใใ†่จ€ใฃใŸใ€‚
ใ€Œใปใ†ใ€‚่ฆ‹ใ‚‹ใ‹ใ‚‰ใซ็พŽๅ‘ณใ—ใใ†ใ ใ€
็งใ‚‚ใใฎ่‰ฒๅˆใ„ใซ่ช˜ใ‚ใ‚Œใ‚‹ใ‚ˆใ†ใซใ—ใฆใ€ไธ€ๅฃ้ ฌ
```

### Intended Use

The model is mainly intended for generating novels. It may not perform as well on instruction-following tasks.
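As a minimal sketch of how such a merged model could be used for novel continuation with 🤗 Transformers (the repo id below is hypothetical, and the sampling settings are only a starting point; `min_new_tokens` is one way to push back against the early-EOS tendency noted above):

```python
# Minimal generation sketch -- MODEL_ID is a hypothetical placeholder,
# substitute the actual Hugging Face repo id or a local path.
MODEL_ID = "Elizezen/Antler-7B-Evolve"

# Sampling settings for free-form novel continuation. min_new_tokens
# forces a minimum continuation length, mitigating premature EOS.
GEN_KWARGS = dict(
    max_new_tokens=256,
    min_new_tokens=64,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    repetition_penalty=1.1,
)


def continue_story(prompt: str) -> str:
    """Continue a Japanese prose prompt with the merged model."""
    # Heavy imports kept local so the settings above can be inspected
    # without loading torch/transformers.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, **GEN_KWARGS)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```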


## Merge Details
### Merge Method

This model was merged using the [DARE](https://arxiv.org/abs/2311.03099)-[TIES](https://arxiv.org/abs/2306.01708) (`dare_ties`) merge method, with `evol_merge_storage\input_models\Antler7B_2159541861` as the base.

### Models Merged

The following models were included in the merge:
* `evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917`
* `evol_merge_storage\input_models\antler-starling-08_4074283220`
* `evol_merge_storage\input_models\Phos7b-RP_654656604`

### Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model: evol_merge_storage\input_models\Antler7B_2159541861
dtype: bfloat16
merge_method: dare_ties
parameters:
  int8_mask: 1.0
  normalize: 1.0
slices:
- sources:
  - layer_range: [0, 8]
    model: evol_merge_storage\input_models\Phos7b-RP_654656604
    parameters:
      density: 0.584107666175788
      weight: 0.47231634419785595
  - layer_range: [0, 8]
    model: evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917
    parameters:
      density: 0.9357007414387093
      weight: 0.25531843586626907
  - layer_range: [0, 8]
    model: evol_merge_storage\input_models\antler-starling-08_4074283220
    parameters:
      density: 0.9750447748820433
      weight: 0.4753247646722287
  - layer_range: [0, 8]
    model: evol_merge_storage\input_models\Antler7B_2159541861
- sources:
  - layer_range: [8, 16]
    model: evol_merge_storage\input_models\Phos7b-RP_654656604
    parameters:
      density: 0.8802238329444649
      weight: 0.4482746205621599
  - layer_range: [8, 16]
    model: evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917
    parameters:
      density: 1.0
      weight: 0.5524329574915081
  - layer_range: [8, 16]
    model: evol_merge_storage\input_models\antler-starling-08_4074283220
    parameters:
      density: 1.0
      weight: 0.22634815425570032
  - layer_range: [8, 16]
    model: evol_merge_storage\input_models\Antler7B_2159541861
- sources:
  - layer_range: [16, 24]
    model: evol_merge_storage\input_models\Phos7b-RP_654656604
    parameters:
      density: 0.9921437573982935
      weight: 0.44636209472148164
  - layer_range: [16, 24]
    model: evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917
    parameters:
      density: 0.8757091247914811
      weight: 0.15431351637040108
  - layer_range: [16, 24]
    model: evol_merge_storage\input_models\antler-starling-08_4074283220
    parameters:
      density: 0.8667200206865777
      weight: 0.37827962987746055
  - layer_range: [16, 24]
    model: evol_merge_storage\input_models\Antler7B_2159541861
- sources:
  - layer_range: [24, 32]
    model: evol_merge_storage\input_models\Phos7b-RP_654656604
    parameters:
      density: 0.966615155256828
      weight: 0.5041762338947331
  - layer_range: [24, 32]
    model: evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917
    parameters:
      density: 1.0
      weight: 0.22555101554235693
  - layer_range: [24, 32]
    model: evol_merge_storage\input_models\antler-starling-08_4074283220
    parameters:
      density: 0.7616963147939114
      weight: 0.397020374822854
  - layer_range: [24, 32]
    model: evol_merge_storage\input_models\Antler7B_2159541861
```
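
The final merge recipe above should be reproducible with mergekit's CLI, assuming mergekit is installed and the referenced input models are available at the listed local paths (the output directory name is a placeholder):

```shell
# Install mergekit (assumed to be available from PyPI)
pip install mergekit

# Re-run the merge described by the YAML above; save config.yaml first.
mergekit-yaml config.yaml ./antler-7b-evolve-out
```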