---
base_model: []
library_name: transformers
tags:
- mergekit
- merge
license: apache-2.0
---
# Antler 7B Evolve
<img src="https://huggingface.co/Elizezen/Antler-7B/resolve/main/OIG3.UAjshTXCEJU.jpg" alt="drawing" style="width:512px;"/>
## Model Description
This is a merge of pre-trained language models created with [mergekit](https://github.com/cg123/mergekit) using **Evolutionary Model Merging**.
It is generally better than Antler-7B at writing novels, especially at maintaining context, but can fall short of the original model in eroticism. It also tends to emit EOS tokens quite early, which I'm currently working on improving.
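Evolutionary Model Merging searches over merge hyperparameters (the per-model weights and densities listed in the Merge Details section) instead of hand-tuning them. The toy hill-climbing sketch below illustrates the idea only; the stand-in fitness function and its target values are made up for this example, whereas a real run scores each candidate merged checkpoint on an evaluation set:

```python
import numpy as np

def evolve_weights(fitness, n_params, generations=40, pop=16, sigma=0.1, seed=0):
    """Toy (1+lambda) evolutionary search over a vector of merge weights."""
    rng = np.random.default_rng(seed)
    best = np.full(n_params, 1.0 / n_params)  # start from a uniform merge
    best_score = fitness(best)
    for _ in range(generations):
        # mutate the current best genome into a population of candidates
        cands = np.clip(best + sigma * rng.standard_normal((pop, n_params)), 0.0, 1.0)
        for cand in cands:
            score = fitness(cand)
            if score > best_score:  # keep only improvements
                best, best_score = cand, score
    return best, best_score

# Stand-in fitness with a known optimum, so the search is verifiable;
# a real run would instead evaluate the merged model on benchmark tasks.
target = np.array([0.47, 0.26, 0.48])
fitness = lambda w: -np.sum((w - target) ** 2)
weights, _ = evolve_weights(fitness, n_params=3)
```

Real pipelines use stronger optimizers such as CMA-ES, but the loop structure (propose merge recipes, score them, keep the best) is the same.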
## Example
### Input
```
ใใฎๆฅใฎๆผไธใใใ็งใจใใใฏใๆใใฎๆตด่กฃใ่บซใซ็บใใๅฌ็ฅญใใๆฅฝใใใใใซ็บใธใจ็นฐใๅบใใฆใใใใใใใๅฟ่ใฎ็ด ๆงใ้ ใใใใซ็ๅค่ฃใใฆใใใ
ๆฎๆฎต็ๆฃใใชใๆ่ฃใฎใใๅฐใ่ฝใก็ใใชใใใฎใฎใๅธฏใ็ทฉใใ ใ็ๅดฉใใใชใใใๆ่ญใใชใใๆญฉใใใใใจใใใๅใใใใซใใใกใชใๆญฉใใฆใใใฎใๅใใฃใใ
ใใใฆ่กไธฆใฟใฏๆดปๆฐใซๆบใกๅงใใ้่กใไบบใใฎ่ณใใใชๅฃฐใ่ใใใฆใใใ
ๅบๅ ดใซๅฐ็ใใใจใใใใฏๅคงๅขใฎไบบใใงใซใใใใ่ฒใจใใฉใใฎๆ็ฏใ่พบใใ็งใใใฆใใใๆงใใชๅบๅบใไธฆใณใๅคงๅขใฎๅญไพ้ใจใใฎ่ฆชๅพกใใใ้งใๅใฃใฆใใใ
ๅบๅ ดใฎไธญๅคฎไป่ฟใซใใ่ๅฐใงใฏๅฌใ็ฉใ้ๅฌใใใฆใใใๅคช้ผใฎ้ณใซๅใใใฆๆญใฃใฆใใๆผ่ใใกใใใใ
ใใใ๏ฝใใใใ๏ผใ
็ฎใ่ผใใใฆ่พบใใ่ฆๅใใใใใใใใฆใฟใใจๅนด็ธๅฟใฎๅญใฉใใซ่ฆใใใ
ใใใใๅๆใซ่ตฐใๅใใชใใ
ใใใธใธ๏ฝใใใใชใใใ
ใใใชใใใใชใใใใ้กใฏ็ถปใใงใใๆงๅญใใใๅใใ้ใใๅฝผๅฅณใๆฅฝใใฟใซใใฆใใใฎใฏๆใใใ ใใใ
ใใใๆฅฝใใใใ ใจใ็งใๅฌใใใไธๆ่ญฐใชใใฎใ ใชใไปใพใงใใใชๆฐๆใกใซใชใฃใใใจใฏใชใใฃใใ
ๆใใ็งใพใง็ฌ้กใซใชใฃใฆใใพใใ
ใใใฎใกใใใใใฎใกใใ๏ผ่ฆใฆ๏ผใใฎใกใใใกใๅฏๆใ๏ฝ๏ผใ
ใใใใ
ๅฝผๅฅณใซ่ขใๅผใใใฆ้ฃใใฆ่กใใใใฎใฏใๅฏๆใใใๅ็ฉใฎ็ตตๆใใใใใใใ่ก็ใใฉใใใๅฃฒๅบใๅบใใฆใใใใใใ
ๅฑๅฐใงใฏ
```
### Output
```
ๅฑๅฐใงใฏ็ผใใใฐใ็ผใ้ณฅใชใฉใฎๅฎ็ชใจไธฆใใงใๅฌ้ๅฎใฎ้ๅฐ้ๅบใชใฉใๅบใฆใใใ
ใใใใฏใชใซใ้ฃในใใใใฎใใใ๏ผใ
ใใใฃใจใญ๏ฝใใๅฅฝใฟ็ผใใจใโฆโฆใ
ใใๅฅฝใฟ็ผใ๏ผใ
็งใฏๅใใฆ่ใๆ็ๅใซ้ฆใๅพใใใใใใฏ็ฎใใญใฉใญใฉใใใชใใใใใฎๆ็ใซใคใใฆ่ชฌๆใใฆใใใใ
ๅฐ้บฆ็ฒใฎ็ๅฐใซใญใฃใใใ่ฑ่ใๅคฉใใใใใใฆใๅฅฝใฟ็ผใใฝใผในใใใใฆ็ผใใใๅคง้ชๅ็ฉใฎๆ็ใใใใ
ใใใใฏใพใ้ข็ฝใใใชใใฎใ ใชใใงใฏใใใฎๅฑๅฐใซ่กใฃใฆใฟใใใ
็ง้ใฏ็ฎๆใใคใใๅฑๅฐใธๅใใใใจใซใใใ
ใๅฅฝใฟ็ผใใฎๅฑๅฐใฏใไบๆณไปฅไธใฎ็ๆณใถใใง่กๅใใงใใฆใใใใใใใไธฆใถใใจ30ๅใปใฉใง็งใใกใฎ็ชใใใฃใฆใใใ
ใใใฐใกใใใใใใใ ใใใ
ใใใใ๏ผใกใใฃใจๅพใฃใฆใช๏ผใ
ๅฑๅฐใฎใใฐใกใใใฏๅจๅขใฎใใๅฃฐใง่ฟไบใใใใจใๆๆฃใใๆงๅญใง้ๆฟใฎไธใงใๅฅฝใฟ็ผใใ็ผใไธใใใ
ใใใใใๅฅฝใฟ็ผใใ ใใ
ๅบๆฅไธใใฃใใๅฅฝใฟ็ผใใๆใซใใใใใใใใฃใจใใจใใๆงๅญใงใใ่จใฃใใ
ใใปใใ่ฆใใใใซ็พๅณใใใใ ใ
็งใใใฎ่ฒๅใใซ่ชใใใใใใซใใฆใไธๅฃ้ ฌ
```
### Intended Use
The model is mainly intended for generating novels. It may not perform as well on instruction-following tasks.
## Merge Details
### Merge Method
This model was merged using the [DARE](https://arxiv.org/abs/2311.03099)-[TIES](https://arxiv.org/abs/2306.01708) merge method, with `evol_merge_storage\input_models\Antler7B_2159541861` as the base.
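As a rough illustration of what the method does (a simplified sketch, not mergekit's actual implementation): DARE randomly drops part of each model's parameter delta against the base and rescales the survivors, and TIES elects a per-parameter sign across models and discards contributions that disagree, before the weighted deltas are added back to the base. Note that the `density` values in the config below are the fraction of entries *kept*, i.e. the drop probability is `1 - density`:

```python
import numpy as np

def dare(delta, p, rng):
    # DARE: Drop a random fraction p of the delta's entries And REscale
    # the survivors by 1/(1-p), preserving the expected update magnitude.
    mask = rng.random(delta.shape) >= p
    return delta * mask / (1.0 - p)

def ties_sign_consensus(deltas):
    # TIES sign election: per parameter, pick the sign with the largest
    # total mass across models, then zero contributions that disagree.
    stacked = np.stack(deltas)
    elected = np.sign(stacked.sum(axis=0))
    return [d * (np.sign(d) == elected) for d in stacked]

def dare_ties_merge(base, models, weights, densities, rng):
    # Merge = base + weighted sum of sparsified, sign-consistent deltas.
    deltas = [dare(m - base, 1.0 - d, rng) for m, d in zip(models, densities)]
    deltas = ties_sign_consensus(deltas)
    return base + sum(w * d for w, d in zip(weights, deltas))
```

In the real merge these operations run per tensor over full model checkpoints; the `weight` and `density` hyperparameters per model and layer slice are exactly what the evolutionary search tuned.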
### Models Merged
The following models were included in the merge:
* `evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917`
* `evol_merge_storage\input_models\antler-starling-08_4074283220`
* `evol_merge_storage\input_models\Phos7b-RP_654656604`
### Configuration
The following YAML configuration was used to produce this model:
```yaml
base_model: evol_merge_storage\input_models\Antler7B_2159541861
dtype: bfloat16
merge_method: dare_ties
parameters:
  int8_mask: 1.0
  normalize: 1.0
slices:
- sources:
  - layer_range: [0, 8]
    model: evol_merge_storage\input_models\Phos7b-RP_654656604
    parameters:
      density: 0.584107666175788
      weight: 0.47231634419785595
  - layer_range: [0, 8]
    model: evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917
    parameters:
      density: 0.9357007414387093
      weight: 0.25531843586626907
  - layer_range: [0, 8]
    model: evol_merge_storage\input_models\antler-starling-08_4074283220
    parameters:
      density: 0.9750447748820433
      weight: 0.4753247646722287
  - layer_range: [0, 8]
    model: evol_merge_storage\input_models\Antler7B_2159541861
- sources:
  - layer_range: [8, 16]
    model: evol_merge_storage\input_models\Phos7b-RP_654656604
    parameters:
      density: 0.8802238329444649
      weight: 0.4482746205621599
  - layer_range: [8, 16]
    model: evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917
    parameters:
      density: 1.0
      weight: 0.5524329574915081
  - layer_range: [8, 16]
    model: evol_merge_storage\input_models\antler-starling-08_4074283220
    parameters:
      density: 1.0
      weight: 0.22634815425570032
  - layer_range: [8, 16]
    model: evol_merge_storage\input_models\Antler7B_2159541861
- sources:
  - layer_range: [16, 24]
    model: evol_merge_storage\input_models\Phos7b-RP_654656604
    parameters:
      density: 0.9921437573982935
      weight: 0.44636209472148164
  - layer_range: [16, 24]
    model: evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917
    parameters:
      density: 0.8757091247914811
      weight: 0.15431351637040108
  - layer_range: [16, 24]
    model: evol_merge_storage\input_models\antler-starling-08_4074283220
    parameters:
      density: 0.8667200206865777
      weight: 0.37827962987746055
  - layer_range: [16, 24]
    model: evol_merge_storage\input_models\Antler7B_2159541861
- sources:
  - layer_range: [24, 32]
    model: evol_merge_storage\input_models\Phos7b-RP_654656604
    parameters:
      density: 0.966615155256828
      weight: 0.5041762338947331
  - layer_range: [24, 32]
    model: evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917
    parameters:
      density: 1.0
      weight: 0.22555101554235693
  - layer_range: [24, 32]
    model: evol_merge_storage\input_models\antler-starling-08_4074283220
    parameters:
      density: 0.7616963147939114
      weight: 0.397020374822854
  - layer_range: [24, 32]
    model: evol_merge_storage\input_models\Antler7B_2159541861
``` |