Update README.md
README.md
CHANGED
|
@@ -34,7 +34,9 @@ This is a merge of pre-trained language models created using [mergekit](https://
|
|
| 34 |
This model was merged using the [DARE-Linear](https://developer.nvidia.com/blog/an-introduction-to-model-merging-for-llms/#:~:text=that%20did%20not.-,DARE,DARE%20derives%20from%20the%20following:) (Drop And REscale, Linear) merging method, with [medalpaca-7b](https://huggingface.co/medalpaca/medalpaca-7b) as a base.
|
| 35 |
|
| 36 |
* **DARE-Linear** (or DARE-Task Arithmetic) is the variant where the DARE-processed (sparsified and rescaled) delta parameters are merged using simple linear weighted averaging.
|
| 37 |
-
* The final merged model is obtained by adding the weighted sum of the sparsified task vectors back to the base model
|
| 38 |
|
| 39 |
### Models Merged
|
| 40 |
|
| 34 |
This model was merged using the [DARE-Linear](https://developer.nvidia.com/blog/an-introduction-to-model-merging-for-llms/#:~:text=that%20did%20not.-,DARE,DARE%20derives%20from%20the%20following:) (Drop And REscale, Linear) merging method, with [medalpaca-7b](https://huggingface.co/medalpaca/medalpaca-7b) as a base.
|
| 35 |
|
| 36 |
* **DARE-Linear** (or DARE-Task Arithmetic) is the variant where the DARE-processed (sparsified and rescaled) delta parameters are merged using simple linear weighted averaging.
|
| 37 |
+
* The final merged model is obtained by adding the weighted sum of the sparsified task vectors back to the base model:
|
| 38 |
+
θ_merged = θ_base + Σ [ α_i * DARE(θ_i - θ_base, p) ],
|
| 39 |
+
where DARE(τ, p) denotes the operation of dropping parameters with probability p and rescaling the rest by 1/(1-p).
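
The merge rule above can be illustrated with a minimal pure-Python sketch. It treats each model as a flat list of floats, and the names `dare` and `dare_linear_merge` are hypothetical helpers chosen for this example. This is not mergekit's implementation, which operates on tensors and may differ in details such as weight normalization.

```python
import random


def dare(delta, p, rng):
    """Drop each entry of a task vector with probability p and
    rescale the surviving entries by 1 / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else d * scale for d in delta]


def dare_linear_merge(base, finetuned, weights, p, seed=0):
    """Return base + sum_i weights[i] * DARE(finetuned[i] - base, p).

    base      -- flat list of base-model parameters
    finetuned -- list of flat parameter lists, one per fine-tuned model
    weights   -- one merge weight (alpha_i) per fine-tuned model
    """
    rng = random.Random(seed)  # seeded so the random drops are reproducible
    merged = list(base)
    for theta_i, alpha_i in zip(finetuned, weights):
        # Task vector: difference between fine-tuned and base parameters.
        delta = [t - b for t, b in zip(theta_i, base)]
        sparse = dare(delta, p, rng)
        merged = [m + alpha_i * s for m, s in zip(merged, sparse)]
    return merged


if __name__ == "__main__":
    base = [1.0, 2.0]
    finetuned = [[2.0, 2.0], [1.0, 4.0]]
    print(dare_linear_merge(base, finetuned, weights=[0.5, 0.5], p=0.5))
```

With p = 0 nothing is dropped and the rule reduces to plain task arithmetic. The 1/(1-p) rescaling keeps the expected value of each sparsified task vector equal to the original one.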
|
| 40 |
|
| 41 |
### Models Merged
|
| 42 |
|