| --- |
| base_model: |
| - CrucibleLab/L3.3-70B-Loki-V2.0 |
| - meta-llama/Llama-3.1-70B |
| - schonsense/Tropoplectic |
| library_name: transformers |
| tags: |
| - mergekit |
| - merge |
|
|
| --- |
| # Bragi3 |
|
|
| Too sloppy for my tastes. |
|
|
|
|
| This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit). |
|
|
| ## Merge Details |
| ### Merge Method |
|
|
| This model was merged using the NuSLERP merge method using [meta-llama/Llama-3.1-70B](https://huggingface.co/meta-llama/Llama-3.1-70B) as a base. |
|
|
| ### Models Merged |
|
|
| The following models were included in the merge: |
| * [CrucibleLab/L3.3-70B-Loki-V2.0](https://huggingface.co/CrucibleLab/L3.3-70B-Loki-V2.0) |
| * [schonsense/Tropoplectic](https://huggingface.co/schonsense/Tropoplectic) |
|
|
| ### Configuration |
|
|
| The following YAML configuration was used to produce this model: |
|
|
| ```yaml |
| models: |
| - model: CrucibleLab/L3.3-70B-Loki-V2.0 |
| parameters: |
| weight: |
| - filter: q_proj |
| value: [0.80, 0.30, 0.30, 0.30, 0.8] |
| - filter: k_proj |
| value: [0.70, 0.20, 0.20, 0.20, 0.7] |
| - filter: v_proj |
| value: [0.80, 0.40, 0.40, 0.40, 0.8] |
| - filter: o_proj |
| value: [0.90, 0.80, 0.80, 0.80, 0.9] |
| - filter: gate_proj |
| value: [0.80, 0.20, 0.20, 0.20, 0.8] |
| - filter: up_proj |
| value: [0.80, 0.30, 0.30, 0.30, 0.8] |
| - filter: down_proj |
| value: [0.90, 0.80, 0.80, 0.80, 0.9] |
| - filter: lm_head |
| value: 0.95 |
| - value: 1 |
| |
| |
| |
| - model: schonsense/Tropoplectic |
| parameters: |
| weight: |
| - filter: q_proj |
| value: [0.20, 0.70, 0.70, 0.70, 0.2] |
| - filter: k_proj |
| value: [0.30, 0.80, 0.80, 0.80, 0.3] |
| - filter: v_proj |
| value: [0.20, 0.60, 0.60, 0.60, 0.2] |
| - filter: o_proj |
| value: [0.10, 0.25, 0.25, 0.25, 0.1] |
| - filter: gate_proj |
| value: [0.20, 0.80, 0.80, 0.80, 0.2] |
| - filter: up_proj |
| value: [0.20, 0.70, 0.70, 0.70, 0.2] |
| - filter: down_proj |
| value: [0.10, 0.25, 0.25, 0.25, 0.1] |
| - filter: lm_head |
| value: 0.05 |
| - value: 0 |
| |
| base_model: meta-llama/Llama-3.1-70B |
| merge_method: nuslerp |
| |
| parameters: |
| normalize: false |
| int8_mask: false |
| rescale: false |
| |
| dtype: float32 |
| out_dtype: bfloat16 |
| |
| chat_template: llama3 |
| tokenizer: |
| source: union |
| pad_to_multiple_of: 8 |
| |
| ``` |
|
|