| --- |
| base_model: |
| - Orenguteng/Llama-3.1-8B-Lexi-Uncensored-V2 |
| - unsloth/Meta-Llama-3.1-8B-Instruct |
| - unsloth/Llama-3.1-Storm-8B |
| - arcee-ai/Llama-3.1-SuperNova-Lite |
| - VAGOsolutions/Llama-3.1-SauerkrautLM-8b-Instruct |
| library_name: transformers |
| tags: |
| - mergekit |
| - merge |
| model-index: |
| - name: ZEUS-8B-V22 |
|   results: |
|   - task: |
|       type: text-generation |
|       name: Text Generation |
|     dataset: |
|       name: IFEval (0-Shot) |
|       type: wis-k/instruction-following-eval |
|       split: train |
|       args: |
|         num_few_shot: 0 |
|     metrics: |
|     - type: inst_level_strict_acc and prompt_level_strict_acc |
|       value: 79.95 |
|       name: averaged accuracy |
|     source: |
|       url: >- |
|         https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=T145%2FZEUS-8B-V22 |
|       name: Open LLM Leaderboard |
|   - task: |
|       type: text-generation |
|       name: Text Generation |
|     dataset: |
|       name: BBH (3-Shot) |
|       type: SaylorTwift/bbh |
|       split: test |
|       args: |
|         num_few_shot: 3 |
|     metrics: |
|     - type: acc_norm |
|       value: 32.21 |
|       name: normalized accuracy |
|     source: |
|       url: >- |
|         https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=T145%2FZEUS-8B-V22 |
|       name: Open LLM Leaderboard |
|   - task: |
|       type: text-generation |
|       name: Text Generation |
|     dataset: |
|       name: MATH Lvl 5 (4-Shot) |
|       type: lighteval/MATH-Hard |
|       split: test |
|       args: |
|         num_few_shot: 4 |
|     metrics: |
|     - type: exact_match |
|       value: 20.24 |
|       name: exact match |
|     source: |
|       url: >- |
|         https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=T145%2FZEUS-8B-V22 |
|       name: Open LLM Leaderboard |
|   - task: |
|       type: text-generation |
|       name: Text Generation |
|     dataset: |
|       name: GPQA (0-shot) |
|       type: Idavidrein/gpqa |
|       split: train |
|       args: |
|         num_few_shot: 0 |
|     metrics: |
|     - type: acc_norm |
|       value: 10.4 |
|       name: acc_norm |
|     source: |
|       url: >- |
|         https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=T145%2FZEUS-8B-V22 |
|       name: Open LLM Leaderboard |
|   - task: |
|       type: text-generation |
|       name: Text Generation |
|     dataset: |
|       name: MuSR (0-shot) |
|       type: TAUR-Lab/MuSR |
|       args: |
|         num_few_shot: 0 |
|     metrics: |
|     - type: acc_norm |
|       value: 9.37 |
|       name: acc_norm |
|     source: |
|       url: >- |
|         https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=T145%2FZEUS-8B-V22 |
|       name: Open LLM Leaderboard |
|   - task: |
|       type: text-generation |
|       name: Text Generation |
|     dataset: |
|       name: MMLU-PRO (5-shot) |
|       type: TIGER-Lab/MMLU-Pro |
|       config: main |
|       split: test |
|       args: |
|         num_few_shot: 5 |
|     metrics: |
|     - type: acc |
|       value: 32.64 |
|       name: accuracy |
|     source: |
|       url: >- |
|         https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=T145%2FZEUS-8B-V22 |
|       name: Open LLM Leaderboard |
| license: llama3.1 |
| --- |
| # ZEUS 8B 🌩️ V22 |
|
|
| This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit). |
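The front matter lists `transformers` as the library, so loading should follow the usual text-generation pipeline pattern. The snippet below is a minimal sketch, not a tested recipe. It assumes the repository ID is `T145/ZEUS-8B-V22` (inferred from the leaderboard links on this card, not stated explicitly) and that `torch`, `transformers`, and `accelerate` are installed. `build_messages` and `generate` are hypothetical helper names.

```python
# Assumed repository ID, inferred from the leaderboard links on this card.
MODEL_ID = "T145/ZEUS-8B-V22"


def build_messages(user_prompt, system_prompt="You are a helpful assistant."):
    """Build a chat-style message list for the Llama 3.1 chat template."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


def generate(user_prompt, max_new_tokens=256):
    """Download the model (several GB) and generate one reply."""
    # Imported lazily so build_messages works without the heavy dependencies.
    import torch
    from transformers import pipeline

    pipe = pipeline(
        "text-generation",
        model=MODEL_ID,
        model_kwargs={"torch_dtype": torch.bfloat16},  # matches dtype in the merge config
        device_map="auto",  # requires the accelerate package
    )
    outputs = pipe(build_messages(user_prompt), max_new_tokens=max_new_tokens)
    # With chat input, generated_text is the message list; the last entry is the reply.
    return outputs[0]["generated_text"][-1]["content"]


# Example call (commented out because it triggers a large download):
# print(generate("Summarize what a model merge is in two sentences."))
```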
|
|
| ## Merge Details |
| ### Merge Method |
|
|
| This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, with [unsloth/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/unsloth/Meta-Llama-3.1-8B-Instruct) as the base model. |
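To give a feel for what DARE TIES does, here is a simplified sketch on flat Python lists. It is an illustration of the general idea under stated assumptions, not mergekit's implementation. DARE randomly drops entries of each model's difference from the base and rescales the survivors. TIES then elects a sign per parameter and keeps only the contributions that agree with it. The function names `dare_prune` and `dare_ties_merge` are hypothetical, and real merges operate on tensors rather than lists.

```python
import random


def dare_prune(delta, density, rng):
    """DARE step: keep each entry with probability `density`, rescale by 1/density."""
    return [d / density if rng.random() < density else 0.0 for d in delta]


def dare_ties_merge(base, finetuned, weights, densities, seed=145, normalize=True):
    """Merge several fine-tuned parameter vectors into `base` (simplified sketch)."""
    rng = random.Random(seed)
    # Task vectors: what each fine-tuned model changed relative to the base.
    deltas = [[f - b for f, b in zip(model, base)] for model in finetuned]
    pruned = [dare_prune(d, p, rng) for d, p in zip(deltas, densities)]

    merged = []
    for i, b in enumerate(base):
        contribs = [w * d[i] for w, d in zip(weights, pruned)]
        # TIES-style sign election: the sign of the weighted sum wins.
        sign = 1.0 if sum(contribs) >= 0 else -1.0
        # Keep only non-zero contributions that agree with the elected sign.
        kept = [(w, c) for w, c in zip(weights, contribs) if c * sign > 0]
        delta = sum(c for _, c in kept)
        if normalize and kept:
            # Divide by the total weight of the models that actually contributed.
            delta /= sum(w for w, _ in kept)
        merged.append(b + delta)
    return merged


# Toy example with two "models" and three parameters (values are made up).
base = [0.0, 1.0, -1.0]
models = [[0.5, 1.2, -1.4], [0.3, 0.7, -1.1]]
print(dare_ties_merge(base, models, weights=[0.6, 0.4], densities=[0.9, 0.9]))
```

With densities above 0.9, as in this merge, only a small fraction of each difference vector is dropped, so the result stays close to a sign-filtered weighted average of the fine-tuned models.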
|
|
| ### Models Merged |
|
|
| The following models were included in the merge: |
| * [Orenguteng/Llama-3.1-8B-Lexi-Uncensored-V2](https://huggingface.co/Orenguteng/Llama-3.1-8B-Lexi-Uncensored-V2) |
| * [unsloth/Llama-3.1-Storm-8B](https://huggingface.co/unsloth/Llama-3.1-Storm-8B) |
| * [arcee-ai/Llama-3.1-SuperNova-Lite](https://huggingface.co/arcee-ai/Llama-3.1-SuperNova-Lite) |
| * [VAGOsolutions/Llama-3.1-SauerkrautLM-8b-Instruct](https://huggingface.co/VAGOsolutions/Llama-3.1-SauerkrautLM-8b-Instruct) |
|
|
| ### Configuration |
|
|
| The following YAML configuration was used to produce this model: |
|
|
| ```yaml |
| base_model: unsloth/Meta-Llama-3.1-8B-Instruct |
| dtype: bfloat16 |
| merge_method: dare_ties |
| parameters: |
|   int8_mask: 1.0 |
|   normalize: 1.0 |
|   random_seed: 145.0 |
| slices: |
| - sources: |
|   - layer_range: [0, 32] |
|     model: unsloth/Llama-3.1-Storm-8B |
|     parameters: |
|       density: 0.94 |
|       weight: 0.35 |
|   - layer_range: [0, 32] |
|     model: arcee-ai/Llama-3.1-SuperNova-Lite |
|     parameters: |
|       density: 0.92 |
|       weight: 0.26 |
|   - layer_range: [0, 32] |
|     model: VAGOsolutions/Llama-3.1-SauerkrautLM-8b-Instruct |
|     parameters: |
|       density: 0.91 |
|       weight: 0.2 |
|   - layer_range: [0, 32] |
|     model: Orenguteng/Llama-3.1-8B-Lexi-Uncensored-V2 |
|     parameters: |
|       density: 0.93 |
|       weight: 0.19 |
|   - layer_range: [0, 32] |
|     model: unsloth/Meta-Llama-3.1-8B-Instruct |
| tokenizer: |
|   tokens: |
|     <|begin_of_text|>: |
|       force: true |
|       source: unsloth/Meta-Llama-3.1-8B-Instruct |
|     <|eot_id|>: |
|       force: true |
|       source: unsloth/Meta-Llama-3.1-8B-Instruct |
|     <|finetune_right_pad_id|>: |
|       force: true |
|       source: unsloth/Meta-Llama-3.1-8B-Instruct |
| ``` |
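The per-model numbers in the configuration can be read off with a few lines of Python. This sketch assumes the usual reading of the mergekit parameters: `density` is the fraction of each model's differences from the base that is retained, and `weight` is the model's relative contribution. The values are copied from the YAML above; the variable names are illustrative only.

```python
# (model, density, weight) copied from the configuration above.
sources = [
    ("unsloth/Llama-3.1-Storm-8B", 0.94, 0.35),
    ("arcee-ai/Llama-3.1-SuperNova-Lite", 0.92, 0.26),
    ("VAGOsolutions/Llama-3.1-SauerkrautLM-8b-Instruct", 0.91, 0.20),
    ("Orenguteng/Llama-3.1-8B-Lexi-Uncensored-V2", 0.93, 0.19),
]

# The four weights already sum to 1 (up to float rounding).
total_weight = sum(weight for _, _, weight in sources)
print(f"total weight: {total_weight:.2f}")

for name, density, weight in sources:
    share = weight / total_weight   # relative contribution of this model
    dropped = 1.0 - density         # fraction of delta entries DARE discards
    rescale = 1.0 / density         # factor applied to the surviving entries
    print(f"{name}: share={share:.0%}, dropped={dropped:.0%}, rescale={rescale:.3f}")
```

The base model appears in `sources` without `density` or `weight`, which is why it is not listed here.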
|
|
| # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard) |
| Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/T145__ZEUS-8B-V22-details)! |
| Summarized results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/contents/viewer/default/train?q=T145%2FZEUS-8B-V22&sort[column]=Average%20%E2%AC%86%EF%B8%8F&sort[direction]=desc)! |
|
|
| | Metric |Value (%)| |
| |-------------------|--------:| |
| |**Average** | 30.80| |
| |IFEval (0-Shot) | 79.95| |
| |BBH (3-Shot) | 32.21| |
| |MATH Lvl 5 (4-Shot)| 20.24| |
| |GPQA (0-shot) | 10.40| |
| |MuSR (0-shot) | 9.37| |
| |MMLU-PRO (5-shot) | 32.64| |
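The **Average** row is consistent with an unweighted mean of the six benchmark scores, as this short check shows (scores copied from the table above):

```python
# Benchmark scores (%) copied from the results table.
scores = {
    "IFEval (0-Shot)": 79.95,
    "BBH (3-Shot)": 32.21,
    "MATH Lvl 5 (4-Shot)": 20.24,
    "GPQA (0-shot)": 10.40,
    "MuSR (0-shot)": 9.37,
    "MMLU-PRO (5-shot)": 32.64,
}

# Unweighted mean over all six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 30.80
```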