---
base_model: [codellama/CodeLlama-70b-Instruct-hf]
tags:
- mergekit
- merge
- code
license: mit
pipeline_tag: conversational
---
# BigCodeLLama 92b LFG 🚀

## Experimental 92B CodeLlama frankenmerge to see how it benchmarks
### Models Merged

The following models were merged with the base `codellama/CodeLlama-70b-Instruct-hf`:
* ../CodeLlama-70b-Python-hf
* ../CodeLlama-70b-Instruct-hf

### Configuration

The following YAML configuration was used to produce this model:

```yaml
dtype: bfloat16
merge_method: passthrough
slices:
- sources:
  - layer_range: [0, 69]
    model:
      model:
        path: ../CodeLlama-70b-Instruct-hf
- sources:
  - layer_range: [42, 80]
    model:
      model:
        path: ../CodeLlama-70b-Python-hf
```
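A passthrough merge simply stacks the listed layer ranges end to end, so the merged depth can be sanity-checked with a few lines of arithmetic. A minimal sketch using the slice bounds from the config above (the list literal is illustrative, not mergekit API):

```python
# Slice definitions copied from the YAML config above.
# layer_range bounds are half-open [start, end), as in mergekit configs.
slices = [
    ("../CodeLlama-70b-Instruct-hf", (0, 69)),   # layers 0..68
    ("../CodeLlama-70b-Python-hf", (42, 80)),    # layers 42..79
]

# Passthrough concatenates the ranges, so depths just add up.
total_layers = sum(end - start for _, (start, end) in slices)
print(total_layers)  # 69 + 38 = 107 layers, vs. 80 in the 70B base
```

Note that the two ranges overlap: layers 42–68 are included once from each parent, which is how the merge grows past the base model's 80 layers to roughly 107/80 of the 70B parameter count.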
GGUF quants are available here: https://huggingface.co/nisten/BigCodeLlama-92b-GGUF