---
license: other
license_name: deepseek
license_link: https://github.com/deepseek-ai/DeepSeek-Coder/blob/main/LICENSE-MODEL
---
DeepMagic-Coder-7b

Alternate version:
- https://huggingface.co/rombodawg/DeepMagic-Coder-7b-Alt
|
|
This is a very successful merge of the deepseek-coder-6.7b-instruct and Magicoder-S-DS-6.7B models, bringing an uplift in overall coding performance without compromising either model's integrity (at least in limited testing).
|
|
This is the first of my models to use MergeKit's *task_arithmetic* merge method. The method is described below, and it is clearly very useful for merging AI models that were fine-tuned from a common base:
|
|
Task Arithmetic:
```
Computes "task vectors" for each model by subtracting a base model.
Merges the task vectors linearly and adds back the base.
Works great for models that were fine-tuned from a common ancestor.
Also a super useful mental framework for several of the more involved
merge methods.
```
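The idea above can be sketched in a few lines of plain Python. This is an illustrative toy, not MergeKit's implementation: real merges operate on full model state dicts of tensors, while the dictionary of floats and the parameter name below are made up for demonstration.

```python
# Toy sketch of the task_arithmetic merge method:
# task vector = fine-tuned weights - base weights;
# merged = base + weighted sum of task vectors.

def task_arithmetic_merge(base, finetuned_models, weights, normalize=True):
    """Merge fine-tunes of a common base model via weighted task vectors."""
    if normalize:
        # Normalize weights so they sum to 1 (mirrors `normalize: true`).
        total = sum(weights)
        weights = [w / total for w in weights]
    merged = {}
    for name, base_val in base.items():
        # Each task vector captures what one fine-tune changed vs. the base.
        delta = sum(w * (model[name] - base_val)
                    for model, w in zip(finetuned_models, weights))
        merged[name] = base_val + delta
    return merged

# Hypothetical single-parameter "models" for illustration.
base = {"layer.weight": 1.0}
model_a = {"layer.weight": 1.5}   # task vector: +0.5
model_b = {"layer.weight": 0.8}   # task vector: -0.2
merged = task_arithmetic_merge(base, [model_a, model_b], weights=[1, 1])
print(merged["layer.weight"])  # 1.0 + (0.5 - 0.2) / 2 = 1.15
```

Because both fine-tunes share the same ancestor, their task vectors can be added without the weights drifting far from a region the base model understands.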
|
|
The original models used in this merge can be found here:

- https://huggingface.co/ise-uiuc/Magicoder-S-DS-6.7B
- https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct
|
|
|
|
The merge was created using MergeKit, and the parameters can be found below:
```yaml
models:
  - model: deepseek-ai_deepseek-coder-6.7b-instruct
    parameters:
      weight: 1
  - model: ise-uiuc_Magicoder-S-DS-6.7B
    parameters:
      weight: 1
merge_method: task_arithmetic
base_model: ise-uiuc_Magicoder-S-DS-6.7B
parameters:
  normalize: true
  int8_mask: true
dtype: float16
```