---
base_model:
- rootxhacker/mini-Llama-70M-SFT-math
- rootxhacker/mini-Llama-70M-SFT
- rootxhacker/mini-Llama-70M-SFT-ifeval
- rootxhacker/mini-Llama-70M-SFT-COT
- rootxhacker/mini-Llama-70M-SFT-medical
- rootxhacker/mini-Llama-70M-SFT-v2
- rootxhacker/mini-Llama-70M-SFT-code
library_name: transformers
tags:
- mergekit
- merge
---

# merge

This is a merge of pre-trained language models created with [mergekit](https://github.com/cg123/mergekit).

## Merge Details

### Merge Method

This model was merged using the Passthrough merge method.

### Models Merged

The following models were included in the merge:

* [rootxhacker/mini-Llama-70M-SFT-math](https://huggingface.co/rootxhacker/mini-Llama-70M-SFT-math)
* [rootxhacker/mini-Llama-70M-SFT](https://huggingface.co/rootxhacker/mini-Llama-70M-SFT)
* [rootxhacker/mini-Llama-70M-SFT-ifeval](https://huggingface.co/rootxhacker/mini-Llama-70M-SFT-ifeval)
* [rootxhacker/mini-Llama-70M-SFT-COT](https://huggingface.co/rootxhacker/mini-Llama-70M-SFT-COT)
* [rootxhacker/mini-Llama-70M-SFT-medical](https://huggingface.co/rootxhacker/mini-Llama-70M-SFT-medical)
* [rootxhacker/mini-Llama-70M-SFT-v2](https://huggingface.co/rootxhacker/mini-Llama-70M-SFT-v2)
* [rootxhacker/mini-Llama-70M-SFT-code](https://huggingface.co/rootxhacker/mini-Llama-70M-SFT-code)

### Configuration

The following YAML configuration was used to produce this model (note that mergekit's `layer_range` is half-open, so `[0, 5]` selects layers 0–4):

```yaml
slices:
  - sources:
      - model: rootxhacker/mini-Llama-70M-SFT-v2 # core reasoning
        layer_range: [0, 5] # layers 0-4 (5 layers)
  - sources:
      - model: rootxhacker/mini-Llama-70M-SFT-COT
        layer_range: [0, 5] # layers 0-4 (5 layers)
  - sources:
      - model: rootxhacker/mini-Llama-70M-SFT-medical
        layer_range: [0, 5]
  - sources:
      - model: rootxhacker/mini-Llama-70M-SFT-code
        layer_range: [0, 5]
  - sources:
      - model: rootxhacker/mini-Llama-70M-SFT-math
        layer_range: [0, 5]
  - sources:
      - model: rootxhacker/mini-Llama-70M-SFT-ifeval
        layer_range: [0, 4] # layers 0-3 (4 layers)
  - sources:
      - model: rootxhacker/mini-Llama-70M-SFT-v2
        layer_range: [0, 4] # layers 0-3 (4 layers)
  - sources:
      - model: rootxhacker/mini-Llama-70M-SFT
        layer_range: [0, 4] # layers 0-3 (4 layers)
merge_method: passthrough
dtype: bfloat16
```
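Unlike averaging-based merge methods, passthrough concatenates the selected layer slices end-to-end into one deeper model (a "frankenmerge"), so the merged model's depth is simply the sum of the slice lengths. A quick sanity check of the arithmetic implied by the config above (plain Python; model names abbreviated for readability):

```python
# Each entry mirrors one slice from the YAML config above.
# mergekit's layer_range is half-open: [0, 5] selects layers 0-4 (5 layers).
slices = [
    ("SFT-v2",      (0, 5)),
    ("SFT-COT",     (0, 5)),
    ("SFT-medical", (0, 5)),
    ("SFT-code",    (0, 5)),
    ("SFT-math",    (0, 5)),
    ("SFT-ifeval",  (0, 4)),
    ("SFT-v2",      (0, 4)),
    ("SFT",         (0, 4)),
]

# Passthrough stacks the slices, so depth is the sum of slice lengths.
total_layers = sum(end - start for _, (start, end) in slices)
print(total_layers)  # 37 = 5 slices x 5 layers + 3 slices x 4 layers
```

So the merge produces a 37-layer model from 6-layer donors, with each donor's early layers repeated in sequence.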