---
base_model:
- rootxhacker/mini-Llama-70M-SFT-ifeval
- rootxhacker/mini-Llama-70M-SFT
- rootxhacker/mini-Llama-70M-SFT-medical
- rootxhacker/mini-Llama-70M-SFT-math
- rootxhacker/mini-Llama-70M-SFT-v2
- rootxhacker/mini-Llama-70M-SFT-code
- rootxhacker/mini-Llama-70M-SFT-COT
library_name: transformers
tags:
- mergekit
- merge
---

# merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details

### Merge Method

This model was merged using the Passthrough merge method.

### Models Merged

The following models were included in the merge:

* [rootxhacker/mini-Llama-70M-SFT-ifeval](https://huggingface.co/rootxhacker/mini-Llama-70M-SFT-ifeval)
* [rootxhacker/mini-Llama-70M-SFT](https://huggingface.co/rootxhacker/mini-Llama-70M-SFT)
* [rootxhacker/mini-Llama-70M-SFT-medical](https://huggingface.co/rootxhacker/mini-Llama-70M-SFT-medical)
* [rootxhacker/mini-Llama-70M-SFT-math](https://huggingface.co/rootxhacker/mini-Llama-70M-SFT-math)
* [rootxhacker/mini-Llama-70M-SFT-v2](https://huggingface.co/rootxhacker/mini-Llama-70M-SFT-v2)
* [rootxhacker/mini-Llama-70M-SFT-code](https://huggingface.co/rootxhacker/mini-Llama-70M-SFT-code)
* [rootxhacker/mini-Llama-70M-SFT-COT](https://huggingface.co/rootxhacker/mini-Llama-70M-SFT-COT)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
slices:
  - sources:
      - model: rootxhacker/mini-Llama-70M-SFT-v2
        layer_range: [0, 5]  # layers 0-4 (the range end is exclusive)
  - sources:
      - model: rootxhacker/mini-Llama-70M-SFT-COT
        layer_range: [0, 4]  # layers 0-3
  - sources:
      - model: rootxhacker/mini-Llama-70M-SFT-medical
        layer_range: [0, 4]
  - sources:
      - model: rootxhacker/mini-Llama-70M-SFT-code
        layer_range: [0, 4]
  - sources:
      - model: rootxhacker/mini-Llama-70M-SFT-math
        layer_range: [0, 4]
  - sources:
      - model: rootxhacker/mini-Llama-70M-SFT-ifeval
        layer_range: [0, 4]
  - sources:
      - model: rootxhacker/mini-Llama-70M-SFT-v2
        layer_range: [0, 4]
  - sources:
      - model: rootxhacker/mini-Llama-70M-SFT
        layer_range: [0, 3]
merge_method: passthrough
dtype: bfloat16
```

Because the passthrough method stacks the slices in order, the merged model is deeper than any single source: 5 + (6 × 4) + 3 = 32 layers in total.
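
To reproduce a merge like this, the standard mergekit CLI workflow is a two-step sketch (the file name `config.yml` and output path `./merged-model` are placeholders, not part of this card):

```shell
# Install mergekit, then run the passthrough merge.
pip install mergekit

# Save the YAML configuration above as config.yml, then:
mergekit-yaml config.yml ./merged-model
```

The output directory will contain the merged weights in `safetensors` format along with the config files needed to load it with `transformers`.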
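
Since the card declares `library_name: transformers`, the merged model can be loaded with the standard `transformers` API. A minimal sketch, assuming the merge has been published under a repo id (`REPO_ID` below is a placeholder, not the actual location of this model):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder -- replace with the repo id where this merge is published.
REPO_ID = "your-username/your-merged-model"

tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
# The merge was produced in bfloat16, so load it in the same dtype.
model = AutoModelForCausalLM.from_pretrained(REPO_ID, torch_dtype=torch.bfloat16)

prompt = "Question: What is 2 + 2?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that passthrough merges change the model's depth, so quality should be checked empirically: stacking layers from differently fine-tuned checkpoints does not guarantee coherent outputs.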