---
# mergekit passthrough merge: stacks layer slices from several fine-tuned
# mini-Llama-70M variants into a single deeper model, in the listed order.
#
# NOTE(review): mergekit's layer_range is half-open [start, end), so
# [0, 5] selects layers 0-4 (5 layers) and [0, 4] selects layers 0-3
# (4 layers). The original comments claimed "Full 6 layers" for [0, 5]
# and "5 layers" for [0, 4]; if those counts were the intent, the ranges
# should be [0, 6] and [0, 5] — confirm against the source models'
# num_hidden_layers before running the merge. Ranges left unchanged here
# to preserve behavior.
slices:
  - sources:
      - model: rootxhacker/mini-Llama-70M-SFT-v2  # Core reasoning
        layer_range: [0, 5]
  - sources:
      - model: rootxhacker/mini-Llama-70M-SFT-COT
        layer_range: [0, 5]
  - sources:
      - model: rootxhacker/mini-Llama-70M-SFT-medical
        layer_range: [0, 5]
  - sources:
      - model: rootxhacker/mini-Llama-70M-SFT-code
        layer_range: [0, 5]
  - sources:
      - model: rootxhacker/mini-Llama-70M-SFT-math
        layer_range: [0, 5]
  - sources:
      - model: rootxhacker/mini-Llama-70M-SFT-ifeval
        layer_range: [0, 4]
  - sources:
      - model: rootxhacker/mini-Llama-70M-SFT-v2
        layer_range: [0, 4]
  - sources:
      - model: rootxhacker/mini-Llama-70M-SFT
        layer_range: [0, 4]
# passthrough concatenates the slices verbatim (no weight averaging).
merge_method: passthrough
dtype: bfloat16