---
license: mit
---
# DUS Forty Layer Merged Model

## Overview

The DUS Forty Layer Merged Model uses a layer-interlocking strategy that combines decoder layers from the Llama-2-13B and Mistral-7B architectures into a single forty-layer model. The approach aims to balance computational efficiency with competitive performance across a range of natural language processing tasks.

## Model Details

- **Architecture**: Based on Llama-2-13B and Mistral-7B
- **Layer Arrangement**: The `forty` configuration merges layers from both models, interlocking layers 0–20 with layers 12–32; a sketch of this slicing follows the list.
- **Tokenizer**: The Mistral-7B tokenizer is used for encoding and decoding.
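
The arrangement can be illustrated with a short conceptual sketch. This is not the actual merge script: the card does not say which base model contributes which slice (the direction below is an assumption), and a real merge would also have to reconcile the differing hidden sizes of Llama-2-13B (5120) and Mistral-7B (4096), e.g. with a dedicated merging tool. Reading the ranges as half-open intervals, the two slices contribute 20 + 20 = 40 layers, matching the model's name.

```python
# Conceptual sketch of the forty-layer interlock, assuming half-open index
# ranges (layers 0-20 and 12-32 give 20 + 20 = 40 layers). Which base model
# contributes which slice is an assumption; the card does not specify it.
import torch.nn as nn
from transformers import AutoModelForCausalLM

llama = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-13b-hf")
mistral = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

# Slice the decoder stacks of the two base models.
bottom = list(llama.model.layers[0:20])   # layers 0-19 of Llama-2-13B
top = list(mistral.model.layers[12:32])   # layers 12-31 of Mistral-7B

# Stack the slices into a single forty-layer decoder. In practice the two
# slices have different hidden sizes, so this step needs extra alignment.
merged_layers = nn.ModuleList(bottom + top)
print(len(merged_layers))  # 40
```

The overlapping index ranges (depths 12–20 appear in both slices, once per base model) follow the depth up-scaling (DUS) pattern of repeating a band of middle depths to grow the total layer count.
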
## Training Details

- **Base Models**:
  - Llama-2-13B: [meta-llama/Llama-2-13b-hf](https://huggingface.co/meta-llama/Llama-2-13b-hf)
  - Mistral-7B: [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
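
## Usage

A minimal usage sketch, assuming the merged checkpoint is published as a standard `transformers` causal-LM repository. The repository id `your-org/dus-forty-layer-merged` is a hypothetical placeholder, and the tokenizer follows the card's note above (Mistral-7B).

```python
# Minimal generation example. "your-org/dus-forty-layer-merged" is a
# hypothetical placeholder for the actual merged-model repository id.
from transformers import AutoModelForCausalLM, AutoTokenizer

# The card specifies the Mistral-7B tokenizer for encoding and decoding.
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
model = AutoModelForCausalLM.from_pretrained("your-org/dus-forty-layer-merged")

inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```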