From original readme

tl;dr: AlphaMonarch-7B is a new DPO merge that retains all the reasoning abilities of the very best merges and significantly improves its conversational abilities. Kind of the best of both worlds in a 7B model. 🎉

AlphaMonarch-7B is a DPO fine-tuned of mlabonne/NeuralMonarch-7B using the argilla/OpenHermes2.5-dpo-binarized-alpha preference dataset.

It is based on a merge of the following models using LazyMergekit:

Special thanks to Jon Durbin, Intel, Argilla, and Teknium for the preference datasets.

🔍 Applications

This model uses a context window of 8k. I recommend using it with the Mistral Instruct chat template (works perfectly with LM Studio).

If you use SillyTavern, you might want to tweak the inference parameters. Here's what LM Studio uses as a reference: temp 0.8, top_k 40, top_p 0.95, min_p 0.05, repeat_penalty 1.1.

It is one of the very best 7B models in terms of instructing following and reasoning abilities and can be used for conversations, RP, and storytelling. Note that it tends to have a quite formal and sophisticated style, but it can be changed by modifying the prompt.

Downloads last month: 784

GGUF

Model size

7B params

Architecture

llama

Hardware compatibility

1-bit

2-bit

3-bit

4-bit

5-bit

6-bit

8-bit