issafuad/miqu

Use from the llama-cpp-python library
# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="issafuad/miqu",
	filename="*.gguf",  # placeholder: substitute the exact quant file you want (2-, 4-, or 5-bit)
)

response = llm.create_chat_completion(
	messages=[
		{"role": "user", "content": "Hello! Who are you?"},
	],
)
print(response["choices"][0]["message"]["content"])

miqu 70b

First model in a potential series.

Prompt format: Mistral

<s> [INST] QUERY_1 [/INST] ANSWER_1</s> [INST] QUERY_2 [/INST] ANSWER_2</s>...

Beware that some backends (like llama.cpp) already add the BOS token by default, so you don't need to prepend it yourself.
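
As a minimal sketch (assuming your backend adds BOS itself, as llama.cpp does by default), a multi-turn prompt in this format can be assembled like this; the example queries are placeholders:

# Minimal sketch: assemble the Mistral prompt format shown above.
# The leading <s> is omitted, assuming the backend adds BOS itself.
def build_prompt(history, query):
	# history: list of (query, answer) pairs from earlier turns
	prompt = ""
	for q, a in history:
		prompt += f" [INST] {q} [/INST] {a}</s>"
	prompt += f" [INST] {query} [/INST]"
	return prompt

print(build_prompt([("What is the capital of France?", "Paris.")], "And of Italy?"))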

Settings

DO NOT CHANGE ROPE SETTINGS. This model uses a high RoPE frequency base with 32k seen tokens, so it should be fine for most tasks.
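
In llama-cpp-python terms, that means leaving the RoPE arguments at their defaults so the values stored in the GGUF metadata are used. A sketch of the same load call as above with this made explicit (the glob filename is again a placeholder):

# Sketch: keep the RoPE settings baked into the GGUF metadata.
# rope_freq_base=0.0 / rope_freq_scale=0.0 mean "use the model's own values".
llm = Llama.from_pretrained(
	repo_id="issafuad/miqu",
	filename="*.gguf",   # placeholder: pick one specific quant file
	n_ctx=32768,         # the model has seen 32k tokens
	rope_freq_base=0.0,  # 0.0 = take the value from the model; do not override
	rope_freq_scale=0.0,
)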

Only tested with temp 1 and top_p 0.95, with everything else disabled.
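
Applied to create_chat_completion, those settings look roughly like the sketch below; top_k=0 and repeat_penalty=1.0 are my approximation of "everything else disabled", not values the card specifies:

# Sketch: the tested sampler settings, with other samplers neutralized.
response = llm.create_chat_completion(
	messages=[{"role": "user", "content": "Write a haiku about llamas."}],
	temperature=1.0,
	top_p=0.95,
	top_k=0,             # 0 disables top-k filtering
	repeat_penalty=1.0,  # 1.0 is neutral (no repetition penalty)
)
print(response["choices"][0]["message"]["content"])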

GGUF

Model size: 69B params
Architecture: llama
Quantizations available: 2-bit, 4-bit, 5-bit
