This is a quantized GGUF of mistralai/Mistral-Nemo-Instruct-2407. Requires llama.cpp newer than commit 50e0535 (7/22/2024) to run inference.
Currently, we only have a Q5_K quantization, which comes in at 8.73 GB. If you're interested in other quantizations, just ping me @iamlemec on Twitter.
To load the model with llama-cpp-python:

```python
# !pip install llama-cpp-python
from llama_cpp import Llama

# Download the Q5_K GGUF from the Hugging Face Hub and load it
llm = Llama.from_pretrained(
    repo_id="CompendiumLabs/mistral-nemo-instruct-2407-gguf",
    filename="mistral-nemo-instruct-q5_k.gguf",
)
```
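Once loaded, the model can be queried through llama-cpp-python's OpenAI-style chat completion API. A minimal sketch follows; the prompt content is illustrative, since the model card does not define an input example.

```python
# Run a chat completion. The message content below is an
# illustrative placeholder, not an official example.
response = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "Summarize the Mistral NeMo model in one sentence."}
    ]
)

# The reply text is in the first choice's message
print(response["choices"][0]["message"]["content"])
```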