The Mixtral-7x8B Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts. Mixtral-7x8B outperforms Llama 2 70B on most benchmarks we tested.
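For intuition about what "Sparse Mixture of Experts" means here, the toy sketch below routes each token to the top-2 of 8 experts, matching Mixtral's published design; the routing code itself is a simplified illustration, not Mistral's implementation:

```python
# Toy sketch of sparse Mixture-of-Experts routing (illustration only,
# not Mistral's implementation). Mixtral uses 8 experts per layer and
# routes each token to its top-2 experts.
import numpy as np

rng = np.random.default_rng(0)
n_experts, d_model, top_k = 8, 16, 2

# Each "expert" here is a random linear map standing in for an FFN block.
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts))  # gating weights

def moe_forward(x):
    """Route one token vector x to its top-2 experts and mix their outputs."""
    logits = x @ router                    # score every expert for this token
    top = np.argsort(logits)[-top_k:]      # indices of the top-2 experts
    weights = np.exp(logits[top])
    weights /= weights.sum()               # softmax over the selected experts only
    # Only the selected experts run; this sparsity keeps per-token compute
    # far below the cost of running all 8 expert FFNs.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(d_model)
print(moe_forward(token).shape)  # (16,)
```

Because only two of the eight expert FFNs run per token, Mixtral has the parameter count of a much larger dense model but roughly the per-token compute of a ~13B one.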
| No | Variant | Cortex CLI command |
|---|---|---|
| 1 | 7x8b-gguf | `cortex run mixtral:7x8b-gguf` |
Model repository: `cortexhub/mixtral`

Run the default variant with the Cortex CLI:

```bash
cortex run mixtral
```
We're not able to determine the quantization variants automatically for this repository. The snippet below loads `model.gguf` from `cortexso/mixtral` with llama-cpp-python:
```python
# !pip install llama-cpp-python
from llama_cpp import Llama

# Download model.gguf from the Hugging Face repository and load it.
llm = Llama.from_pretrained(
    repo_id="cortexso/mixtral",
    filename="model.gguf",
)

# Plain text completion; echo=True includes the prompt in the output.
output = llm(
    "Once upon a time,",
    max_tokens=512,
    echo=True
)
print(output)
```
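If you prefer an OpenAI-style chat interface over raw completion, llama-cpp-python also provides `create_chat_completion`. A minimal sketch, reusing the `llm` object from above; the prompt and parameters here are illustrative:

```python
# Chat-style inference with the same llm object (illustrative example).
response = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "Summarize what a Mixture of Experts is."}
    ],
    max_tokens=256,
)

# The reply text lives in the first choice's message content.
print(response["choices"][0]["message"]["content"])
```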