How to use alankessler/Mistral-Nemo-Instruct-2407-MLX-2bit with MLX:
# Download the model from the Hub (the quotes keep zsh from globbing the brackets)
pip install "huggingface_hub[hf_xet]"
huggingface-cli download --local-dir Mistral-Nemo-Instruct-2407-MLX-2bit alankessler/Mistral-Nemo-Instruct-2407-MLX-2bit
Mistral-Nemo-Instruct-2407-MLX-2bit
MLX quantized version of Mistral Nemo Instruct 2407.
Quantization
- Method: Q2 (2-bit integer quantization)
- Bits per weight: 2
- Details: Uniform 2-bit integer quantization with group size 64.
- Converted with: mlx-lm
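To make "uniform 2-bit quantization with group size 64" concrete, here is a minimal pure-Python sketch of group-wise affine quantization: each group of 64 weights shares one scale and one offset, and each weight is stored as an integer from 0 to 3. The names `quantize_group` and `dequantize_group` are illustrative and not part of mlx-lm; the actual MLX kernels pack the codes into 32-bit words and may differ in rounding details.

```python
import random

BITS = 2
GROUP_SIZE = 64
LEVELS = 2 ** BITS  # 4 representable codes: 0, 1, 2, 3


def quantize_group(weights):
    """Map one group of floats to integer codes plus a shared (scale, offset)."""
    lo, hi = min(weights), max(weights)
    # A constant group would give a zero scale, so fall back to 1.0.
    scale = (hi - lo) / (LEVELS - 1) or 1.0
    codes = [min(LEVELS - 1, max(0, round((w - lo) / scale))) for w in weights]
    return codes, scale, lo


def dequantize_group(codes, scale, offset):
    """Reconstruct approximate floats from the integer codes."""
    return [c * scale + offset for c in codes]


# Illustrative data only: random values standing in for one group of weights.
random.seed(0)
group = [random.gauss(0.0, 0.02) for _ in range(GROUP_SIZE)]

codes, scale, offset = quantize_group(group)
restored = dequantize_group(codes, scale, offset)
max_error = max(abs(a - b) for a, b in zip(group, restored))

print("codes used:", sorted(set(codes)))
print("max reconstruction error within half a step:", max_error <= scale / 2 + 1e-12)
```

With only four levels per group, the reconstruction error can reach half the group's step size, which is why 2-bit models trade noticeably more quality for size than 4-bit or 8-bit variants.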
Usage
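pip install mlx-lm

from mlx_lm import load, generate

# Fetches the weights from the Hub on first use, then loads them into memory.
model, tokenizer = load("alankessler/Mistral-Nemo-Instruct-2407-MLX-2bit")

# Wrap the user message in the model's chat template and return it as a string.
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hello!"}],
    add_generation_prompt=True,
    tokenize=False,
)

# Generate up to 512 new tokens and print the completion.
response = generate(model, tokenizer, prompt=prompt, max_tokens=512)
print(response)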
Base Model
- Model: Mistral Nemo Instruct 2407
- Parameters: 12B
- Architecture: Mistral Nemo
- License: Apache 2.0
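As a rough back-of-the-envelope check on what 2-bit quantization buys for a 12B-parameter model, the sketch below estimates weight storage. It assumes every parameter is quantized and that each group of 64 weights also stores one 16-bit scale and one 16-bit offset. Both are assumptions for illustration, so the real checkpoint size will differ, for instance if some tensors are kept in higher precision.

```python
PARAMS = 12e9                       # nominal parameter count of the base model
BITS = 2                            # bits per quantized weight
GROUP_SIZE = 64                     # weights sharing one scale and offset
OVERHEAD_BITS_PER_GROUP = 16 + 16   # assumed: 16-bit scale + 16-bit offset


def effective_bits_per_weight(bits, group_size, overhead_bits):
    """Bits per weight once the per-group metadata is amortized."""
    return bits + overhead_bits / group_size


def weight_gigabytes(params, bits_per_weight):
    """Storage for the weights alone, in decimal gigabytes."""
    return params * bits_per_weight / 8 / 1e9


bpw = effective_bits_per_weight(BITS, GROUP_SIZE, OVERHEAD_BITS_PER_GROUP)
print(f"effective bits per weight: {bpw}")                         # 2.5
print(f"2-bit estimate: {weight_gigabytes(PARAMS, bpw):.2f} GB")    # 3.75 GB
print(f"16-bit baseline: {weight_gigabytes(PARAMS, 16):.2f} GB")    # 24.00 GB
```

Under these assumptions the quantized weights take roughly a sixth of the 16-bit footprint; activations and the KV cache need additional memory at inference time.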
Model tree for alankessler/Mistral-Nemo-Instruct-2407-MLX-2bit
- Base model: mistralai/Mistral-Nemo-Base-2407
- Finetuned: mistralai/Mistral-Nemo-Instruct-2407