How to use alankessler/Mistral-Small-3.2-24B-Instruct-2506-MLX-8bit with MLX:
# Download the model from the Hub (the quotes keep zsh from expanding the brackets)
pip install "huggingface_hub[hf_xet]"
huggingface-cli download --local-dir Mistral-Small-3.2-24B-Instruct-2506-MLX-8bit alankessler/Mistral-Small-3.2-24B-Instruct-2506-MLX-8bit
Mistral-Small-3.2-24B-Instruct-2506-MLX-8bit
An 8-bit MLX quantization of Mistral Small 3.2 24B Instruct 2506.
Quantization
- Method: Q8 (8-bit integer quantization)
- Bits per weight: 8
- Details: Uniform 8-bit integer quantization with group size 64.
- Converted with: mlx-lm
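To make the quantization settings above concrete, here is a minimal pure-Python sketch of group-wise affine quantization and the storage cost it implies. It is a conceptual illustration, not the kernel that mlx-lm runs. The function names are hypothetical, and the 16-bit scale and bias stored per group are an assumption about the on-disk layout rather than something stated in this card.

```python
import math


def quantize_group(values, bits=8):
    """Affine-quantize one group of floats to unsigned integer codes."""
    levels = (1 << bits) - 1              # 255 distinct steps for 8-bit codes
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels or 1.0     # fall back to 1.0 for a constant group
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo               # lo acts as the per-group bias


def dequantize_group(codes, scale, bias):
    """Map integer codes back to approximate float values."""
    return [c * scale + bias for c in codes]


def effective_bits_per_weight(bits=8, group_size=64, scale_bits=16, bias_bits=16):
    """Code bits plus per-group metadata (ASSUMED: one 16-bit scale and bias per group)."""
    return bits + (scale_bits + bias_bits) / group_size


def estimated_size_gb(n_params, bits_per_weight):
    """Rough weight footprint in gigabytes (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# One group of 64 example weights, matching the group size listed above.
group = [math.sin(i) for i in range(64)]
codes, scale, bias = quantize_group(group)
restored = dequantize_group(codes, scale, bias)
max_err = max(abs(a - b) for a, b in zip(group, restored))

print(f"max round-trip error: {max_err:.6f} (half a step is {scale / 2:.6f})")
print(f"effective bits per weight: {effective_bits_per_weight()}")
print(f"estimated size for 24B params: {estimated_size_gb(24e9, effective_bits_per_weight())} GB")
```

Under these assumptions the per-group metadata adds 0.5 bits per weight, so a 24B-parameter model would need roughly 25.5 GB for weights alone. Any layers left unquantized would shift that figure, so treat it as an estimate when checking whether the model fits in unified memory.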
Usage
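# Install the MLX language-model package (shell)
pip install mlx-lm

# Generate a reply (Python)
from mlx_lm import load, generate

# Load the model and tokenizer from the Hub repository
model, tokenizer = load("alankessler/Mistral-Small-3.2-24B-Instruct-2506-MLX-8bit")

# Wrap the user message in the model's chat template
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hello!"}],
    add_generation_prompt=True,
    tokenize=False,
)

response = generate(model, tokenizer, prompt=prompt, max_tokens=512)
print(response)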
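The single-turn example above can be extended to a multi-turn conversation by carrying the message history forward between calls. The sketch below uses only the `load`/`generate` calls and the `apply_chat_template` arguments already shown. `build_messages` and `chat` are hypothetical helper names, and accepting a `system` role is an assumption about this model's chat template.

```python
def build_messages(user_text, history=None, system=None):
    """Assemble a message list in the role/content format used above."""
    messages = []
    if system:
        # ASSUMPTION: the model's chat template accepts a system message.
        messages.append({"role": "system", "content": system})
    messages.extend(history or [])
    messages.append({"role": "user", "content": user_text})
    return messages


def chat(model, tokenizer, user_text, history=None, system=None, max_tokens=512):
    """Run one turn and return the reply plus the updated message list."""
    # Imported lazily so build_messages can be used without mlx-lm installed.
    from mlx_lm import generate

    messages = build_messages(user_text, history, system)
    prompt = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=False,
    )
    reply = generate(model, tokenizer, prompt=prompt, max_tokens=max_tokens)
    # Append the assistant turn so the next call sees the whole exchange.
    return reply, messages + [{"role": "assistant", "content": reply}]
```

With `model` and `tokenizer` loaded as above, a conversation might look like `reply, messages = chat(model, tokenizer, "Hello!", system="You are concise.")`. For later turns, pass everything except the system entry back in as `history` and supply the same `system` again, otherwise the system message is either dropped or duplicated.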
Base Model
- Model: Mistral Small 3.2 24B Instruct 2506
- Parameters: 24B
- Architecture: Mistral Small 3.2
- License: Apache 2.0
Model size: 24B params
Tensor type: BF16 · U32
Model tree for alankessler/Mistral-Small-3.2-24B-Instruct-2506-MLX-8bit
Base model
mistralai/Mistral-Small-3.1-24B-Base-2503