metadata
license: llama3.2
tags:
  - mlx
  - 4bit
  - llama
  - metalrt
  - apple-silicon

Llama 3.2 3B — MLX 4-bit Quantized

A custom MLX 4-bit quantization of meta-llama/Llama-3.2-3B-Instruct, optimized for MetalRT GPU inference on Apple Silicon.

Usage

Used by RCLI with the MetalRT engine:

rcli setup          # select MetalRT or Both engines

Performance (Apple M3 Max)

| Metric | Value |
|---|---|
| Parameters | 3B |
| Quantization | MLX 4-bit |
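
As a rough guide to what the parameter count and 4-bit quantization mean for weight storage, the sketch below estimates the on-disk size of group-wise quantized weights. It makes several assumptions not stated in this card: a nominal 3,000,000,000 parameters, one 16-bit scale and one 16-bit bias per group of 64 weights, and every weight quantized. The helper name `quantized_weight_bytes` is hypothetical, and the real file size of this custom quantization may differ.

```python
def quantized_weight_bytes(
    n_params: int,
    bits: int = 4,          # bits per quantized weight (from the table above)
    group_size: int = 64,   # ASSUMPTION: weights per quantization group
    scale_bits: int = 16,   # ASSUMPTION: one fp16 scale per group
    bias_bits: int = 16,    # ASSUMPTION: one fp16 bias per group
) -> float:
    """Approximate storage in bytes for group-wise affine-quantized weights."""
    # Each group shares one scale and one bias, so their cost is
    # amortized over `group_size` weights.
    bits_per_weight = bits + (scale_bits + bias_bits) / group_size
    return n_params * bits_per_weight / 8


# Nominal "3B" parameter count; not an exact figure for this model.
n_params = 3_000_000_000
print(f"Estimated weights: {quantized_weight_bytes(n_params) / 1e9:.2f} GB")
```

This estimate covers weights only. It excludes the KV cache, activations, and any layers kept at higher precision, so actual memory use during inference will be higher.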

License

Model weights: Llama 3.2 Community License (Meta)
MetalRT engine: Proprietary (RunAnywhere, Inc.)

Contact

founder@runanywhere.ai | https://runanywhere.ai