sm280299 committed · verified
Commit fdc5039 · Parent: be78005

Update model card with MetalRT benchmarks and usage

Files changed (1): README.md (+37 −3)

README.md CHANGED
@@ -1,3 +1,37 @@
- ---
- license: mit
- ---
+ ---
+ license: llama3.2
+ tags:
+ - mlx
+ - 4bit
+ - llama
+ - metalrt
+ - apple-silicon
+ ---
+
+ # Llama 3.2 3B — MLX 4-bit Quantized
+
+ Custom MLX 4-bit quantization of [meta-llama/Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) optimized for [MetalRT](https://github.com/RunanywhereAI/metalrt-binaries) GPU inference on Apple Silicon.
+
+ ## Usage
+
+ Used by [RCLI](https://github.com/RunanywhereAI/RCLI) with the MetalRT engine:
+
+ ```bash
+ rcli setup  # select MetalRT or Both engines
+ ```
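Outside RCLI, MLX 4-bit weights like these can typically also be loaded with the [`mlx-lm`](https://github.com/ml-explore/mlx-lm) package. A minimal sketch, assuming Apple Silicon with `mlx-lm` installed; `<this-repo-id>` is a placeholder, not the actual repository id:

```python
# Sketch: load MLX 4-bit weights and run a short generation with mlx-lm.
# "<this-repo-id>" is a placeholder; substitute this repository's Hugging Face id.
from mlx_lm import load, generate

model, tokenizer = load("<this-repo-id>")
print(generate(model, tokenizer, prompt="Hello", max_tokens=32))
```

This path uses MLX directly rather than the MetalRT engine, so it is useful mainly for sanity-checking the quantized weights.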
+
+ ## Performance (Apple M3 Max)
+
+ | Metric | Value |
+ |--------|-------|
+ | Parameters | 3B |
+ | Quantization | MLX 4-bit |
+
+ ## License
+
+ - Model weights: [Llama 3.2 Community License](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/LICENSE) (Meta)
+ - MetalRT engine: [Proprietary](https://github.com/RunanywhereAI/metalrt-binaries/blob/main/LICENSE) (RunAnywhere, Inc.)
+
+ ## Contact
+
+ founder@runanywhere.ai | https://runanywhere.ai