tomkay committed
Commit 545e6c1 · verified · 1 parent: 028e547

Fix: normalise line endings (CR removal) in README

Files changed (1): README.md (+100 −50)
---
library_name: mlx
tags:
- mlx
- quantized
- mixed-precision
- minimax
- moe
license: other
license_name: minimax-open-model-license
base_model: MiniMaxAI/MiniMax-M2.5
base_model_relation: quantized
---

# MiniMax-M2.5 — 103GB (MLX)

Mixed-precision quantized version of [MiniMaxAI/MiniMax-M2.5](https://huggingface.co/MiniMaxAI/MiniMax-M2.5) optimised by [baa.ai](https://baa.ai) using a proprietary Black Sheep AI method.

Per-tensor bit-width allocation via advanced sensitivity analysis and budget-constrained optimisation — no calibration data required.
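
The actual Black Sheep AI allocation method is proprietary and not documented here, but the general shape of budget-constrained per-tensor bit allocation can be sketched as follows. Everything in this snippet (the function name, the sensitivity scores, and the greedy diminishing-returns heuristic) is an illustrative assumption, not the method used to produce this model:

```python
# Toy sketch of budget-constrained per-tensor bit allocation (illustrative only;
# the proprietary method used for this model is not shown here).
# Each tensor carries a sensitivity score; we greedily spend a global bit budget
# where it is expected to reduce quantisation error the most.
import heapq

def allocate_bits(sizes, sensitivities, avg_bits_target, choices=(2, 3, 4, 6, 8)):
    """Assign a bit-width per tensor so the size-weighted average stays in budget.

    sizes: number of parameters per tensor
    sensitivities: assumed quantisation-error weight per tensor (higher = keep more bits)
    avg_bits_target: target average bits per parameter (e.g. 3.9)
    """
    total = sum(sizes)
    budget = avg_bits_target * total           # total bit budget across all tensors
    bits = [min(choices)] * len(sizes)         # start every tensor at the lowest width
    spent = sum(b * s for b, s in zip(bits, sizes))

    # Min-heap on negated sensitivity = pop the most sensitive tensor first.
    heap = [(-sens, i) for i, sens in enumerate(sensitivities)]
    heapq.heapify(heap)

    while heap:
        neg_sens, i = heapq.heappop(heap)
        higher = [c for c in choices if c > bits[i]]
        if not higher:
            continue                           # already at the maximum width
        step_cost = (higher[0] - bits[i]) * sizes[i]
        if spent + step_cost <= budget:
            bits[i] = higher[0]
            spent += step_cost
            # Re-queue with halved priority: crude diminishing-returns heuristic.
            heapq.heappush(heap, (neg_sens / 2, i))
    return bits
```

Under this toy model, a highly sensitive tensor ends up at 6 bits while an insensitive one stays at 2, with the weighted average held at or below the target.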

## Metrics

| Metric | Value |
|--------|-------|
| **Size** | **103 GB** |
| Average bits | 3.9 |
| WikiText-2 PPL (median) | 9.7554 |
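
For context on the perplexity row: perplexity is the exponential of the mean per-token negative log-likelihood, and the reported value is the median across evaluation segments. A minimal sketch of the core formula (the specific evaluation harness and windowing used for the table are not documented here):

```python
# Perplexity = exp(mean per-token negative log-likelihood).
# Sketch of the definition only; not the evaluation harness behind the table.
import math

def perplexity(token_nlls):
    """token_nlls: per-token negative log-likelihoods in nats."""
    return math.exp(sum(token_nlls) / len(token_nlls))
```

For example, a model that assigns every token probability 1/10 has a per-token NLL of ln(10) and therefore a perplexity of exactly 10.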

## Usage

```python
from mlx_lm import load, generate

model, tokenizer = load("baa-ai/MiniMax-M2.5-RAM-120GB-MLX")
response = generate(model, tokenizer, prompt="Hello!", max_tokens=256)
print(response)
```

---
*Quantized by [baa.ai](https://baa.ai)*

---

## Black Sheep AI Products

**[Shepherd](https://baa.ai/shepherd.html)** — Private AI deployment platform that shrinks frontier models by 50-60% through RAM compression, enabling enterprises to run sophisticated AI on single GPU instances or Apple Silicon hardware. Deploy in your VPC with zero data leaving your infrastructure. Includes CI/CD pipeline integration, fleet deployment across Apple Silicon clusters, air-gapped and sovereign deployment support, and multi-format export (MLX, GGUF). Annual cloud costs from ~$2,700 — or run on a Mac Studio for electricity only.

**[Watchman](https://baa.ai/watchman.html)** — Capability audit and governance platform for compressed AI models. Know exactly what your quantized model can do before it goes live. Watchman predicts which capabilities survive compression in minutes — replacing weeks of benchmarking. Includes compliance-ready reporting for regulated industries, quality valley warnings for counterproductive memory allocations, instant regression diagnosis tracing issues to specific tensors, and 22 adversarial security probes scanning for injection, leakage, hallucination, and code vulnerabilities.

Learn more at **[baa.ai](https://baa.ai)** — Sovereign AI.