mratsim committed · verified
Commit 7a81459 · Parent(s): 32e1ea0

Point to M2.5 not M2.1

Files changed (1): README.md +3 -3
README.md CHANGED
@@ -71,7 +71,7 @@ base_model:
 - MiniMaxAI/MiniMax-M2.5
 ---
 
-# MiniMax M2.1 (Mixed-Precision BF16 + INT4 AWQ)
+# MiniMax M2.5 (Mixed-Precision BF16 + INT4 AWQ)
 
 ## Changelog
 
@@ -83,7 +83,7 @@ base_model:
 This strives to be the highest quality quant that can run on 192GiB VRAM
 
 > [!TIP]
-> 💡This is a sister model to [mratsim/MiniMax-M2.5-FP8-INT4-AWQ](https://huggingface.co/mratsim/MiniMax-M2.1-FP8-INT4-AWQ)
+> 💡This is a sister model to [mratsim/MiniMax-M2.5-FP8-INT4-AWQ](https://huggingface.co/mratsim/MiniMax-M2.5-FP8-INT4-AWQ)
 > with the original model FP8 weights pre-dequantized to BF16.
 >
 > This makes it compatible with 8x3090 systems (which don't have hardware FP8)
@@ -139,7 +139,7 @@ It uses my new declarative quantization framework https://github.com/mratsim/qua
 
 The model was tested with SGLang + 2x RTX Pro 6000, here is a script suitable for such configuration with the maximum 196,608 context length. This uses 92.5GiB of VRAM with the flashinfer backend.
 
-Please refer to [mratsim/MiniMax-M2.5-FP8-INT4-AWQ#running-script](https://huggingface.co/mratsim/MiniMax-M2.1-FP8-INT4-AWQ#running-script)
+Please refer to [mratsim/MiniMax-M2.5-FP8-INT4-AWQ#running-script](https://huggingface.co/mratsim/MiniMax-M2.5-FP8-INT4-AWQ#running-script)
 for running it in vLLM
 
 ### Running script
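The actual running script sits in the README section below this diff and is not part of the commit; as a rough sketch only, an SGLang launch for the configuration described above (2x RTX Pro 6000, 196,608 context, flashinfer attention backend) could look like the following. The flag names are assumed from SGLang's standard `launch_server` entry point, not taken from this commit:

```shell
# Sketch of an SGLang launch for this model (flags assumed, not from this commit):
# serve the quantized repo across 2 GPUs with the full 196,608-token context
# and the flashinfer attention backend mentioned in the README.
python -m sglang.launch_server \
  --model-path mratsim/MiniMax-M2.5-FP8-INT4-AWQ \
  --tp-size 2 \
  --context-length 196608 \
  --attention-backend flashinfer
```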