CPU-Hybrid-MoE/MiniMax-M2.5-CPU-NUMA4-AMXINT8

Doctor-Shotgun committed on Mar 14

Commit 13300b2 · verified · 1 parent: 1e0f8f8

Update README.md

Files changed (1)

README.md +1 -1

@@ -10,7 +10,7 @@ pipeline_tag: text-generation
 To run, please ensure that your CPU supports the AMX instruction set (Intel Xeon processor, Sapphire Rapids or newer), and make note of your NUMA node count. Install `kt-kernal` and `sglang-kt` following the [official documentation](https://github.com/kvcache-ai/ktransformers/blob/main/kt-kernel/README.md).
-Then, download both the FP8 official weights of [MiniMaxAI/MiniMax-M2.5](https://huggingface.co/MiniMaxAI/MiniMax-M2.5), as well as this CPU-optimized quantized model, and prepare your launch command:
+Then, download the official weights of MiniMaxAI/MiniMax-M2.5 in [FP8](https://huggingface.co/MiniMaxAI/MiniMax-M2.5), as well as this CPU-optimized quantized model, and prepare your launch command:
 ```
 PYTORCH_ALLOC_CONF=expandable_segments:True \