fastllm/Kimi-K2-Instruct-INT4MIX

fastllm committed on Jul 14, 2025

Commit f3779cb (verified) · 1 parent: df51278

Update README.md

Files changed (1)

README.md +6 -6

@@ -22,8 +22,8 @@ pip download fastllm/Kimi-K2-Instruct-INT4MIX
 ``` sh
 # Assuming the model is downloaded to /root/Kimi-K2-Instruct-INT4MIX
-pip run /root/Kimi-K2-Instruct-INT4MIX # chat mode
-pip server /root/Kimi-K2-Instruct-INT4MIX # API server mode (default model name = /root/Kimi-K2-Instruct-INT4MIX, port = 8080)
+ftllm run /root/Kimi-K2-Instruct-INT4MIX # chat mode
+ftllm server /root/Kimi-K2-Instruct-INT4MIX # API server mode (default model name = /root/Kimi-K2-Instruct-INT4MIX, port = 8080)
 ```
 # Optimization
@@ -36,7 +36,7 @@ pip server /root/Kimi-K2-Instruct-INT4MIX # API server mode (default model
 For example:
 ``` sh
-pip server /root/Kimi-K2-Instruct-INT4MIX -t 12
+ftllm server /root/Kimi-K2-Instruct-INT4MIX -t 12
 ```
 ## Multi-CPU (multiple NUMA nodes)
@@ -77,8 +77,8 @@ pip download fastllm/Kimi-K2-Instruct-INT4MIX
 ``` sh
 # Assuming the model is downloaded in /root/Kimi-K2-Instruct-INT4MIX
-pip run /root/Kimi-K2-Instruct-INT4MIX # chat
-pip server /root/Kimi-K2-Instruct-INT4MIX # api server (default model_name = /root/Kimi-K2-Instruct-INT4MIX, port = 8080)
+ftllm run /root/Kimi-K2-Instruct-INT4MIX # chat
+ftllm server /root/Kimi-K2-Instruct-INT4MIX # api server (default model_name = /root/Kimi-K2-Instruct-INT4MIX, port = 8080)
 ```
 # optimize
@@ -91,7 +91,7 @@ If the speed is extremely slow, it may be due to too many threads; consider red
 for example:
 ``` sh
-pip server /root/Kimi-K2-Instruct-INT4MIX -t 12
+ftllm server /root/Kimi-K2-Instruct-INT4MIX -t 12
 ```
 ## multi cpu (multi numa node)