Instructions to use moonshotai/Kimi-K2.5 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use moonshotai/Kimi-K2.5 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="moonshotai/Kimi-K2.5", trust_remote_code=True)
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("moonshotai/Kimi-K2.5", trust_remote_code=True, dtype="auto")

Local Apps

vLLM

How to use moonshotai/Kimi-K2.5 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "moonshotai/Kimi-K2.5"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "moonshotai/Kimi-K2.5",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
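The same request can also be sent from Python. The following is a minimal sketch using only the standard library, assuming the vLLM server started above is listening on localhost:8000; the helper names `build_chat_request` and `send` are illustrative and not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(base_url, model, prompt, image_url):
    """Build a POST request for an OpenAI-compatible /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=600):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


request = build_chat_request(
    base_url="http://localhost:8000",
    model="moonshotai/Kimi-K2.5",
    prompt="Describe this image in one sentence.",
    image_url="https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg",
)
# Uncomment once the server is running:
# print(send(request))
```

For the SGLang server below, only the base URL would change (port 30000 instead of 8000).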

Use Docker

docker model run hf.co/moonshotai/Kimi-K2.5

SGLang

How to use moonshotai/Kimi-K2.5 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "moonshotai/Kimi-K2.5" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "moonshotai/Kimi-K2.5",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "moonshotai/Kimi-K2.5" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "moonshotai/Kimi-K2.5",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Docker Model Runner
How to use moonshotai/Kimi-K2.5 with Docker Model Runner:
```
docker model run hf.co/moonshotai/Kimi-K2.5
```

Placeholders or ordering for multi-image input?

#63

by noobimp - opened Feb 5

Discussion

noobimp

Feb 5

edited Feb 5

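Hi, I'd like to ask how the prompt should be organized for multi-image input such as MMMU-Pro. Is it enough to use <image> as the placeholder and pass the images in order? Thanks~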

crypthine

Feb 5

teowu

Moonshot AI org Feb 6

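We tested many setups, and MMMU-Pro performance stays consistently at the reported score, with low variance.

Image insertion methods:

Put all images first, then refer to them in the text as image 1, 2, 3, 4.
Insert each image at the position of its placeholder.

Test prompts:

Model-as-judge scoring is not recommended, because the result may be affected by the judge model's biases.

Use the official MMMU-Pro test prompt, plus the corresponding regex extraction.
Follow the official MMLU-Pro test prompt, plus the corresponding regex extraction.

You can refer to this code: https://github.com/MoonshotAI/Kimi-Vendor-Verifier/blob/main/mmmu_pro_vision.py , which corresponds to the first image insertion method and the first test prompt.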
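The two image insertion methods described above can be sketched as follows. This is an illustration only, not the code from the linked repository: it assumes OpenAI-style chat messages and MMMU-style placeholders of the form `<image 1>`, and the function names are hypothetical.

```python
import re

# Assumed placeholder format: "<image 1>", "<image 2>", ... (1-based).
PLACEHOLDER = re.compile(r"<image\s*(\d+)>")


def image_part(url):
    """Wrap an image URL as an OpenAI-style content part."""
    return {"type": "image_url", "image_url": {"url": url}}


def images_first(question, image_urls):
    """Method 1: all images up front; the text keeps its numbered placeholders."""
    content = [image_part(url) for url in image_urls]
    content.append({"type": "text", "text": question})
    return [{"role": "user", "content": content}]


def images_interleaved(question, image_urls):
    """Method 2: each image replaces its placeholder inside the text."""
    content = []
    cursor = 0
    for match in PLACEHOLDER.finditer(question):
        text = question[cursor:match.start()]
        if text.strip():
            content.append({"type": "text", "text": text})
        # Raises IndexError if a placeholder has no matching image.
        content.append(image_part(image_urls[int(match.group(1)) - 1]))
        cursor = match.end()
    tail = question[cursor:]
    if tail.strip():
        content.append({"type": "text", "text": tail})
    return [{"role": "user", "content": content}]


question = "Compare <image 1> with <image 2>. Which is larger?"
urls = ["a.png", "b.png"]
first = images_first(question, urls)
interleaved = images_interleaved(question, urls)
```

Either message list can be sent as the `messages` field of a chat completion request.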

noobimp

Feb 6

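Thank you for the reply. I largely reused the official MMMU-Pro code: the images are organized in order and passed before the text, and the placeholder is just image, without the 1, 2, 3, 4 labels. Answers are extracted and evaluated by rules, and the performance is essentially the same as well.

Thanks again~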
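Rule-based answer extraction of the kind mentioned above might look like the sketch below. The pattern is an assumption chosen for illustration (a final "Answer: X" or "the answer is (X)" with options A to J); it is not the official MMMU-Pro regex, and the function names are hypothetical.

```python
import re

# Keyword is case-insensitive; the option letter must be an uppercase A-J
# that is not the first letter of a longer word.
ANSWER_PATTERN = re.compile(
    r"(?i:answer)\s*(?i:is)?\s*:?\s*\(?([A-J])\)?(?![A-Za-z])"
)


def extract_choice(response):
    """Return the last option letter stated in the response, or None."""
    matches = ANSWER_PATTERN.findall(response)
    return matches[-1] if matches else None


def accuracy(responses, gold):
    """Fraction of responses whose extracted letter equals the gold letter."""
    correct = sum(extract_choice(r) == g for r, g in zip(responses, gold))
    return correct / len(gold)


responses = [
    "The chart shows a decline. Answer: B",
    "After comparing both images, the answer is (C).",
    "I cannot tell.",
]
score = accuracy(responses, ["B", "C", "A"])
```

Taking the last match lets a response revise an earlier guess; a response with no match counts as wrong rather than being sent to a judge model.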

Coobiw

Feb 6

edited Feb 6

You can refer to the two official settings: https://github.com/MMMU-Benchmark/MMMU/tree/main/mmmu-pro

The two perform almost identically. I recommend the second, which is more convenient.

noobimp

2 days ago

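One more question: for datasets such as Video-MME and VideoMMMU, what is the evaluation setting, for example the number of sampled frames or fps, the per-frame resolution, and whether frames are resized? When I test Video-MME at 1 fps without resizing, the input exceeds the 256k context. Thanks~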
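The thread does not record an answer to this question. As a rough illustration of why 1 fps sampling can overflow the context, the sketch below works out a frame budget. All numbers are hypothetical: the tokens per frame depend on resolution and on the model's vision encoder, and 256,000 is used as a round stand-in for the 256k context.

```python
def frames_that_fit(context_tokens, tokens_per_frame, reserved_tokens=0):
    """Number of frames that fit after reserving room for text and output."""
    usable = context_tokens - reserved_tokens
    return max(usable // tokens_per_frame, 0)


def sampling_fps(duration_s, max_frames, cap_fps=1.0):
    """Uniform sampling rate, capped at cap_fps, that stays within max_frames."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return min(cap_fps, max_frames / duration_s)


# Hypothetical budget: 256,000-token context, 1,000 tokens per frame,
# 6,000 tokens reserved for the question and the answer.
budget = frames_that_fit(256_000, 1_000, reserved_tokens=6_000)

# A one-hour video at 1 fps would need 3,600 frames, far above the budget,
# so the sampling rate has to drop (or the frames have to be downscaled).
fps_long = sampling_fps(3_600, budget)
fps_short = sampling_fps(100, budget)
```

Under these assumed numbers, short clips can stay at 1 fps while hour-long videos need a much lower rate or fewer tokens per frame.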
