moonshotai/Kimi-K2-Instruct

Text Generation


Instructions for using moonshotai/Kimi-K2-Instruct with libraries, inference servers, and local apps.

Libraries

How to use moonshotai/Kimi-K2-Instruct with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

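# trust_remote_code=True runs the custom modeling code shipped in the model repo
pipe = pipeline("text-generation", model="moonshotai/Kimi-K2-Instruct", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result so it is visible when run as a script
print(pipe(messages))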

# Load model directly
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("moonshotai/Kimi-K2-Instruct", trust_remote_code=True, dtype="auto")

Local Apps

How to use moonshotai/Kimi-K2-Instruct with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "moonshotai/Kimi-K2-Instruct"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "moonshotai/Kimi-K2-Instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
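The same request can also be issued from Python. The following is a minimal sketch using only the standard library; it assumes the vLLM server started above is listening on `localhost:8000`. The helper names `build_chat_request` and `send_chat_request` are illustrative and are not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(prompt, model="moonshotai/Kimi-K2-Instruct",
                       base_url="http://localhost:8000"):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the decoded JSON body (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `send_chat_request(build_chat_request("What is the capital of France?"))` should return the parsed JSON response. Passing `base_url="http://localhost:30000"` would target a server on a different port.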

Use Docker

docker model run hf.co/moonshotai/Kimi-K2-Instruct

How to use moonshotai/Kimi-K2-Instruct with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "moonshotai/Kimi-K2-Instruct" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "moonshotai/Kimi-K2-Instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
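Because the server is called through an OpenAI-compatible API, the reply text is expected under `choices[0].message.content` in the JSON response. The sketch below extracts it; the helper name `extract_reply` is hypothetical, and the sample body is hand-written for illustration rather than real model output.

```python
import json


def extract_reply(response_body: str) -> str:
    """Return the assistant text from an OpenAI-style chat completion body."""
    data = json.loads(response_body)
    choices = data.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Hand-written sample in the OpenAI chat completions shape (not real model output).
sample = json.dumps({
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Paris."},
            "finish_reason": "stop",
        }
    ]
})
print(extract_reply(sample))
```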

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "moonshotai/Kimi-K2-Instruct" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "moonshotai/Kimi-K2-Instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'



Commit History

Add Terminal-Bench evaluation result (27.8%)

b3a893b
verified

burtenshaw HF Staff committed 27 days ago

Add Terminal-Bench evaluation result (27.8%) (#64)

fd1984e

burtenshaw HF Staff committed on Apr 23

Add SWE-Bench Pro evaluation results (#65)

4bbe370

nielsr HF Staff committed on Apr 23

Create .eval_results/apex-swe.yaml (#67)

e6e15d4

madhavan113 committed on Apr 23

Transformers v5 support (#62)

1cbe779
verified

hmellor HF Staff committed on Jan 30

Adjust number of reserved tokens to match the model (#15)

c2fee60
verified

dzhulgakov committed on Nov 7, 2025

Update tokenization_kimi.py (#56)

49e03b0
verified

lfu committed on Nov 4, 2025

remove-added-functions.-in-tool_call_id (#60)

0102674
verified

bigmoyan committed on Oct 22, 2025

fix-apply-chat-template (#59)

cc61331
verified

bigmoyan committed on Oct 22, 2025

update chat-template and tokenizer (#58)

874f2bb
verified

bigmoyan committed on Oct 10, 2025

Add `new_version` tag with the updated model (#57)

2a19363
verified

multimodalart HF Staff committed on Sep 5, 2025

Update chat_template.jinja (#49)

c52f808
verified

bigeagle committed on Aug 11, 2025

Update README.md

c499175
verified

bigmoyan committed on Aug 11, 2025

standalone chat template (#47)

b215bf7
verified

bigmoyan committed on Aug 11, 2025

fix typo (#33)

0826e83
verified

lkm2835 committed on Jul 28, 2025

Updated Readme with released Technical Report link (#35)

6129c6d
verified

casinca committed on Jul 22, 2025

fix typos

d2513b8

wangzhengtao committed on Jul 21, 2025

Update README.md

4f23950
verified

bigmoyan committed on Jul 18, 2025

update chat template

6a1f5e6

wangzhengtao committed on Jul 18, 2025

Update README.md

d1e2b19
verified

bigmoyan committed on Jul 15, 2025

update tokenizer_config

37a95d8

wangzhengtao committed on Jul 15, 2025

Update tokenizer_config.json (#13)

6be65c0
verified

bchenfireworks committed on Jul 15, 2025

Update tokenizer_config.json (#13)

1221eb6
verified

bchenfireworks committed on Jul 15, 2025

fix: allow tokenize special tokens

02de71c

wangzhengtao committed on Jul 15, 2025

comming soon -> coming soon (#12)

841a184
verified

rasbt committed on Jul 14, 2025

add github link in readme

23bdc5f

liushaowei committed on Jul 13, 2025

update third party notices

2f7e011

liushaowei committed on Jul 12, 2025

update banner

c9a3d34

liushaowei committed on Jul 11, 2025

Upload folder using huggingface_hub

b543046
verified

lsw825 committed on Jul 11, 2025

update readme

7f98307

liushaowei committed on Jul 11, 2025

Update docs/deploy_guidance.md

2bfbc7b
verified

lsw825 committed on Jul 11, 2025

Update README.md

c46f181
verified

bigeagle committed on Jul 11, 2025

Update README.md

dc4be28
verified

lsw825 committed on Jul 11, 2025

Upload folder using huggingface_hub

b51274b
verified

lsw825 committed on Jul 11, 2025

Upload folder using huggingface_hub

8a112f5
verified

lsw825 committed on Jul 11, 2025

Add files using upload-large-folder tool

22bf139
verified

lsw825 committed on Jul 11, 2025

initial commit

37b5512
verified

lsw825 committed on Jul 11, 2025