Is the checkpoint correct?
Hello great contributors! Thanks for your model!
Lately I have been trying to deploy the K2.5 Eagle3 draft model on vLLM, but the accept rate is lower than your report, and even lower than K2 Eagle3 in my experiments. I saw similar results on both vLLM and the SGLang PR, so it seems other people are hitting the same problem.
Are you sure the checkpoints are correct? Or could you share the steps to reproduce your results on either vLLM or SGLang?
Great thanks!
Kimi-K25 SGLang Benchmarking and Deployment
This document outlines the setup and benchmarking details for deploying the Kimi-K25 model with sglang on H200 GPUs.
Overview
Our experiments and benchmarks for the Kimi-K25 model were conducted exclusively on H200 GPUs using sglang version 0.5.9. We did not perform any tests or comparisons with vLLM.
Deployment
To launch the sglang service for Kimi-K25 for experimentation and benchmarking, use the following command:
**Launch Service Command:**

```shell
python3 -m sglang.launch_server \
  --model-path /models/Kimi-K25 \
  --host 0.0.0.0 --port 30012 \
  --trust-remote-code \
  --mem-fraction-static 0.9 \
  --tp-size 8 \
  --speculative-algorithm EAGLE3 \
  --speculative-draft-model-path AQ-MedAI/Kimi-K25-eagle3 \
  --speculative-num-steps 3 \
  --speculative-eagle-topk 1 \
  --speculative-num-draft-tokens 4
```
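One note on the flags above, as I understand them (this is my reading, not official guidance): with `--speculative-eagle-topk 1` the draft model proposes a single chain, so `--speculative-num-draft-tokens` is typically `num_steps + 1`, the drafted tokens plus the bonus token emitted by the target model's verify pass. A minimal sketch of that relationship:

```python
def draft_tokens_for_chain(num_steps: int, topk: int = 1) -> int:
    """Expected --speculative-num-draft-tokens for chain (top-k = 1) EAGLE3 drafting.

    Assumption: each draft step proposes one token, and the target verify
    pass can emit one extra "bonus" token, hence num_steps + 1.
    """
    if topk != 1:
        raise ValueError("this helper only covers the top-k = 1 chain case")
    return num_steps + 1

# Matches the launch command above: 3 draft steps -> 4 draft tokens.
print(draft_tokens_for_chain(3))  # -> 4
```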
**Benchmarking:**
For pressure testing and benchmarking the deployed sglang service, please refer to the SpecForge benchmarking suite:
**Benchmarking Script Repository:** https://github.com/sgl-project/SpecForge/tree/main/benchmarks/benchmarker
By the way, the AVL for GSM8K with the K2 Eagle3 draft model is 3.165. What was the `--speculative-num-steps` setting for that? Presumably 3?
If I understand correctly, the accept rate for K2.5 is lower than for K2?
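For anyone comparing these numbers, here is a back-of-envelope way to relate AVL to expected speedup. The formula and the draft-cost ratio below are my own assumptions, not measurements from this repo: each decode cycle costs one target verify pass plus `num_steps` draft passes and yields `AVL` tokens, versus one token per target pass for plain decoding.

```python
def estimated_speedup(avl: float, num_steps: int, draft_cost_ratio: float) -> float:
    """Rough speculative-decoding speedup estimate.

    avl:              mean accepted length per verify step
                      (accepted draft tokens plus the bonus token).
    draft_cost_ratio: cost of one draft forward pass relative to one
                      target pass (an assumed knob, not a measured value).
    """
    cycle_cost = 1.0 + num_steps * draft_cost_ratio  # target pass + draft passes
    return avl / cycle_cost

# Example with the reported GSM8K AVL for the K2 draft model and an
# assumed 5% per-step draft cost:
print(round(estimated_speedup(3.165, num_steps=3, draft_cost_ratio=0.05), 2))  # -> 2.75
```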
Hmm, the K25-eagle3 version has been further trained on both Chinese and English datasets, while K2-eagle3 was trained on purely English data. Therefore, K25 might handle longer Chinese inputs better. However, K25-eagle3 will continue to be optimized and updated over time.
Thanks again, looking forward to it!
