Request for MTP version
#24
by Daankular - opened
https://docs.vllm.ai/en/latest/features/speculative_decoding/mtp/
I think MTP (multi-token prediction) speculative decoding would help a lot with inference speed.