SDSAT: Accelerating LLM Inference through Speculative Decoding with Semantic Adaptive Tokens
Paper: arXiv 2403.18647
This is the 13B model from "SDSAT: Accelerating LLM Inference through Speculative Decoding with Semantic Adaptive Tokens".