bobig committed
Commit aaa8521 (verified) · 1 parent: eb2c20c

Update README.md

Files changed (1): README.md (+2 −2)
README.md CHANGED
@@ -15,11 +15,11 @@ tags:
 
 # bobig/DeepScaleR-1.5B-6.5bit
 
-This works as a draft model for speculative decoding in [LMstudio 3.10 beta](https://lmstudio.ai/docs/advanced/speculative-decoding)
+This works well as a draft model for speculative decoding in [LM Studio 3.10 beta](https://lmstudio.ai/docs/advanced/speculative-decoding).
 
 Try it with: [mlx-community/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-4.5bit](https://huggingface.co/mlx-community/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-4.5bit)
 
-you should see 30% faster TPS for math/code prompts even with "thinking"
+You should see about 30% faster TPS on math/code prompts, even with "thinking" slowing down speculative decoding.
 
 The model [bobig/DeepScaleR-1.5B-6.5bit](https://huggingface.co/bobig/DeepScaleR-1.5B-6.5bit) was
 converted to MLX format from [agentica-org/DeepScaleR-1.5B-Preview](https://huggingface.co/agentica-org/DeepScaleR-1.5B-Preview)
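
The draft/target pairing the README describes can also be tried outside LM Studio with the mlx-lm command line. A minimal sketch, assuming a recent mlx-lm release that supports the `--draft-model` option; the prompt and token count are illustrative, and both models are multi-gigabyte downloads from the Hugging Face Hub:

```shell
# Install mlx-lm (Apple Silicon only), then run speculative decoding with
# the 1.5B model drafting tokens for the 32B target model.
pip install mlx-lm

python -m mlx_lm.generate \
  --model mlx-community/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-4.5bit \
  --draft-model bobig/DeepScaleR-1.5B-6.5bit \
  --prompt "Write a Python function that checks whether a number is prime." \
  --max-tokens 512
```

Comparing tokens-per-second with and without the `--draft-model` flag on the same prompt is a quick way to check the speedup claim on your own hardware.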