mlx-community/DeepScaleR-1.5B-6.5bit

bobig committed on Feb 17, 2025
Commit aaa8521 · verified · 1 Parent(s): eb2c20c

Update README.md

Files changed (1)

README.md +2 -2

@@ -15,11 +15,11 @@ tags:
 # bobig/DeepScaleR-1.5B-6.5bit
-This works as a draft model for speculative decoding in [LMstudio 3.10 beta](https://lmstudio.ai/docs/advanced/speculative-decoding)
 Try it with: [mlx-community/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-4.5bit](https://huggingface.co/mlx-community/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-4.5bit)
-you should see 30% faster TPS for math/code prompts even with "thinking"
 The Model [bobig/DeepScaleR-1.5B-6.5bit](https://huggingface.co/bobig/DeepScaleR-1.5B-6.5bit) was
 converted to MLX format from [agentica-org/DeepScaleR-1.5B-Preview](https://huggingface.co/agentica-org/DeepScaleR-1.5B-Preview)

 # bobig/DeepScaleR-1.5B-6.5bit
+This works well as a draft model for speculative decoding in [LMstudio 3.10 beta](https://lmstudio.ai/docs/advanced/speculative-decoding)
 Try it with: [mlx-community/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-4.5bit](https://huggingface.co/mlx-community/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-4.5bit)
+you should see 30% faster TPS for math/code prompts even with "thinking" slowing down the Speculative Decoding
 The Model [bobig/DeepScaleR-1.5B-6.5bit](https://huggingface.co/bobig/DeepScaleR-1.5B-6.5bit) was
 converted to MLX format from [agentica-org/DeepScaleR-1.5B-Preview](https://huggingface.co/agentica-org/DeepScaleR-1.5B-Preview)
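
For readers unfamiliar with the technique the README refers to, here is a minimal sketch of greedy speculative decoding in plain Python. It is not LM Studio's or MLX's implementation. The two "models" (`target_model`, `draft_model`) are hypothetical toy functions invented for illustration, and the parameter name `num_draft_tokens` is an assumption. The sketch shows only the core idea: a small draft model proposes several tokens, the large target model checks them, and the output matches what the target model alone would have produced.

```python
from typing import Callable, List, Tuple

# A "model" here is any function mapping a token context to the next token
# (greedy decoding). Real models return logits; this is a toy stand-in.
Model = Callable[[List[int]], int]


def target_model(ctx: List[int]) -> int:
    """Hypothetical large model: a fixed arithmetic rule on the last token."""
    return (ctx[-1] * 3 + 1) % 50


def draft_model(ctx: List[int]) -> int:
    """Hypothetical small model: agrees with the target except after multiples of 5."""
    nxt = target_model(ctx)
    return nxt if ctx[-1] % 5 else (nxt + 1) % 50


def greedy_decode(model: Model, prompt: List[int], max_new_tokens: int) -> List[int]:
    """Baseline: one model call per generated token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(model(tokens))
    return tokens


def speculative_decode(
    target: Model,
    draft: Model,
    prompt: List[int],
    max_new_tokens: int,
    num_draft_tokens: int = 4,
) -> Tuple[List[int], int]:
    """Return the generated tokens and the number of target verification passes."""
    tokens = list(prompt)
    target_passes = 0
    while len(tokens) - len(prompt) < max_new_tokens:
        # 1. The draft model cheaply proposes a short continuation.
        proposal: List[int] = []
        for _ in range(num_draft_tokens):
            proposal.append(draft(tokens + proposal))

        # 2. The target model verifies the proposal. A real system scores all
        #    proposed positions in one batched forward pass; the loop below
        #    stands in for that single pass.
        target_passes += 1
        accepted: List[int] = []
        for tok in proposal:
            expected = target(tokens + accepted)
            if expected != tok:
                accepted.append(expected)  # replace the first wrong token
                break
            accepted.append(tok)
        else:
            # Every draft token matched, so the target adds one extra token.
            accepted.append(target(tokens + accepted))

        tokens.extend(accepted)
    return tokens[: len(prompt) + max_new_tokens], target_passes


if __name__ == "__main__":
    out, passes = speculative_decode(target_model, draft_model, [5], 20)
    print("same output as target-only decoding:", out == greedy_decode(target_model, [5], 20))
    print("target passes:", passes, "for 20 new tokens")
```

Any speedup comes from the target model running fewer passes than it generates tokens. How much is gained depends on how often the draft model's guesses are accepted, which is why results vary by prompt type.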