Model does not think in <think> tags and therefore UI does not work in LM Studio
I encountered the same issue with the model's reasoning/think block. I investigated the tokenizer_config.json and believe I found the specific cause related to your observation.
The chat_template in tokenizer_config.json has this line at the end:
{% if add_generation_prompt and not ns.is_tool %}{{'<|Assistant|><think>\n'}}{% endif %}
It seems that this pre-added <think> tag, and possibly the trailing newline (\n), interferes with how the model generates or formats its own <think>...</think> block. Because the prompt already ends with the opening tag, the model's output may never contain <think> itself, which would explain the parsing failures you mentioned, such as the UI issues in LM Studio.
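To illustrate why a missing opening tag could break a UI, here is a minimal sketch of a tag-based splitter. It is not LM Studio's actual parser; the function name split_reasoning and the two sample outputs are hypothetical and only assume that the UI looks for a complete <think>...</think> pair in the generated text.

```python
import re

# Hypothetical parser: looks for one complete <think>...</think> block.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(output: str):
    """Return (reasoning, answer); reasoning is None if no complete block is found."""
    match = THINK_BLOCK.search(output)
    if match is None:
        # No opening/closing pair: everything is treated as the answer.
        return None, output
    return match.group(1).strip(), output[match.end():].strip()


# Made-up output when the template already emitted "<think>\n" in the prompt:
# the generated text starts inside the reasoning and lacks the opening tag.
prefilled_output = "Let me add the numbers.\n</think>\nThe answer is 4."

# Made-up output when the model emits the opening tag itself.
complete_output = "<think>\nLet me add the numbers.\n</think>\nThe answer is 4."

print(split_reasoning(prefilled_output))
print(split_reasoning(complete_output))
```

In the first case the splitter finds no reasoning block and the stray </think> stays in the answer text, which matches the kind of failure described in this thread.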
I tested this by modifying my chat_template to remove the <think>\n addition from that if condition. With that change, the model generated its <think>...</think> block correctly as part of its output. The modified line:
{% if add_generation_prompt and not ns.is_tool %}{{'<|Assistant|>'}}{% endif %}
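For anyone who prefers not to edit the JSON by hand, the same change can be scripted. This is a rough sketch, assuming chat_template is stored as a plain string in tokenizer_config.json; the helper names strip_prefilled_think and patch_tokenizer_config are my own, and the pattern also accepts the full-width bar (｜) in case the file uses it in the Assistant tag. Whether your app picks up the edited file may depend on your setup.

```python
import json
import re
import shutil
from pathlib import Path

# Matches "<think>" (plus an escaped or literal newline) directly after the
# Assistant tag. Assumption: the tag uses either "|" or the full-width bar.
PREFILLED_THINK = re.compile(r"(<[|｜]Assistant[|｜]>)<think>(?:\\n|\n)?")


def strip_prefilled_think(template: str) -> str:
    """Remove the pre-added <think> tag from a chat template string."""
    return PREFILLED_THINK.sub(r"\1", template)


def patch_tokenizer_config(config_path) -> bool:
    """Patch chat_template in place; return True if the file was changed."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    original = config["chat_template"]
    patched = strip_prefilled_think(original)
    if patched == original:
        return False  # nothing to do, leave the file untouched
    # Keep a backup next to the original before overwriting it.
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    config["chat_template"] = patched
    path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return True


# Demo on the template line quoted in this thread (no file access needed).
line = (
    "{% if add_generation_prompt and not ns.is_tool %}"
    "{{'<|Assistant|><think>\\n'}}{% endif %}"
)
print(strip_prefilled_think(line))
```

To apply it to a downloaded model, call patch_tokenizer_config with the path to your local tokenizer_config.json (the location depends on where you downloaded the model). A .bak copy is written first so the original template can be restored.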
Hopefully, this helps the maintainers fix the template.
{% if add_generation_prompt and not ns.is_tool %}{{'<|Assistant|>'}}{% endif %}
The change fixes the issue for me.
LM Studio 0.3.14
LM Studio MLX v0.12.1