How to use sleepy186247/deepseek-coder-33b-instruct-mlx-4Bit with MLX:
# Download the model from the Hub
pip install huggingface_hub[hf_xet]
huggingface-cli download --local-dir deepseek-coder-33b-instruct-mlx-4Bit sleepy186247/deepseek-coder-33b-instruct-mlx-4Bit
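The same download can probably be scripted from Python instead of the CLI. The sketch below assumes the `huggingface_hub` package is installed and uses its `snapshot_download` function; the helper name `download_model` and the two constants are illustrative, not part of this repository.

```python
REPO_ID = "sleepy186247/deepseek-coder-33b-instruct-mlx-4Bit"
LOCAL_DIR = "deepseek-coder-33b-instruct-mlx-4Bit"


def download_model(repo_id: str = REPO_ID, local_dir: str = LOCAL_DIR) -> str:
    """Download every file in the repo and return the local folder path."""
    # Imported lazily so this module still loads when huggingface_hub is absent.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


if __name__ == "__main__":
    # Fetches all four weight shards, so expect a download of roughly 19 GB.
    print(download_model())
```

Calling `download_model()` with no arguments mirrors the CLI command above: same repository, same target directory.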
Upload folder using huggingface_hub
- README.md +34 -0
- chat_template.jinja +26 -0
- config.json +38 -0
- generation_config.json +6 -0
- model-00001-of-00004.safetensors +3 -0
- model-00002-of-00004.safetensors +3 -0
- model-00003-of-00004.safetensors +3 -0
- model-00004-of-00004.safetensors +3 -0
- model.safetensors.index.json +0 -0
- tokenizer.json +0 -0
- tokenizer_config.json +14 -0
README.md
ADDED
@@ -0,0 +1,34 @@
+---
+license: other
+license_name: deepseek
+license_link: LICENSE
+tags:
+- mlx
+base_model: deepseek-ai/deepseek-coder-33b-instruct
+---
+
+# sleepy186247/deepseek-coder-33b-instruct-mlx-4Bit
+
+The Model [sleepy186247/deepseek-coder-33b-instruct-mlx-4Bit](https://huggingface.co/sleepy186247/deepseek-coder-33b-instruct-mlx-4Bit) was converted to MLX format from [deepseek-ai/deepseek-coder-33b-instruct](https://huggingface.co/deepseek-ai/deepseek-coder-33b-instruct) using mlx-lm version **0.31.2**.
+
+## Use with mlx
+
+```bash
+pip install mlx-lm
+```
+
+```python
+from mlx_lm import load, generate
+
+model, tokenizer = load("sleepy186247/deepseek-coder-33b-instruct-mlx-4Bit")
+
+prompt="hello"
+
+if hasattr(tokenizer, "apply_chat_template") and tokenizer.chat_template is not None:
+    messages = [{"role": "user", "content": prompt}]
+    prompt = tokenizer.apply_chat_template(
+        messages, tokenize=False, add_generation_prompt=True
+    )
+
+response = generate(model, tokenizer, prompt=prompt, verbose=True)
+```
chat_template.jinja
ADDED
@@ -0,0 +1,26 @@
+{% if not add_generation_prompt is defined %}
+{% set add_generation_prompt = false %}
+{% endif %}
+{%- set ns = namespace(found=false) -%}
+{%- for message in messages -%}
+{%- if message['role'] == 'system' -%}
+{%- set ns.found = true -%}
+{%- endif -%}
+{%- endfor -%}
+{{bos_token}}{%- if not ns.found -%}
+{{'You are an AI programming assistant, utilizing the Deepseek Coder model, developed by Deepseek Company, and you only answer questions related to computer science. For politically sensitive questions, security and privacy issues, and other non-computer science questions, you will refuse to answer\n'}}
+{%- endif %}
+{%- for message in messages %}
+{%- if message['role'] == 'system' %}
+{{ message['content'] }}
+{%- else %}
+{%- if message['role'] == 'user' %}
+{{'### Instruction:\n' + message['content'] + '\n'}}
+{%- else %}
+{%- endif %}
+{%- endif %}
+{%- endfor %}
+{% if add_generation_prompt %}
+{{'### Response:'}}
+{% endif %}
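To make the control flow of `chat_template.jinja` easier to follow, here is a rough plain-Python sketch of the same branching. It is an approximation, not the renderer the tokenizer uses: `render_prompt` is a hypothetical helper, and whitespace between blocks is simplified (the real Jinja output may, for example, carry a trailing newline after `### Response:` depending on the environment's trim settings).

```python
BOS_TOKEN = "<|begin▁of▁sentence|>"  # value of bos_token in tokenizer_config.json

# Default system prompt, copied from the template; used when no system message is given.
DEFAULT_SYSTEM = (
    "You are an AI programming assistant, utilizing the Deepseek Coder model, "
    "developed by Deepseek Company, and you only answer questions related to "
    "computer science. For politically sensitive questions, security and privacy "
    "issues, and other non-computer science questions, you will refuse to answer\n"
)


def render_prompt(messages, add_generation_prompt=False, bos_token=BOS_TOKEN):
    """Approximate the template's branching; exact whitespace is an assumption."""
    parts = [bos_token]
    # First loop in the template: detect whether any system message exists.
    if not any(m["role"] == "system" for m in messages):
        parts.append(DEFAULT_SYSTEM)
    # Second loop: emit each message in its role-specific format.
    for m in messages:
        if m["role"] == "system":
            parts.append(m["content"])
        elif m["role"] == "user":
            parts.append("### Instruction:\n" + m["content"] + "\n")
        else:
            # Every non-system, non-user role is treated as an assistant turn.
            parts.append("### Response:\n" + m["content"] + "\n<|EOT|>\n")
    if add_generation_prompt:
        parts.append("### Response:")
    return "".join(parts)


if __name__ == "__main__":
    print(render_prompt([{"role": "user", "content": "hello"}], add_generation_prompt=True))
```

Note that the template emits a system message's content as-is, so a custom system prompt most likely needs its own trailing newline to stay separated from the first `### Instruction:` header.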
config.json
ADDED
@@ -0,0 +1,38 @@
+{
+    "architectures": [
+        "LlamaForCausalLM"
+    ],
+    "bos_token_id": 32013,
+    "eos_token_id": 32021,
+    "hidden_act": "silu",
+    "hidden_size": 7168,
+    "initializer_range": 0.02,
+    "intermediate_size": 19200,
+    "max_position_embeddings": 16384,
+    "model_type": "llama",
+    "num_attention_heads": 56,
+    "num_hidden_layers": 62,
+    "num_key_value_heads": 8,
+    "pretraining_tp": 1,
+    "quantization": {
+        "group_size": 64,
+        "bits": 4,
+        "mode": "affine"
+    },
+    "quantization_config": {
+        "group_size": 64,
+        "bits": 4,
+        "mode": "affine"
+    },
+    "rms_norm_eps": 1e-06,
+    "rope_scaling": {
+        "factor": 4.0,
+        "type": "linear"
+    },
+    "rope_theta": 100000,
+    "tie_word_embeddings": false,
+    "torch_dtype": "bfloat16",
+    "transformers_version": "4.33.1",
+    "use_cache": true,
+    "vocab_size": 32256
+}
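A few derived quantities can be read off `config.json`. The sketch below is a back-of-the-envelope estimate, not something shipped in the repo: `derive` is a hypothetical helper, the parameter count assumes the standard Llama layer layout (q/k/v/o projections, a three-matrix gated MLP, two RMSNorm vectors per layer, untied embeddings), and the size estimate assumes affine quantization stores one 16-bit scale and one 16-bit bias per group of 64 weights.

```python
# Fields copied from config.json above.
CONFIG = {
    "hidden_size": 7168,
    "intermediate_size": 19200,
    "num_attention_heads": 56,
    "num_hidden_layers": 62,
    "num_key_value_heads": 8,
    "vocab_size": 32256,
    "bits": 4,
    "group_size": 64,
}


def derive(cfg):
    """Estimate head size, parameter count and quantized size from the config."""
    h = cfg["hidden_size"]
    head_dim = h // cfg["num_attention_heads"]
    kv_dim = head_dim * cfg["num_key_value_heads"]
    # Query heads that share each key/value head (grouped-query attention).
    gqa_group = cfg["num_attention_heads"] // cfg["num_key_value_heads"]

    attention = 2 * h * h + 2 * h * kv_dim   # q_proj and o_proj, k_proj and v_proj
    mlp = 3 * h * cfg["intermediate_size"]   # gate, up and down projections
    norms = 2 * h                            # two RMSNorm weight vectors per layer
    per_layer = attention + mlp + norms
    embeddings = 2 * cfg["vocab_size"] * h   # input embedding plus untied lm_head
    params = cfg["num_hidden_layers"] * per_layer + embeddings + h  # + final norm

    # Assumption: 16-bit scale and 16-bit bias per quantization group.
    bits_per_weight = cfg["bits"] + 2 * 16 / cfg["group_size"]
    approx_bytes = params * bits_per_weight / 8
    return {
        "head_dim": head_dim,
        "gqa_group": gqa_group,
        "params": params,
        "bits_per_weight": bits_per_weight,
        "approx_bytes": approx_bytes,
    }


if __name__ == "__main__":
    for key, value in derive(CONFIG).items():
        print(f"{key}: {value}")
```

Under these assumptions the estimate lands at about 33.3 billion parameters and roughly 18.8 GB, which is close to the combined size of the four safetensors shards in this commit.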
generation_config.json
ADDED
@@ -0,0 +1,6 @@
+{
+    "_from_model_config": true,
+    "bos_token_id": 32013,
+    "eos_token_id": 32021,
+    "transformers_version": "4.34.1"
+}
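`generation_config.json` repeats several keys from `config.json`, so a quick consistency check can be useful. The sketch below uses a hypothetical helper, `diff_shared_keys`, and only the overlapping fields visible in this commit; it reports keys present in both files whose values differ.

```python
# Overlapping fields copied from the two files in this commit.
MODEL_CONFIG = {
    "bos_token_id": 32013,
    "eos_token_id": 32021,
    "transformers_version": "4.33.1",
}
GENERATION_CONFIG = {
    "_from_model_config": True,
    "bos_token_id": 32013,
    "eos_token_id": 32021,
    "transformers_version": "4.34.1",
}


def diff_shared_keys(a, b):
    """Return {key: (a_value, b_value)} for keys in both dicts with unequal values."""
    return {k: (a[k], b[k]) for k in a.keys() & b.keys() if a[k] != b[k]}


if __name__ == "__main__":
    print(diff_shared_keys(MODEL_CONFIG, GENERATION_CONFIG))
```

Here the token ids agree and only the recorded `transformers_version` differs between the two files.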
model-00001-of-00004.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f5936e469def9ef6b3a23be68ce8be51c42a35c62bef8618fbbae85956c985ab
+size 5345224438
model-00002-of-00004.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:67330ad8ba405aee5d954d9b980c487f338d15f483ca0dd78db851af768d9157
+size 5365725669
model-00003-of-00004.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1ce6aab5ad7bbfe08f4356e05a2ae3f593b5dd873c696710fba21c8542d567d7
+size 5365725651
model-00004-of-00004.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4c79dc6da7694f3ce0e34784b1f66d487726d10e0eba207d9f4c14e5fcb58972
+size 2680209536
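Each shard is committed as a Git LFS pointer: a `version` line, an `oid` with the SHA-256 of the real file, and its `size` in bytes. The sketch below shows one way a downloaded shard could be checked against its pointer. The helper names `parse_lfs_pointer` and `matches_pointer` are illustrative, and the code assumes the three-field pointer layout shown in the diffs above.

```python
import hashlib
from pathlib import Path

# Pointer text for the last shard, copied from the diff above.
POINTER_SHARD_4 = """version https://git-lfs.github.com/spec/v1
oid sha256:4c79dc6da7694f3ce0e34784b1f66d487726d10e0eba207d9f4c14e5fcb58972
size 2680209536
"""

# Sizes of the four shards, in bytes, as listed in their pointers.
SHARD_SIZES = [5345224438, 5365725669, 5365725651, 2680209536]


def parse_lfs_pointer(text):
    """Split 'key value' lines into a dict with the hash algorithm, digest and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest the pointer records."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Hash in chunks so multi-gigabyte shards are never loaded fully into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


if __name__ == "__main__":
    print(parse_lfs_pointer(POINTER_SHARD_4))
    print("total bytes:", sum(SHARD_SIZES))
```

Summing `SHARD_SIZES` gives 18,756,885,294 bytes, the approximate disk space and download volume to plan for.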
model.safetensors.index.json
ADDED
The diff for this file is too large to render.
tokenizer.json
ADDED
The diff for this file is too large to render.
tokenizer_config.json
ADDED
@@ -0,0 +1,14 @@
+{
+    "add_prefix_space": null,
+    "backend": "tokenizers",
+    "bos_token": "<|begin▁of▁sentence|>",
+    "clean_up_tokenization_spaces": false,
+    "eos_token": "<|EOT|>",
+    "is_local": true,
+    "model_max_length": 16384,
+    "pad_token": "<|end▁of▁sentence|>",
+    "sp_model_kwargs": {},
+    "tokenizer_class": "LlamaTokenizer",
+    "unk_token": null,
+    "use_default_system_prompt": false
+}
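`tokenizer_config.json` sets `eos_token` to `<|EOT|>`, the same marker the chat template appends after each assistant turn. If raw decoded text ever includes that marker, a small post-processing step can cut the reply at it. The sketch below is a hypothetical helper, `strip_after_eos`, and is not part of mlx-lm, which normally stops generation at the end-of-sequence token on its own.

```python
EOS_TOKEN = "<|EOT|>"  # value of eos_token in tokenizer_config.json


def strip_after_eos(text, eos_token=EOS_TOKEN):
    """Keep only the text before the first end-of-turn marker."""
    head, _, _ = text.partition(eos_token)
    # The template places a newline before the marker, so drop trailing newlines.
    return head.rstrip("\n")


if __name__ == "__main__":
    print(strip_after_eos("print('hi')\n<|EOT|>\n### Instruction:\n"))
```

Text without the marker passes through unchanged apart from trailing newlines.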