Weights download timing out
#11
by shoang - opened
Hello,
Thank you for releasing this model. I'm trying to download it for use with Hugging Face's TGI, but the download keeps timing out, specifically on model-00002-of-000004.safetensors:
2024-09-22 10:08:31 2024-09-22T14:08:31.371971Z INFO text_generation_launcher: Download file: model-00002-of-000004.safetensors
2024-09-22 10:09:11 2024-09-22T14:09:11.939903Z ERROR download: text_generation_launcher: Download encountered an error:
2024-09-22 10:09:11 2024-09-22 14:02:50.330 | INFO | text_generation_server.utils.import_utils:<module>:75 - Detected system cuda
2024-09-22 10:09:11 Traceback (most recent call last):
2024-09-22 10:09:11
2024-09-22 10:09:11 File "/opt/conda/lib/python3.10/site-packages/urllib3/connection.py", line 196, in _new_conn
2024-09-22 10:09:11 sock = connection.create_connection(
2024-09-22 10:09:11
2024-09-22 10:09:11 File "/opt/conda/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection
2024-09-22 10:09:11 raise err
2024-09-22 10:09:11
2024-09-22 10:09:11 File "/opt/conda/lib/python3.10/site-packages/urllib3/util/connection.py", line 73, in create_connection
2024-09-22 10:09:11 sock.connect(sa)
2024-09-22 10:09:11
2024-09-22 10:09:11 TimeoutError: timed out
2024-09-22 10:09:11
2024-09-22 10:09:11
2024-09-22 10:09:11 The above exception was the direct cause of the following exception:
2024-09-22 10:09:11
2024-09-22 10:09:11
2024-09-22 10:09:11 Traceback (most recent call last):
2024-09-22 10:09:11
2024-09-22 10:09:11 File "/opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py", line 789, in urlopen
2024-09-22 10:09:11 response = self._make_request(
2024-09-22 10:09:11
2024-09-22 10:09:11 File "/opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py", line 490, in _make_request
2024-09-22 10:09:11 raise new_e
2024-09-22 10:09:11
2024-09-22 10:09:11 File "/opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py", line 466, in _make_request
2024-09-22 10:09:11 self._validate_conn(conn)
2024-09-22 10:09:11
2024-09-22 10:09:11 File "/opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py", line 1095, in _validate_conn
2024-09-22 10:09:11 conn.connect()
2024-09-22 10:09:11
2024-09-22 10:09:11 File "/opt/conda/lib/python3.10/site-packages/urllib3/connection.py", line 615, in connect
2024-09-22 10:09:11 self.sock = sock = self._new_conn()
2024-09-22 10:09:11
2024-09-22 10:09:11 File "/opt/conda/lib/python3.10/site-packages/urllib3/connection.py", line 205, in _new_conn
2024-09-22 10:09:11 raise ConnectTimeoutError(
2024-09-22 10:09:11
2024-09-22 10:09:11 urllib3.exceptions.ConnectTimeoutError: (<urllib3.connection.HTTPSConnection object at 0x7f2a1828bf10>, 'Connection to cdn-lfs-us-1.huggingface.co timed out. (connect timeout=10)')
2024-09-22 10:09:11
2024-09-22 10:09:11
2024-09-22 10:09:11 The above exception was the direct cause of the following exception:
2024-09-22 10:09:11
2024-09-22 10:09:11
2024-09-22 10:09:11 Traceback (most recent call last):
2024-09-22 10:09:11
2024-09-22 10:09:11 File "/opt/conda/lib/python3.10/site-packages/requests/adapters.py", line 667, in send
2024-09-22 10:09:11 resp = conn.urlopen(
2024-09-22 10:09:11
2024-09-22 10:09:11 File "/opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py", line 843, in urlopen
2024-09-22 10:09:11 retries = retries.increment(
2024-09-22 10:09:11
2024-09-22 10:09:11 File "/opt/conda/lib/python3.10/site-packages/urllib3/util/retry.py", line 519, in increment
2024-09-22 10:09:11 raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
2024-09-22 10:09:11
2024-09-22 10:09:11 urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='cdn-lfs-us-1.huggingface.co', port=443): Max retries exceeded with url: /repos/52/1a/521a03f0138aeb342a78392fe59380bb4d231b66fb1129c8de47406f69b8dcea/7bf22dfa271527f7a0b8dbd56592722cd8fdcfeb6aad32ebb1110d21882eb1d8?response-content-disposition=inline%3B+filename*%3DUTF-8%27%27model-00002-of-000004.safetensors%3B+filename%3D%22model-00002-of-000004.safetensors%22%3B&Expires=1727273311&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTcyNzI3MzMxMX19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy11cy0xLmh1Z2dpbmdmYWNlLmNvL3JlcG9zLzUyLzFhLzUyMWEwM2YwMTM4YWViMzQyYTc4MzkyZmU1OTM4MGJiNGQyMzFiNjZmYjExMjljOGRlNDc0MDZmNjliOGRjZWEvN2JmMjJkZmEyNzE1MjdmN2EwYjhkYmQ1NjU5MjcyMmNkOGZkY2ZlYjZhYWQzMmViYjExMTBkMjE4ODJlYjFkOD9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSoifV19&Signature=KryNhe8PCNtgnZKofTQU5Ho09c4Z31h~2UmbP5C2fwTzWv8bawKuucyIYSwU2XBUoHInukjhKBLQWePd-p2SWrV0s5~oheAgfm9zys18gOuSDo1229YXbIZgZ-DsoU5n2eZNwPIPZ0Fae5~DTNBf5WR~WCetpeSlvt-sl6k-2tso2LclXQMroN~Q9HAYD8vi1tSHibXnTvrNEOlU8DPbyjwvjQH90lFyIP5J2iKf-L4GekWpYIIZ33Xx3L06iGM5mw90LqgeMkbhSwuVaZ437~FUD6CQ3OzYBfbIFHZhN-YDpoq2aevWBAcI6aaImGQhiV75nb~6mcS~mTNt1yr~~A__&Key-Pair-Id=K24J24Z295AEI9 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f2a1828bf10>, 'Connection to cdn-lfs-us-1.huggingface.co timed out. (connect timeout=10)'))
2024-09-22 10:09:11
2024-09-22 10:09:11
2024-09-22 10:09:11 During handling of the above exception, another exception occurred:
2024-09-22 10:09:11
2024-09-22 10:09:11
2024-09-22 10:09:11 Traceback (most recent call last):
2024-09-22 10:09:11
2024-09-22 10:09:11 File "/opt/conda/bin/text-generation-server", line 8, in <module>
2024-09-22 10:09:11 sys.exit(app())
2024-09-22 10:09:11
2024-09-22 10:09:11 File "/opt/conda/lib/python3.10/site-packages/text_generation_server/cli.py", line 225, in download_weights
2024-09-22 10:09:11 utils.download_weights(filenames, model_id, revision)
2024-09-22 10:09:11
2024-09-22 10:09:11 File "/opt/conda/lib/python3.10/site-packages/text_generation_server/utils/hub.py", line 264, in download_weights
2024-09-22 10:09:11 file = download_file(filename)
2024-09-22 10:09:11
2024-09-22 10:09:11 File "/opt/conda/lib/python3.10/site-packages/text_generation_server/utils/hub.py", line 254, in download_file
2024-09-22 10:09:11 raise e
2024-09-22 10:09:11
2024-09-22 10:09:11 File "/opt/conda/lib/python3.10/site-packages/text_generation_server/utils/hub.py", line 242, in download_file
2024-09-22 10:09:11 local_file = hf_hub_download(
2024-09-22 10:09:11
2024-09-22 10:09:11 File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
2024-09-22 10:09:11 return fn(*args, **kwargs)
2024-09-22 10:09:11
2024-09-22 10:09:11 File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1221, in hf_hub_download
2024-09-22 10:09:11 return _hf_hub_download_to_cache_dir(
2024-09-22 10:09:11
2024-09-22 10:09:11 File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1367, in _hf_hub_download_to_cache_dir
2024-09-22 10:09:11 _download_to_tmp_and_move(
2024-09-22 10:09:11
2024-09-22 10:09:11 File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1884, in _download_to_tmp_and_move
2024-09-22 10:09:11 http_get(
2024-09-22 10:09:11
2024-09-22 10:09:11 File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 459, in http_get
2024-09-22 10:09:11 r = _request_wrapper(
2024-09-22 10:09:11
2024-09-22 10:09:11 File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 395, in _request_wrapper
2024-09-22 10:09:11 response = get_session().request(method=method, url=url, **params)
2024-09-22 10:09:11
2024-09-22 10:09:11 File "/opt/conda/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
2024-09-22 10:09:11 resp = self.send(prep, **send_kwargs)
2024-09-22 10:09:11
2024-09-22 10:09:11 File "/opt/conda/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
2024-09-22 10:09:11 r = adapter.send(request, **kwargs)
2024-09-22 10:09:11
2024-09-22 10:09:11 File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_http.py", line 66, in send
2024-09-22 10:09:11 return super().send(request, *args, **kwargs)
2024-09-22 10:09:11
2024-09-22 10:09:11 File "/opt/conda/lib/python3.10/site-packages/requests/adapters.py", line 688, in send
2024-09-22 10:09:11 raise ConnectTimeout(e, request=request)
2024-09-22 10:09:11
2024-09-22 10:09:11 requests.exceptions.ConnectTimeout: (MaxRetryError("HTTPSConnectionPool(host='cdn-lfs-us-1.huggingface.co', port=443): Max retries exceeded with url: /repos/52/1a/521a03f0138aeb342a78392fe59380bb4d231b66fb1129c8de47406f69b8dcea/7bf22dfa271527f7a0b8dbd56592722cd8fdcfeb6aad32ebb1110d21882eb1d8?response-content-disposition=inline%3B+filename*%3DUTF-8%27%27model-00002-of-000004.safetensors%3B+filename%3D%22model-00002-of-000004.safetensors%22%3B&Expires=1727273311&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTcyNzI3MzMxMX19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy11cy0xLmh1Z2dpbmdmYWNlLmNvL3JlcG9zLzUyLzFhLzUyMWEwM2YwMTM4YWViMzQyYTc4MzkyZmU1OTM4MGJiNGQyMzFiNjZmYjExMjljOGRlNDc0MDZmNjliOGRjZWEvN2JmMjJkZmEyNzE1MjdmN2EwYjhkYmQ1NjU5MjcyMmNkOGZkY2ZlYjZhYWQzMmViYjExMTBkMjE4ODJlYjFkOD9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSoifV19&Signature=KryNhe8PCNtgnZKofTQU5Ho09c4Z31h~2UmbP5C2fwTzWv8bawKuucyIYSwU2XBUoHInukjhKBLQWePd-p2SWrV0s5~oheAgfm9zys18gOuSDo1229YXbIZgZ-DsoU5n2eZNwPIPZ0Fae5~DTNBf5WR~WCetpeSlvt-sl6k-2tso2LclXQMroN~Q9HAYD8vi1tSHibXnTvrNEOlU8DPbyjwvjQH90lFyIP5J2iKf-L4GekWpYIIZ33Xx3L06iGM5mw90LqgeMkbhSwuVaZ437~FUD6CQ3OzYBfbIFHZhN-YDpoq2aevWBAcI6aaImGQhiV75nb~6mcS~mTNt1yr~~A__&Key-Pair-Id=K24J24Z295AEI9 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f2a1828bf10>, 'Connection to cdn-lfs-us-1.huggingface.co timed out. (connect timeout=10)'))"), '(Request ID: 629dc7e7-2b4a-493f-962c-7f6485efed78)')
2024-09-22 10:09:11
2024-09-22 10:09:11 Error: DownloadError
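The traceback ends in a `requests.exceptions.ConnectTimeout` with `connect timeout=10`, raised from `hf_hub_download`. One possible workaround, sketched here under assumptions and not a confirmed fix, is to fetch the failing shard into the local cache ahead of time, with a longer timeout and a few retries. The helper names `retry_with_backoff` and `download_shard`, the 60-second timeout, and the retry counts are all illustrative choices. `HF_HUB_DOWNLOAD_TIMEOUT` is an environment variable read by `huggingface_hub`; whether the version inside a given TGI image honours it is an assumption to check.

```python
import os
import time

# Assumption: huggingface_hub reads this variable when it is first imported,
# so it has to be set before the import below runs. 60 is an arbitrary value.
os.environ.setdefault("HF_HUB_DOWNLOAD_TIMEOUT", "60")


def retry_with_backoff(fn, attempts=5, base_delay=2.0, sleep=time.sleep):
    """Call fn() until it succeeds, waiting base_delay * 2**i after failure i."""
    last_error = None
    for i in range(attempts):
        try:
            return fn()
        except OSError as err:
            # requests' ConnectTimeout and the built-in TimeoutError both
            # derive from OSError, so this covers the errors in the log above.
            last_error = err
            if i < attempts - 1:
                sleep(base_delay * 2 ** i)
    raise last_error


def download_shard(filename, repo_id="deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct"):
    """Hypothetical helper: download one file of the repo into the local cache."""
    from huggingface_hub import hf_hub_download  # imported late on purpose

    return hf_hub_download(repo_id=repo_id, filename=filename)


# Example call (needs network access and the huggingface_hub package):
# path = retry_with_backoff(
#     lambda: download_shard("model-00002-of-000004.safetensors")
# )
```

If the shard ends up in the cache directory that the TGI container mounts, the launcher should not need to download it again. If every attempt still times out, the problem is more likely network reachability of `cdn-lfs-us-1.huggingface.co` (proxy, firewall, DNS) than the timeout value.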