Update README.md
README.md CHANGED
@@ -14,7 +14,7 @@ tasks:
## Model Description

- CodeFuse-DeepSeek-33B-4bits is the 4-bit quantized version of [CodeFuse-DeepSeek-33B](https://

After undergoing 4-bit quantization, the CodeFuse-DeepSeek-33B-4bits model can be loaded on either a single A10 (24GB VRAM) or an RTX 4090 (24GB VRAM). Moreover, the quantized model still achieves an impressive accuracy of 78.05% on the HumanEval pass@1 metric.

@@ -34,9 +34,9 @@ After undergoing 4-bit quantization, the CodeFuse-DeepSeek-33B-4bits model can b
🔥🔥 2023-09-27 CodeFuse-StarCoder-15B has been released, achieving a pass@1 (greedy decoding) score of 54.9% on HumanEval, a gain of about 21 percentage points over StarCoder's 33.6%.

- 🔥🔥🔥 2023-09-26 We are pleased to announce the release of the [4-bit quantized version](https://

- 🔥🔥🔥 2023-09-11 [CodeFuse-CodeLlama34B](https://

<br>

@@ -209,7 +209,7 @@ if __name__ == "__main__":
## Model Introduction

- CodeFuse-DeepSeek-33B-4bits is the 4-bit quantized version of the code LLM [CodeFuse-DeepSeek-33B](https://

After 4-bit quantization, CodeFuse-DeepSeek-33B-4bits can be loaded on a single A10 (24GB VRAM) or RTX 4090 (24GB VRAM). The quantized model still reaches 78.05% on HumanEval pass@1.
<br>

@@ -228,9 +228,9 @@ CodeFuse-DeepSeek-33B-4bits is the 4-bit quantized version of the code LLM [CodeFuse-DeepSeek-33B](https://mod
🔥🔥 2023-09-27 CodeFuse-StarCoder-15B was open-sourced, reaching 54.9% on HumanEval pass@1 (greedy decoding), about 21 percentage points higher than StarCoder on HumanEval.

- 🔥🔥🔥 2023-09-26 [CodeFuse-CodeLlama-34B 4bits](https://

- 🔥🔥🔥 2023-09-11 [CodeFuse-CodeLlama-34B](https://

<br>

## Model Description

+ CodeFuse-DeepSeek-33B-4bits is the 4-bit quantized version of [CodeFuse-DeepSeek-33B](https://huggingface.co/codefuse-ai/CodeFuse-DeepSeek-33B), a 33B Code-LLM fine-tuned with QLoRA on multiple code-related tasks on top of the base model DeepSeek-Coder-33B.

After undergoing 4-bit quantization, the CodeFuse-DeepSeek-33B-4bits model can be loaded on either a single A10 (24GB VRAM) or an RTX 4090 (24GB VRAM). Moreover, the quantized model still achieves an impressive accuracy of 78.05% on the HumanEval pass@1 metric.

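The claim that the quantized model fits on a 24GB card can be sanity-checked with back-of-the-envelope arithmetic. The sketch below is only an estimate under stated assumptions: it takes the nominal "33B" figure as exactly 33e9 parameters, counts raw weight storage only, and ignores quantization metadata (scales, zero points), the KV cache, and activations, all of which add to real usage. The function name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of the raw weights in GiB (weights only, no overhead)."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


# Assumption: the nominal "33B" is treated as exactly 33e9 parameters.
NUM_PARAMS = 33e9
# Assumption: the card's "24GB" is treated as 24 GiB for a rough comparison.
VRAM_GIB = 24

fp16_gib = weight_memory_gib(NUM_PARAMS, 16)
int4_gib = weight_memory_gib(NUM_PARAMS, 4)

print(f"fp16 weights:  {fp16_gib:.1f} GiB (fits in {VRAM_GIB} GiB: {fp16_gib < VRAM_GIB})")
print(f"4-bit weights: {int4_gib:.1f} GiB (fits in {VRAM_GIB} GiB: {int4_gib < VRAM_GIB})")
```

Under these assumptions the 4-bit weights take roughly a quarter of the fp16 footprint, which leaves some headroom on a 24GB card for the overheads the estimate leaves out.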
🔥🔥 2023-09-27 CodeFuse-StarCoder-15B has been released, achieving a pass@1 (greedy decoding) score of 54.9% on HumanEval, a gain of about 21 percentage points over StarCoder's 33.6%.

+ 🔥🔥🔥 2023-09-26 We are pleased to announce the release of the [4-bit quantized version](https://huggingface.co/codefuse-ai/CodeFuse-CodeLlama-34B-4bits) of [CodeFuse-CodeLlama-34B](https://modelscope.cn/models/codefuse-ai/CodeFuse-CodeLlama-34B/summary). Despite quantization, the model still achieves 73.8% accuracy (greedy decoding) on the HumanEval pass@1 metric.

+ 🔥🔥🔥 2023-09-11 [CodeFuse-CodeLlama34B](https://huggingface.co/codefuse-ai/CodeFuse-CodeLlama-34B-4bits) has achieved 74.4% pass@1 (greedy decoding) on HumanEval, which is the SOTA result for open-source LLMs at present.

<br>

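The announcements above compare several pass@1 scores, and a gap between two percentages can be read either as an absolute difference (percentage points) or as a relative change. The short sketch below computes both for the figures quoted in the README; the helper names are hypothetical and the scores are taken as stated, not re-measured.

```python
def absolute_gain(new: float, old: float) -> float:
    """Difference in percentage points between two scores given in percent."""
    return round(new - old, 2)


def relative_gain(new: float, old: float) -> float:
    """Relative change of `new` over `old`, in percent."""
    return round((new - old) / old * 100, 1)


# CodeFuse-StarCoder-15B (54.9%) versus StarCoder (33.6%), as quoted in the README.
starcoder_points = absolute_gain(54.9, 33.6)
starcoder_relative = relative_gain(54.9, 33.6)

# CodeFuse-CodeLlama-34B before (74.4%) and after (73.8%) 4-bit quantization.
quantization_cost = absolute_gain(73.8, 74.4)

print(f"StarCoder gap: {starcoder_points} points ({starcoder_relative}% relative)")
print(f"Quantization cost: {quantization_cost} points")
```

Read this way, the "21%" in the StarCoder announcement matches the absolute gap in percentage points rather than the relative improvement, and 4-bit quantization of CodeFuse-CodeLlama-34B costs well under one point.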
## Model Introduction

+ CodeFuse-DeepSeek-33B-4bits is the 4-bit quantized version of the code LLM [CodeFuse-DeepSeek-33B](https://huggingface.co/codefuse-ai/CodeFuse-DeepSeek-33B), which was fine-tuned from the base model DeepSeek-Coder-33B on multiple code-related tasks using the MFTCoder framework.

After 4-bit quantization, CodeFuse-DeepSeek-33B-4bits can be loaded on a single A10 (24GB VRAM) or RTX 4090 (24GB VRAM). The quantized model still reaches 78.05% on HumanEval pass@1.
<br>

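With greedy decoding there is one completion per problem, so pass@1 reduces to the fraction of problems whose completion passes its tests. The sketch below converts between a score and a solved-problem count, assuming the standard HumanEval set of 164 problems; the solved counts it prints are inferred from the published percentages, not reported by the authors, and the function names are hypothetical.

```python
HUMANEVAL_PROBLEMS = 164  # size of the standard HumanEval benchmark


def pass_at_1(solved: int, total: int = HUMANEVAL_PROBLEMS) -> float:
    """Greedy-decoding pass@1 in percent: share of problems solved on the single attempt."""
    return solved / total * 100


def solved_from_score(score_percent: float, total: int = HUMANEVAL_PROBLEMS) -> int:
    """Nearest whole number of solved problems implied by a pass@1 score."""
    return round(score_percent / 100 * total)


# Scores quoted in the README; the implied counts are an inference, not source data.
for name, score in [
    ("CodeFuse-DeepSeek-33B-4bits", 78.05),
    ("CodeFuse-CodeLlama-34B", 74.4),
    ("CodeFuse-CodeLlama-34B 4bits", 73.8),
]:
    solved = solved_from_score(score)
    print(f"{name}: ~{solved}/{HUMANEVAL_PROBLEMS} solved ({pass_at_1(solved):.2f}%)")
```

This also shows the granularity of the metric: each HumanEval problem is worth about 0.61 points, so small score differences correspond to one or two problems.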
🔥🔥 2023-09-27 CodeFuse-StarCoder-15B was open-sourced, reaching 54.9% on HumanEval pass@1 (greedy decoding), about 21 percentage points higher than StarCoder on HumanEval.

+ 🔥🔥🔥 2023-09-26 The quantized [CodeFuse-CodeLlama-34B 4bits](https://huggingface.co/codefuse-ai/CodeFuse-CodeLlama-34B-4bits) was released; after quantization the model scores 73.8% on HumanEval pass@1 (greedy decoding).

+ 🔥🔥🔥 2023-09-11 [CodeFuse-CodeLlama-34B](https://huggingface.co/codefuse-ai/CodeFuse-CodeLlama-34B) was released, reaching 74.4% on HumanEval pass@1 (greedy decoding), the current open-source SOTA.

<br>
