chencyudel committed on Jan 16, 2024

Commit 3d5d811 (verified) · 1 Parent(s): 22172bc

Update README.md

Files changed (1)

README.md +8 -7

@@ -248,8 +248,13 @@ CodeFuse-DeepSeek-33B is, via QLoRA on the base model DeepSeek-Coder-33B, ...
 | Model                       | HumanEval(pass@1) |  Date   |
 |:----------------------------|:-----------------:|:-------:|
 | **CodeFuse-CodeLlama-34B**  |     74.4%      | 2023.9  |
 |**CodeFuse-CodeLlama-34B-4bits** |     73.8%  |  2023.9 |
 | WizardCoder-Python-34B-V1.0 |       73.2%       | 2023.8  |
 | GPT-4(zero-shot)            |       67.0%       | 2023.3  |
 | PanGu-Coder2 15B            |       61.6%       | 2023.8  |
@@ -258,11 +263,7 @@ CodeFuse-DeepSeek-33B is, via QLoRA on the base model DeepSeek-Coder-33B, ...
 | GPT-3.5(zero-shot)          |       48.1%       | 2022.11 |
 | OctoCoder                   |       46.2%       | 2023.8  |
 | StarCoder-15B               |       33.6%       | 2023.5  |
-| Qwen-14b               |       32.3%       | 2023.10  |
-| **CodeFuse-StarCoder-15B**  |     54.9%     | 2023.9  |
-| **CodeFuse-QWen-14B**       |     48.78%     | 2023.8 |
-| **CodeFuse-CodeGeeX2-6B**   |     45.12%    | 2023.11 |
-| **CodeFuse-DeepSeek-33B**.  |     **78.65%**    | 2024.01 |
@@ -287,11 +288,11 @@ System instruction
 <s>human
 Human 1st round input
 <s>bot
-Bot 1st round output<｜end▁of▁sentence｜>
 <s>human
 Human 2nd round input
 <s>bot
-Bot 2nd round output<｜end▁of▁sentence｜>
 ...
 ...
 ...

 | Model                       | HumanEval(pass@1) |  Date   |
 |:----------------------------|:-----------------:|:-------:|
+| **CodeFuse-DeepSeek-33B**   |     **78.65%**    | 2024.01 |
+| **CodeFuse-Mixtral-8x7B**   |     **56.10%**    | 2024.01 |
 | **CodeFuse-CodeLlama-34B**  |     74.4%      | 2023.9  |
 |**CodeFuse-CodeLlama-34B-4bits** |     73.8%  |  2023.9 |
+| **CodeFuse-StarCoder-15B**  |     54.9%         | 2023.9  |
+| **CodeFuse-QWen-14B**       |     48.78%        | 2023.10 |
+| **CodeFuse-CodeGeeX2-6B**   |     45.12%        | 2023.11 |
 | WizardCoder-Python-34B-V1.0 |       73.2%       | 2023.8  |
 | GPT-4(zero-shot)            |       67.0%       | 2023.3  |
 | PanGu-Coder2 15B            |       61.6%       | 2023.8  |
 | GPT-3.5(zero-shot)          |       48.1%       | 2022.11 |
 | OctoCoder                   |       46.2%       | 2023.8  |
 | StarCoder-15B               |       33.6%       | 2023.5  |
+| Qwen-14b                    |       32.3%       | 2023.10 |
 <s>human
 Human 1st round input
 <s>bot
+Bot 1st round output</s>
 <s>human
 Human 2nd round input
 <s>bot
+Bot 2nd round output</s>
 ...
 ...
 ...
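
The second hunk changes the end-of-turn marker after each bot output to `</s>`. As a rough illustration of the resulting multi-round layout, the sketch below assembles a prompt string from completed rounds plus a new human input. The function name `build_prompt`, the `history` structure, and the choice to end the prompt with an open `<s>bot` turn are assumptions made for this example and are not taken from the README; the system-instruction part of the format is not shown in the diff and is left out.

```python
# Role markers and end-of-turn token as they appear in the updated README lines.
HUMAN_TAG = "<s>human\n"
BOT_TAG = "<s>bot\n"
EOS = "</s>"


def build_prompt(history, next_input):
    """Assemble a multi-round prompt (hypothetical helper).

    history:    list of (human_input, bot_output) pairs for completed rounds.
    next_input: the new human message the model should answer.
    """
    parts = []
    for human_text, bot_text in history:
        # Each completed round closes the bot output with the EOS token.
        parts.append(f"{HUMAN_TAG}{human_text}\n{BOT_TAG}{bot_text}{EOS}\n")
    # Assumption: leave the final bot turn open so the model continues from it.
    parts.append(f"{HUMAN_TAG}{next_input}\n{BOT_TAG}")
    return "".join(parts)


if __name__ == "__main__":
    prompt = build_prompt(
        [("Human 1st round input", "Bot 1st round output")],
        "Human 2nd round input",
    )
    print(prompt)
```

Keeping the markers as constants makes it easy to adapt the sketch if the tokenizer's actual special tokens differ from the literal strings shown in the diff.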