Instructions to use cloudyu/Mixtral_34Bx2_MoE_60B with libraries, inference providers, notebooks, and local apps.

Libraries

How to use cloudyu/Mixtral_34Bx2_MoE_60B with Transformers:

```
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="cloudyu/Mixtral_34Bx2_MoE_60B")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("cloudyu/Mixtral_34Bx2_MoE_60B")
model = AutoModelForCausalLM.from_pretrained("cloudyu/Mixtral_34Bx2_MoE_60B")
```

Local Apps

vLLM

How to use cloudyu/Mixtral_34Bx2_MoE_60B with vLLM:

Install from pip and serve model

```
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "cloudyu/Mixtral_34Bx2_MoE_60B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "cloudyu/Mixtral_34Bx2_MoE_60B",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
```

Use Docker

```
docker model run hf.co/cloudyu/Mixtral_34Bx2_MoE_60B
```

SGLang

How to use cloudyu/Mixtral_34Bx2_MoE_60B with SGLang:

Install from pip and serve model

```
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "cloudyu/Mixtral_34Bx2_MoE_60B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "cloudyu/Mixtral_34Bx2_MoE_60B",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
```

Use Docker images

```
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "cloudyu/Mixtral_34Bx2_MoE_60B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "cloudyu/Mixtral_34Bx2_MoE_60B",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
```

Docker Model Runner
How to use cloudyu/Mixtral_34Bx2_MoE_60B with Docker Model Runner:
```
docker model run hf.co/cloudyu/Mixtral_34Bx2_MoE_60B
```

Should not be called Mixtral; the models made into the MoE are Yi-based

by teknium - opened Jan 7, 2024

Discussion

teknium

Jan 7, 2024

Mixtral is a whole other base model lol

NeuralNovel

Jan 7, 2024

I'm with teknium, this name could be misleading.

Yhyu13

Jan 7, 2024 (edited Jan 7, 2024)

Yup, could simply be Yi-34Bx2-MoE, but it's ok

jreoka

Jan 7, 2024

It does use the Mixtral method though, so there is a half-truth to it

stolsvik

Jan 8, 2024 (edited Jan 8, 2024)

I agree. Mixtral is a specific model by Mistral AI, and it is very confusing when you name all your models this way.
Your models are Mixture of Experts ("MoE") models, and the model Mixtral has nothing to do with them (other than Mixtral also using a MoE approach, which was obviously their reason for calling it Mixtral, a pun on their name, Mistral, and Mixture).
Very interesting models, though, but please change your naming scheme!!

gblazex

Jan 8, 2024

It's easy to ask for a rename with Weyaxi's renamer tool:
https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-renamer

Just enter your repo name and HF token and it'll generate a pull request for the leaderboard name change.

cloudyu

Owner Jan 8, 2024

The reason it is called Mixtral is that the model is based on the MixtralForCausalLM architecture, as you can see in the config file.

"architectures": [ "MixtralForCausalLM" ].

I haven’t thought of a new name yet.
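
As a minimal sketch of the check described above: the `architectures` field in a repository's `config.json` names the Transformers model class used to load the weights. The config text below is a trimmed, hypothetical stand-in (only the `architectures` entry is quoted in this thread; the `model_type` value is assumed for illustration), and the helper names `declared_architectures` and `uses_mixtral_class` are invented here.

```python
import json

# Trimmed, hypothetical stand-in for the repository's config.json.
# Only the "architectures" entry is quoted in this thread; the
# "model_type" field is an assumption added for illustration.
SAMPLE_CONFIG = """
{
  "architectures": ["MixtralForCausalLM"],
  "model_type": "mixtral"
}
"""


def declared_architectures(config_text: str) -> list[str]:
    """Return the model classes a config.json declares (empty if absent)."""
    config = json.loads(config_text)
    return list(config.get("architectures") or [])


def uses_mixtral_class(config_text: str) -> bool:
    """True when the config names the MixtralForCausalLM loading class."""
    return "MixtralForCausalLM" in declared_architectures(config_text)


if __name__ == "__main__":
    print(declared_architectures(SAMPLE_CONFIG))
    print(uses_mixtral_class(SAMPLE_CONFIG))
```

A check like this only reports which loading code the checkpoint targets. It says nothing about which base model the expert weights came from, which is the distinction this thread is debating.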

NeuralNovel

Jan 8, 2024

I think you should call it cloud9 :D

SamuelAzran

Jan 10, 2024

> The reason it is called Mixtral is that the model is based on the MixtralForCausalLM architecture, as you can see in the config file.
>
> "architectures": [ "MixtralForCausalLM" ].
>
> I haven’t thought of a new name yet.

How about one of the following names:
Yi-Mixtral_34Bx2_MoE_60B
MixYi-34Bx2_MoE_60B
MiYi-34Bx2_MoE_60B
Yi-34Bx2_MoE_60B

The name "Mixtral" implies a Mistral-based mixture of experts.

Regardless of the name, we'd love to learn more about your process. The results look extremely promising.

MaziyarPanahi

Feb 13, 2024

I like Yi-34Bx2_MoE_60B, short and represents everything this model has to offer
