This should not be called Mixtral; the models that were merged into the MoE are Yi-based.
Mixtral is a whole other base model lol
I'm with teknium, this name could be misleading.
Yup, could simply be Yi-34Bx2-MoE, but it's ok
It does use the Mixtral method though, so there is a half-truth to it
I agree. Mixtral is a specific model by Mistral AI, and it is very confusing when you name all your models this way.
Your models are Mixture of Experts ("MoE") models, and the Mixtral model has nothing to do with them (other than Mixtral also using an MoE approach, which was obviously their reason for calling it Mixtral, a pun on their name Mistral and "Mixture").
Very interesting models, though - but please change your naming scheme!!
It's easy to ask for a rename with Weyaxi's renamer tool:
https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-renamer
Just enter your repo name and HF token and it'll generate a pull request for leaderboard name change.
The reason it is called Mixtral is that the model is based on the MixtralForCausalLM architecture, as you can see in the config file:
"architectures": [ "MixtralForCausalLM" ]
I haven’t thought of a new name yet.
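For anyone who wants to check this themselves, here is a minimal sketch of reading the "architectures" entry out of a config.json. The sample string is a hypothetical excerpt containing only the key quoted above, and the helper name `declared_architectures` is made up for illustration; a real config file has many more fields.

```python
import json

# Hypothetical excerpt of a config.json; only the "architectures" key
# and its value are taken from the post above.
SAMPLE_CONFIG = '{"architectures": ["MixtralForCausalLM"]}'


def declared_architectures(config_text: str) -> list[str]:
    """Return the model classes listed under "architectures" (empty if absent)."""
    config = json.loads(config_text)
    return list(config.get("architectures", []))


print(declared_architectures(SAMPLE_CONFIG))
```

Note that this only tells you which model class loads the weights. It says nothing about which base models the experts came from, which is the point being argued in this thread.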
I think you should call it cloud9 :D
How about one of the following names:
Yi-Mixtral_34Bx2_MoE_60B
MixYi-34Bx2_MoE_60B
MiYi-34Bx2_MoE_60B
Yi-34Bx2_MoE_60B
The name "Mixtral" implies a Mistral-based mixture-of-experts model.
Regardless of the name, we'd love to learn more about your process. The results look extremely promising.
I like Yi-34Bx2_MoE_60B: it's short and represents everything this model has to offer.