Instructions to use cloudyu/mistral_28B_instruct_v0.2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use cloudyu/mistral_28B_instruct_v0.2 with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="cloudyu/mistral_28B_instruct_v0.2")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("cloudyu/mistral_28B_instruct_v0.2")
model = AutoModelForCausalLM.from_pretrained("cloudyu/mistral_28B_instruct_v0.2")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use cloudyu/mistral_28B_instruct_v0.2 with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "cloudyu/mistral_28B_instruct_v0.2"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "cloudyu/mistral_28B_instruct_v0.2",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker
docker model run hf.co/cloudyu/mistral_28B_instruct_v0.2
- SGLang
How to use cloudyu/mistral_28B_instruct_v0.2 with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "cloudyu/mistral_28B_instruct_v0.2" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "cloudyu/mistral_28B_instruct_v0.2",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "cloudyu/mistral_28B_instruct_v0.2" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "cloudyu/mistral_28B_instruct_v0.2",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
- Docker Model Runner
How to use cloudyu/mistral_28B_instruct_v0.2 with Docker Model Runner:
docker model run hf.co/cloudyu/mistral_28B_instruct_v0.2
This is a 128-layer model based on the Mistral architecture.
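As a rough sanity check on the "28B" in the model name, the parameter count of a 128-layer Mistral-style decoder can be estimated by hand. The sketch below assumes the per-layer dimensions of Mistral-7B (hidden size 4096, MLP size 14336, 32 attention heads, 8 key/value heads, vocabulary of 32000, untied embeddings). This card does not state those values, so treat them as assumptions and compare against the repository's config.json before relying on the result.

```python
# Back-of-the-envelope parameter count for a Mistral-style decoder.
# ASSUMPTION: all dimensions except the layer count are Mistral-7B's
# defaults; they are not stated in this model card.
NUM_LAYERS = 128            # from the model card
HIDDEN = 4096               # assumed
INTERMEDIATE = 14336        # assumed
NUM_HEADS = 32              # assumed
NUM_KV_HEADS = 8            # assumed (grouped-query attention)
VOCAB = 32000               # assumed

head_dim = HIDDEN // NUM_HEADS
kv_dim = NUM_KV_HEADS * head_dim

# Attention: q and o projections are HIDDEN x HIDDEN, k and v are HIDDEN x kv_dim.
attention = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * kv_dim
# Gated MLP: gate, up, and down projections.
mlp = 3 * HIDDEN * INTERMEDIATE
# Two RMSNorm weight vectors per layer.
norms = 2 * HIDDEN

per_layer = attention + mlp + norms
# Input embeddings, output head (assumed untied), and the final norm.
non_layer = 2 * VOCAB * HIDDEN + HIDDEN

total_params = NUM_LAYERS * per_layer + non_layer
print(f"{total_params / 1e9:.1f}B parameters")
```

Under these assumptions the estimate lands close to 28 billion, which suggests the extra size comes from layer count alone rather than from wider layers.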
It was trained with DPO on the Intel/orca_dpo_pairs dataset.
The prompt template is "{instruction} {inputs} \n {output}".
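A minimal sketch of filling that template by hand, in case the tokenizer's chat template is not what you want. The `build_prompt` helper and the `TEMPLATE` constant are hypothetical names introduced here for illustration. Leaving the `{output}` slot empty at inference time is an assumption about how the template is meant to be used, since the card does not say.

```python
# The template string as given in the model card ("\n" is a real newline here).
TEMPLATE = "{instruction} {inputs} \n {output}"


def build_prompt(instruction: str, inputs: str = "") -> str:
    """Fill the template for inference.

    ASSUMPTION: the output slot is left empty so the model generates it.
    """
    return TEMPLATE.format(instruction=instruction, inputs=inputs, output="")


prompt = build_prompt("write a story about yosemite.")
print(repr(prompt))
```

The resulting string can be passed to a tokenizer or a text-generation pipeline as a plain prompt. Whether the stray spaces around the newline matter to output quality is not documented, so the sketch keeps them exactly as written in the template.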
Example output:
<s> write a story about yosemite.
Once upon a time, in the heart of California's Sierra Nevada Mountains, there was a land known as Yosemite National Park. This magical place was home to some of nature's most breathtaking wonders, from its towering redwood trees to its crystal clear streams and icy alpine lakes.
In this enchanted land lived a family of animals that called Yosemite their home. There were grizzly bears roaming free through the forests, while coyotes howled at the moon on clear night. A family of mules had made their home among the cliffs, while a group of rabbits danced and frolicked in the meadow.
One day, a young doe named Bella set out on an adventure to explore the wondrous landscapes of Yosemite. She followed the path of the river upstream, her eyes fixed on the top of the mountain where she could see the first glints of snow. As she climbed higher and higher, she met other animals on her journey - a family of foxes, a family of beavers, even a family of owls! They all shared stories of their adventures in Yosemite, and Bella listened with amazement.
Eventually, Bella reached the top of the mountain, where she was greeted by the most stunning view she had ever seen. Below her, the valley spread out like a green sea, filled with every kind of plant life and animal species. The sun set over the horizon, painting the sky in shades of pink and orange.
Bella knew that she would never forget this moment, or the magic of Yosemite. She took one last look before starting her journey back down the mountain, her heart filled with gratitude for the wonders of nature that surrounded her. From then on, Bella explored Yosemite every day, always finding something new and amazing around every corner. And so it continued, year after year, as Yosemite remained a land of magic and wonder, a paradise for all who dared to explore its secrets.</s>