Instructions to use Open-Orca/Mistral-7B-OpenOrca with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use Open-Orca/Mistral-7B-OpenOrca with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Open-Orca/Mistral-7B-OpenOrca")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Open-Orca/Mistral-7B-OpenOrca")
model = AutoModelForCausalLM.from_pretrained("Open-Orca/Mistral-7B-OpenOrca")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt tokens
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Inference
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use Open-Orca/Mistral-7B-OpenOrca with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "Open-Orca/Mistral-7B-OpenOrca"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Open-Orca/Mistral-7B-OpenOrca",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker
docker model run hf.co/Open-Orca/Mistral-7B-OpenOrca
- SGLang
How to use Open-Orca/Mistral-7B-OpenOrca with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "Open-Orca/Mistral-7B-OpenOrca" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Open-Orca/Mistral-7B-OpenOrca",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "Open-Orca/Mistral-7B-OpenOrca" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Open-Orca/Mistral-7B-OpenOrca",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
- Docker Model Runner
How to use Open-Orca/Mistral-7B-OpenOrca with Docker Model Runner:
docker model run hf.co/Open-Orca/Mistral-7B-OpenOrca
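The curl calls shown for the vLLM and SGLang servers can also be made from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `chat` are hypothetical, and it assumes a server is already running on the ports used in the commands above (8000 for vLLM, 30000 for SGLang) and returns a standard OpenAI-style chat completion response.

```python
import json
import urllib.request

MODEL_ID = "Open-Orca/Mistral-7B-OpenOrca"


def build_chat_request(base_url, prompt, model=MODEL_ID):
    """Build a POST request for an OpenAI-compatible chat completions endpoint.

    Hypothetical helper: mirrors the JSON body used in the curl examples.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(base_url, prompt, model=MODEL_ID, timeout=60):
    """Send the prompt to the server and return the assistant's reply text."""
    request = build_chat_request(base_url, prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-style responses carry the reply in choices[0].message.content.
    return body["choices"][0]["message"]["content"]


# Example calls (each requires the corresponding server to be running):
# print(chat("http://localhost:8000", "What is the capital of France?"))   # vLLM
# print(chat("http://localhost:30000", "What is the capital of France?"))  # SGLang
```

Building the request separately from sending it keeps the payload easy to inspect before any network call is made.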
Model makes different inferences in different envs · 👀 1 · #32 opened almost 2 years ago by ayseozgun
How to drop the stop token from the response? · #31 opened about 2 years ago by mattma1970
Has anyone tried to perform batch inference with model? · #30 opened over 2 years ago by xnaxi
Dataset filtering · #29 opened over 2 years ago by mchochowski
Not following system prompt · 👍 1 · #28 opened over 2 years ago by wehapi
Multi-round and other samples of code and documentation · #27 opened over 2 years ago by decunde
Adding `safetensors` variant of this model · #26 opened over 2 years ago by SFconvertbot
How is this model multi-round? · 1 · #25 opened over 2 years ago by timlim123
[AUTOMATED] Model Memory Requirements · #22 opened over 2 years ago by model-sizer-bot
"OutOfMemoryError: CUDA out of memory" · 1 · #21 opened over 2 years ago by Anuraag-pal
Update README.md · #20 opened over 2 years ago by Vinhad0914
Problem with streaming support · 5 · #17 opened over 2 years ago by mattma1970
Why is Open Orca trained to say a fact isn't true just because it can't find said fact? · 2 · #16 opened over 2 years ago by deleted
Does your fine-tuning process overfit? · 2 · #15 opened over 2 years ago by jiaxiangc
Fix typo in chat Template · #13 opened over 2 years ago by Ichsan2895
Not able to display numbered tables · #12 opened over 2 years ago by Hyperion-js
Not able to launch using TGI in Sagemaker · #11 opened over 2 years ago by aastha6
LangChain promt template · #10 opened over 2 years ago by fissium
ChatML prompt format problems · 👍 4 · 3 · #7 opened over 2 years ago by kalomaze
Free and ready to use Mistral-7B-OpenOrca-GGUF model as OpenAI API compatible endpoint · 🤗 1 · #6 opened over 2 years ago by limcheekin
Specs for inference · 5 · #5 opened over 2 years ago by mzhadigerov
Can’t get to work in inference endpoints · 2 · #3 opened over 2 years ago by joeofportland
I'm getting error : <unc> set to 0 in the tokenizer config · 6 · #2 opened over 2 years ago by Tonic