Instructions to use upstage/solar-pro-preview-instruct with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use upstage/solar-pro-preview-instruct with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="upstage/solar-pro-preview-instruct", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("upstage/solar-pro-preview-instruct", trust_remote_code=True, dtype="auto")
```

- Notebooks
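The pipeline handles chat templating automatically; for more control you can go through the tokenizer's chat template yourself. A minimal sketch (the `build_messages` helper and the 256-token limit are illustrative choices, not part of the official example):

```python
def build_messages(user_content, system_content=None):
    """Assemble an OpenAI-style message list for the chat template."""
    messages = []
    if system_content:
        messages.append({"role": "system", "content": system_content})
    messages.append({"role": "user", "content": user_content})
    return messages

def generate(prompt):
    # Heavy imports deferred so the helper above stays dependency-free.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("upstage/solar-pro-preview-instruct")
    model = AutoModelForCausalLM.from_pretrained(
        "upstage/solar-pro-preview-instruct", trust_remote_code=True, dtype="auto"
    )
    # Render the chat template and append the assistant-turn prompt.
    inputs = tokenizer.apply_chat_template(
        build_messages(prompt), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
```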
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use upstage/solar-pro-preview-instruct with vLLM:
Install from pip and serve the model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "upstage/solar-pro-preview-instruct"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "upstage/solar-pro-preview-instruct",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```

Use Docker
```shell
docker model run hf.co/upstage/solar-pro-preview-instruct
```
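The vLLM server above can also be called from Python; a minimal stdlib-only sketch, no `openai` package required (the `chat_payload` and `ask` helper names are illustrative):

```python
import json
import urllib.request

def chat_payload(model, user_msg):
    # Same request body as the curl example above.
    return {"model": model, "messages": [{"role": "user", "content": user_msg}]}

def ask(url, model, user_msg):
    # Network call; requires the vLLM server from above to be running.
    req = urllib.request.Request(
        url,
        data=json.dumps(chat_payload(model, user_msg)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# ask("http://localhost:8000/v1/chat/completions",
#     "upstage/solar-pro-preview-instruct",
#     "What is the capital of France?")
```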
- SGLang
How to use upstage/solar-pro-preview-instruct with SGLang:
Install from pip and serve the model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "upstage/solar-pro-preview-instruct" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "upstage/solar-pro-preview-instruct",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```

Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "upstage/solar-pro-preview-instruct" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "upstage/solar-pro-preview-instruct",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```

- Docker Model Runner
How to use upstage/solar-pro-preview-instruct with Docker Model Runner:
```shell
docker model run hf.co/upstage/solar-pro-preview-instruct
```
API access - add /models endpoint.
This is a request to add a `/models` endpoint to your API (https://console.upstage.ai/).
Why?
Many existing UI clients call `/models` first to discover which model name to use for chat completions.
For example, I'm an open-webui user. It's not possible to integrate your API into it, because your backend does not implement the `/models` endpoint, so open-webui cannot determine which model name to use.
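For context, the OpenAI-compatible `GET /v1/models` response that clients like open-webui expect looks roughly like this (the `solar-pro` id and `owned_by` value are illustrative placeholders, not confirmed Upstage names):

```python
import json

# Illustrative shape of an OpenAI-style model-list response.
models_response = {
    "object": "list",
    "data": [
        {"id": "solar-pro", "object": "model", "owned_by": "upstage"},
    ],
}

# UI clients read data[*].id to populate their model picker,
# then pass that id as "model" in chat-completion requests.
model_ids = [m["id"] for m in models_response["data"]]
print(json.dumps(model_ids))
```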
HF is probably not the place for such a request, but I hope you can forward it to your backend team.
PS. I tried the model locally and it works great. The 4K context is a shame, but it's more than enough for quick questions.
Cheers!
Hi @antonhugs! I'm Nayeon from Upstage.
I'm glad to hear that you are enjoying using Solar. What use case are you working on?
Solar is available on Ollama, so you might consider using it there. Thank you for your suggestion; we will discuss it internally. 🤗
I'm using it with vLLM, but I'd rather use an API than run it myself.
My use cases are simple one-off questions: programming questions, rephrasing support replies, and general info questions.
What I like about the model is that it sticks to the system prompt better than anything else I've tried. E.g., when I say to limit responses to 2 sentences, it does so, while most other LLMs over-explain themselves.
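The behavior described above maps onto an OpenAI-style request with a system message; a tiny sketch (the exact system-prompt wording and question are illustrative):

```python
# The system message constrains response style; the user message carries the question.
messages = [
    {"role": "system", "content": "Limit every response to 2 sentences."},
    {"role": "user", "content": "Explain what vLLM does."},
]
payload = {"model": "upstage/solar-pro-preview-instruct", "messages": messages}
```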