Instructions to use upstage/solar-pro-preview-instruct with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use upstage/solar-pro-preview-instruct with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="upstage/solar-pro-preview-instruct", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("upstage/solar-pro-preview-instruct", trust_remote_code=True, dtype="auto")
```

- Notebooks
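The pipeline handles chat templating automatically; for more control you can go through the tokenizer's chat template yourself. A minimal sketch (the `build_messages` helper and the 256-token limit are illustrative choices, not part of the official example):

```python
def build_messages(user_content, system_content=None):
    """Assemble an OpenAI-style message list for the chat template."""
    messages = []
    if system_content:
        messages.append({"role": "system", "content": system_content})
    messages.append({"role": "user", "content": user_content})
    return messages

def generate(prompt):
    # Heavy imports deferred so the helper above stays dependency-free.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("upstage/solar-pro-preview-instruct")
    model = AutoModelForCausalLM.from_pretrained(
        "upstage/solar-pro-preview-instruct", trust_remote_code=True, dtype="auto"
    )
    # Render the chat template and append the assistant-turn prompt.
    inputs = tokenizer.apply_chat_template(
        build_messages(prompt), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
```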
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use upstage/solar-pro-preview-instruct with vLLM:
Install from pip and serve the model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "upstage/solar-pro-preview-instruct"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "upstage/solar-pro-preview-instruct",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```

Use Docker
```shell
docker model run hf.co/upstage/solar-pro-preview-instruct
```
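The vLLM server above can also be called from Python; a minimal stdlib-only sketch, no `openai` package required (the `chat_payload` and `ask` helper names are illustrative):

```python
import json
import urllib.request

def chat_payload(model, user_msg):
    # Same request body as the curl example above.
    return {"model": model, "messages": [{"role": "user", "content": user_msg}]}

def ask(url, model, user_msg):
    # Network call; requires the vLLM server from above to be running.
    req = urllib.request.Request(
        url,
        data=json.dumps(chat_payload(model, user_msg)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# ask("http://localhost:8000/v1/chat/completions",
#     "upstage/solar-pro-preview-instruct",
#     "What is the capital of France?")
```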
- SGLang
How to use upstage/solar-pro-preview-instruct with SGLang:
Install from pip and serve the model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "upstage/solar-pro-preview-instruct" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "upstage/solar-pro-preview-instruct",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```

Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "upstage/solar-pro-preview-instruct" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "upstage/solar-pro-preview-instruct",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```

- Docker Model Runner
How to use upstage/solar-pro-preview-instruct with Docker Model Runner:
```shell
docker model run hf.co/upstage/solar-pro-preview-instruct
```
API access - add /models endpoint.
This is a request to add a `/models` endpoint to your API (https://console.upstage.ai/).
Why?
Many existing UI clients call `/models` first to discover which model name to use for chat completions.
For example, I'm an open-webui user. It's not possible to integrate your API into it, because your backend does not implement the `/models` endpoint, so open-webui cannot determine which model name to use.
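For context, the OpenAI-compatible `GET /v1/models` response that clients like open-webui expect looks roughly like this (the `solar-pro` id and `owned_by` value are illustrative placeholders, not confirmed Upstage names):

```python
import json

# Illustrative shape of an OpenAI-style model-list response.
models_response = {
    "object": "list",
    "data": [
        {"id": "solar-pro", "object": "model", "owned_by": "upstage"},
    ],
}

# UI clients read data[*].id to populate their model picker,
# then pass that id as "model" in chat-completion requests.
model_ids = [m["id"] for m in models_response["data"]]
print(json.dumps(model_ids))
```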
HF is probably not the place for such a request, but I hope you can forward it to your backend team.
PS. I tried the model locally and it works great. The 4K context is a shame, but it's more than enough for quick questions.
Cheers!
Hi @antonhugs! I'm Nayeon from Upstage.
I'm glad to hear that you are enjoying using Solar. What use case are you working on?
Solar is available on Ollama, so you might consider using it there. Thank you for your suggestion; we will discuss it internally. 🤗
I'm using it with vLLM, but I'd rather use an API than run it myself.
My use cases are simple one-off questions: programming questions, rephrasing support replies, and general info questions.
What I like about the model is that it sticks to the system prompt better than anything else I've tried. E.g., when I say to limit responses to 2 sentences, it does so, while most other LLMs over-explain themselves.
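The behavior described above maps onto an OpenAI-style request with a system message; a tiny sketch (the exact system-prompt wording and question are illustrative):

```python
# The system message constrains response style; the user message carries the question.
messages = [
    {"role": "system", "content": "Limit every response to 2 sentences."},
    {"role": "user", "content": "Explain what vLLM does."},
]
payload = {"model": "upstage/solar-pro-preview-instruct", "messages": messages}
```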