Instructions for using petals-team/StableBeluga2 with libraries and local apps.

Libraries

How to use petals-team/StableBeluga2 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="petals-team/StableBeluga2")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("petals-team/StableBeluga2")
model = AutoModelForCausalLM.from_pretrained("petals-team/StableBeluga2")
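
The snippets above load the model but stop short of generating text. A minimal sketch of the next step follows. The helper names `sampling_kwargs` and `complete` are hypothetical, and the defaults (512 new tokens, temperature 0.5) are assumptions borrowed from the curl examples on this page, not recommended settings. The Transformers import is deferred so the helper for building arguments can be used without loading the weights.

```python
def sampling_kwargs(max_new_tokens: int = 512, temperature: float = 0.5) -> dict:
    """Build keyword arguments for `model.generate` (hypothetical helper).

    A temperature of 0 is mapped to greedy decoding, because sampling
    with temperature 0 is not meaningful.
    """
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    kwargs = {"max_new_tokens": max_new_tokens}
    if temperature > 0:
        kwargs["do_sample"] = True
        kwargs["temperature"] = temperature
    else:
        kwargs["do_sample"] = False
    return kwargs


def complete(prompt: str, model_id: str = "petals-team/StableBeluga2", **overrides) -> str:
    """Load the model and return the prompt plus its continuation.

    Loading happens on every call here for brevity; a real script would
    load the tokenizer and model once and reuse them.
    """
    # Imported lazily so that importing this module stays cheap.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, **sampling_kwargs(**overrides))
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `complete("Once upon a time,")` downloads the full checkpoint on first use, so it is only practical on a machine with enough memory for the weights.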

Local Apps

vLLM

How to use petals-team/StableBeluga2 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "petals-team/StableBeluga2"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "petals-team/StableBeluga2",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
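
The same request can be sent from Python. The sketch below mirrors the curl call using only the standard library. The function names `build_completion_request` and `send` are hypothetical, and the base URL assumes the server is listening on localhost port 8000 as in the curl example.

```python
import json
import urllib.request


def build_completion_request(
    prompt: str,
    base_url: str = "http://localhost:8000",
    model: str = "petals-team/StableBeluga2",
    max_tokens: int = 512,
    temperature: float = 0.5,
) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 600.0) -> dict:
    """Send the request and return the decoded JSON body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the server running, `send(build_completion_request("Once upon a time,"))` returns the JSON response as a dictionary. Passing a different `base_url` points the same code at another OpenAI-compatible server.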

Use Docker

docker model run hf.co/petals-team/StableBeluga2

SGLang

How to use petals-team/StableBeluga2 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "petals-team/StableBeluga2" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "petals-team/StableBeluga2",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
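
The curl call prints a JSON document rather than plain text. A minimal sketch for pulling the generated text out of it is shown below, assuming the server follows the OpenAI completions response shape (a `choices` list whose entries carry a `text` field). The function name `completion_texts` is hypothetical, and the sample dictionary is a hand-written placeholder, not real server output.

```python
def completion_texts(response: dict) -> list[str]:
    """Return the generated text of every choice in a completions response."""
    choices = response.get("choices")
    if not choices:
        # Error responses typically carry no choices; surface that clearly.
        raise ValueError(f"response contains no choices: {response!r}")
    return [choice["text"] for choice in choices]


# Hand-written placeholder with the assumed shape (not real output).
sample = {"choices": [{"index": 0, "text": " <generated text>"}]}
print(completion_texts(sample))
```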

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "petals-team/StableBeluga2" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "petals-team/StableBeluga2",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
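
A container that has just started may not accept requests right away, since the model has to load first. The sketch below polls the published port until a TCP connection succeeds. The function name `wait_for_port` and the timeout values are assumptions. An open port only shows that the server process is listening, not necessarily that the model has finished loading.

```python
import socket
import time


def wait_for_port(
    host: str = "localhost",
    port: int = 30000,
    timeout: float = 300.0,
    interval: float = 1.0,
) -> bool:
    """Return True once a TCP connection succeeds, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A script can call `wait_for_port()` after `docker run` and send the first completion request only once it returns `True`.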

Docker Model Runner

How to use petals-team/StableBeluga2 with Docker Model Runner:

```
docker model run hf.co/petals-team/StableBeluga2
```

StableBeluga2

Commit History

Fix safetensors metadata (5e87035), borzunov committed on Aug 23, 2023
Add missing tokenizer files (6c06ad7), borzunov committed on Aug 23, 2023
Explain changes and their meaning for Petals (0cc4f70), borzunov committed on Aug 23, 2023
Minimize diff with original configs (69185c0), borzunov committed on Aug 23, 2023
Upload resharded weights (5a20760), borzunov committed on Aug 23, 2023
Delete old shards (019b4d8), borzunov committed on Aug 23, 2023
Use original readme + changelog (42a755a), borzunov committed on Aug 20, 2023
Copy licenses from the original repo (205cb82), borzunov committed on Aug 20, 2023
:x (210a573), j879 committed on Aug 13, 2023
add json files (1116dd6), j879 committed on Aug 13, 2023
update dht prefix (f3703b0), j879 committed on Aug 13, 2023
add bf16 bin weight file (76303f9), j879 committed on Aug 13, 2023
add weight files (7a3f408), j879 committed on Aug 13, 2023
add dht prefix in config (335cdc1), j879 committed on Aug 13, 2023
add weight file (0c28311), j879 committed on Aug 12, 2023
Merge branch 'main' of https://huggingface.co/j879/StableBeluga2-bf16 (e428330), j879 committed on Aug 12, 2023
add json files (2e7dcd0), j879 committed on Aug 12, 2023
Update README.md (a439bba), j879 committed on Aug 12, 2023
Update README.md (103f077), j879 committed on Aug 12, 2023
Create README.md (72ce34f), j879 committed on Aug 12, 2023
initial commit (2142dbb), j879 committed on Aug 12, 2023
