Instructions for using codellama/CodeLlama-7b-hf with libraries, inference providers, notebooks, and local apps. Follow the links below to get started.
- Libraries
- Transformers
How to use codellama/CodeLlama-7b-hf with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="codellama/CodeLlama-7b-hf")
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-hf")
model = AutoModelForCausalLM.from_pretrained("codellama/CodeLlama-7b-hf")
```

- Inference
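The pipeline call returns a list of dicts with a `generated_text` field that echoes the prompt back; a minimal sketch of stripping the prompt out again (the output is mocked here so the snippet runs without downloading the 7B checkpoint):

```python
def extract_completion(outputs, prompt):
    """Return the generated continuation with the echoed prompt removed."""
    text = outputs[0]["generated_text"]  # first (and usually only) sequence
    return text[len(prompt):] if text.startswith(prompt) else text

# Mocked pipeline output; a real call would be: outputs = pipe("def fib(n):")
prompt = "def fib(n):"
outputs = [{"generated_text": "def fib(n):\n    return n if n < 2 else fib(n - 1) + fib(n - 2)"}]
print(extract_completion(outputs, prompt))
```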
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use codellama/CodeLlama-7b-hf with vLLM:
Install from pip and serve the model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "codellama/CodeLlama-7b-hf"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "codellama/CodeLlama-7b-hf",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker

```shell
docker model run hf.co/codellama/CodeLlama-7b-hf
```
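The `/v1/completions` call above returns JSON in the OpenAI completions schema; a sketch of pulling the generated text out of a reply (the response body is mocked below, and its exact fields assume a standard OpenAI-compatible server):

```python
import json

# Trimmed, mocked response; a live vLLM server fills in real ids, text, and counts.
response_body = """{
  "object": "text_completion",
  "model": "codellama/CodeLlama-7b-hf",
  "choices": [{"index": 0, "text": " there was a compiler.", "finish_reason": "length"}],
  "usage": {"prompt_tokens": 5, "completion_tokens": 6, "total_tokens": 11}
}"""

response = json.loads(response_body)
completion = response["choices"][0]["text"]  # generated text lives under choices[0]
print(completion)
```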
- SGLang
How to use codellama/CodeLlama-7b-hf with SGLang:
Install from pip and serve the model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "codellama/CodeLlama-7b-hf" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "codellama/CodeLlama-7b-hf",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker images

```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "codellama/CodeLlama-7b-hf" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "codellama/CodeLlama-7b-hf",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

- Docker Model Runner
How to use codellama/CodeLlama-7b-hf with Docker Model Runner:
```shell
docker model run hf.co/codellama/CodeLlama-7b-hf
```
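The servers above all expose the same OpenAI-compatible completions route, so the curl calls translate directly to the Python standard library; a sketch using `urllib` (the port is taken from the SGLang example, and the helper simply returns `None` when no server is listening):

```python
import json
import urllib.request

payload = {
    "model": "codellama/CodeLlama-7b-hf",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5,
}

def complete(url, payload, timeout=5):
    """POST a completion request; return parsed JSON, or None if unreachable."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return json.loads(resp.read())
    except OSError:  # connection refused, timeout, bad gateway, etc.
        return None

# Returns the response dict if a server from the steps above is running, else None.
result = complete("http://localhost:30000/v1/completions", payload)
```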
delete unused checkpoints (#12)
- delete unused checkpoints (7f53ce00ce7ec921b6c69a47dbae369cbf0a37ca)
- Delete pytorch_model-00001-of-00006.bin (60970c639e823afddd6f5988c26dc7314d83e7bb)
- Delete pytorch_model-00002-of-00006.bin (2372c9b91d1c328e03246eecbc4804875fb478a6)
- Delete pytorch_model-00003-of-00006.bin (34d6f3879bf185a4814b0205258439fdf3614c94)
- Delete pytorch_model-00004-of-00006.bin (acf0537632a52172c13e09908315326260472e0f)
- Delete pytorch_model-00005-of-00006.bin (ca72bca3339ab20ee9a54c5de6057069a45e9397)
- Delete pytorch_model-00006-of-00006.bin (c1dcd7c84bb36a4a47fa6642fcb4af78a2a517e5)
- Delete pytorch_model-00002-of-00002.bin (06c1b70fdef6e16ffcc7d43f4270df09c9c54394)
- pytorch_model-00001-of-00002.bin +0 -3
- pytorch_model-00001-of-00006.bin +0 -3
- pytorch_model-00002-of-00002.bin +0 -3
- pytorch_model-00002-of-00006.bin +0 -3
- pytorch_model-00003-of-00006.bin +0 -3
- pytorch_model-00004-of-00006.bin +0 -3
- pytorch_model-00005-of-00006.bin +0 -3
- pytorch_model-00006-of-00006.bin +0 -3
Each deleted file was a Git LFS pointer; the removed pointer contents (filenames matched to the file list above, in order) were:

```diff
--- a/pytorch_model-00001-of-00002.bin
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:c4caf89703f85f00c3ebcb8edf783368ee3e2f330a91d8df07328c92ffa66e97
-size 9976751194
--- a/pytorch_model-00001-of-00006.bin
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:e30e70a7eef2d5b1ccc10d6eeac89cd41ff78527fce9fd26430fa00695232668
-size 4840669603
--- a/pytorch_model-00002-of-00002.bin
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:957c532fa86fe4b09a64af2fb52440026ebc6edeff1636f491d860a3addbeeaa
-size 3500441859
--- a/pytorch_model-00002-of-00006.bin
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:0c5a371c7948eece0cd6eef92164dfc7dc74ee70e6250348ab18754dd71410bc
-size 4857218807
--- a/pytorch_model-00003-of-00006.bin
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:494ec74e75ccba58a8351163befff8242b5e8058bc95712449ba685d5d7332f1
-size 4857218807
--- a/pytorch_model-00004-of-00006.bin
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:da0093854fe2d8591ea7dfcba6a59ec944cfee399542c09936c4bb7edcb7643e
-size 4857218807
--- a/pytorch_model-00005-of-00006.bin
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:e04c0f735ed78f0a93921ebf81906c194c9046b9008d34eec7624297c62b4b18
-size 4857218807
--- a/pytorch_model-00006-of-00006.bin
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:4b881b122d174dd6dde8254eb2dbbc7afa9c40ed3c7c1641d08649c6377fa9c0
-size 2684739793
```