Instructions for using codellama/CodeLlama-7b-hf with libraries, inference providers, notebooks, and local apps. Follow the links below to get started.
- Libraries
- Transformers
How to use codellama/CodeLlama-7b-hf with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="codellama/CodeLlama-7b-hf")
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-hf")
model = AutoModelForCausalLM.from_pretrained("codellama/CodeLlama-7b-hf")
```

- Inference
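The pipeline call returns a list of dicts with a `generated_text` field that echoes the prompt back; a minimal sketch of stripping the prompt out again (the output is mocked here so the snippet runs without downloading the 7B checkpoint):

```python
def extract_completion(outputs, prompt):
    """Return the generated continuation with the echoed prompt removed."""
    text = outputs[0]["generated_text"]  # first (and usually only) sequence
    return text[len(prompt):] if text.startswith(prompt) else text

# Mocked pipeline output; a real call would be: outputs = pipe("def fib(n):")
prompt = "def fib(n):"
outputs = [{"generated_text": "def fib(n):\n    return n if n < 2 else fib(n - 1) + fib(n - 2)"}]
print(extract_completion(outputs, prompt))
```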
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use codellama/CodeLlama-7b-hf with vLLM:
Install from pip and serve the model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "codellama/CodeLlama-7b-hf"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "codellama/CodeLlama-7b-hf",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker

```shell
docker model run hf.co/codellama/CodeLlama-7b-hf
```
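The `/v1/completions` call above returns JSON in the OpenAI completions schema; a sketch of pulling the generated text out of a reply (the response body is mocked below, and its exact fields assume a standard OpenAI-compatible server):

```python
import json

# Trimmed, mocked response; a live vLLM server fills in real ids, text, and counts.
response_body = """{
  "object": "text_completion",
  "model": "codellama/CodeLlama-7b-hf",
  "choices": [{"index": 0, "text": " there was a compiler.", "finish_reason": "length"}],
  "usage": {"prompt_tokens": 5, "completion_tokens": 6, "total_tokens": 11}
}"""

response = json.loads(response_body)
completion = response["choices"][0]["text"]  # generated text lives under choices[0]
print(completion)
```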
- SGLang
How to use codellama/CodeLlama-7b-hf with SGLang:
Install from pip and serve the model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "codellama/CodeLlama-7b-hf" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "codellama/CodeLlama-7b-hf",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker images

```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "codellama/CodeLlama-7b-hf" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "codellama/CodeLlama-7b-hf",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

- Docker Model Runner
How to use codellama/CodeLlama-7b-hf with Docker Model Runner:
```shell
docker model run hf.co/codellama/CodeLlama-7b-hf
```
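The servers above all expose the same OpenAI-compatible completions route, so the curl calls translate directly to the Python standard library; a sketch using `urllib` (the port is taken from the SGLang example, and the helper simply returns `None` when no server is listening):

```python
import json
import urllib.request

payload = {
    "model": "codellama/CodeLlama-7b-hf",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5,
}

def complete(url, payload, timeout=5):
    """POST a completion request; return parsed JSON, or None if unreachable."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return json.loads(resp.read())
    except OSError:  # connection refused, timeout, bad gateway, etc.
        return None

# Returns the response dict if a server from the steps above is running, else None.
result = complete("http://localhost:30000/v1/completions", payload)
```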
delete unused checkpoints (#12)
- delete unused checkpoints (7f53ce00ce7ec921b6c69a47dbae369cbf0a37ca)
- Delete pytorch_model-00001-of-00006.bin (60970c639e823afddd6f5988c26dc7314d83e7bb)
- Delete pytorch_model-00002-of-00006.bin (2372c9b91d1c328e03246eecbc4804875fb478a6)
- Delete pytorch_model-00003-of-00006.bin (34d6f3879bf185a4814b0205258439fdf3614c94)
- Delete pytorch_model-00004-of-00006.bin (acf0537632a52172c13e09908315326260472e0f)
- Delete pytorch_model-00005-of-00006.bin (ca72bca3339ab20ee9a54c5de6057069a45e9397)
- Delete pytorch_model-00006-of-00006.bin (c1dcd7c84bb36a4a47fa6642fcb4af78a2a517e5)
- Delete pytorch_model-00002-of-00002.bin (06c1b70fdef6e16ffcc7d43f4270df09c9c54394)
- pytorch_model-00001-of-00002.bin +0 -3
- pytorch_model-00001-of-00006.bin +0 -3
- pytorch_model-00002-of-00002.bin +0 -3
- pytorch_model-00002-of-00006.bin +0 -3
- pytorch_model-00003-of-00006.bin +0 -3
- pytorch_model-00004-of-00006.bin +0 -3
- pytorch_model-00005-of-00006.bin +0 -3
- pytorch_model-00006-of-00006.bin +0 -3
Each deleted file was a Git LFS pointer; the removed pointer contents (filenames matched to the file list above, in order) were:

```diff
--- a/pytorch_model-00001-of-00002.bin
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:c4caf89703f85f00c3ebcb8edf783368ee3e2f330a91d8df07328c92ffa66e97
-size 9976751194
--- a/pytorch_model-00001-of-00006.bin
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:e30e70a7eef2d5b1ccc10d6eeac89cd41ff78527fce9fd26430fa00695232668
-size 4840669603
--- a/pytorch_model-00002-of-00002.bin
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:957c532fa86fe4b09a64af2fb52440026ebc6edeff1636f491d860a3addbeeaa
-size 3500441859
--- a/pytorch_model-00002-of-00006.bin
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:0c5a371c7948eece0cd6eef92164dfc7dc74ee70e6250348ab18754dd71410bc
-size 4857218807
--- a/pytorch_model-00003-of-00006.bin
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:494ec74e75ccba58a8351163befff8242b5e8058bc95712449ba685d5d7332f1
-size 4857218807
--- a/pytorch_model-00004-of-00006.bin
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:da0093854fe2d8591ea7dfcba6a59ec944cfee399542c09936c4bb7edcb7643e
-size 4857218807
--- a/pytorch_model-00005-of-00006.bin
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:e04c0f735ed78f0a93921ebf81906c194c9046b9008d34eec7624297c62b4b18
-size 4857218807
--- a/pytorch_model-00006-of-00006.bin
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:4b881b122d174dd6dde8254eb2dbbc7afa9c40ed3c7c1641d08649c6377fa9c0
-size 2684739793
```