jburtoft committed on Dec 29, 2023

Commit

fc7fdc7

1 Parent(s): a4eaef1

5ea8ae4b66c2b96996e5272ae2b832b94ce3e60aa5e3a41d71612240888e78be

Files changed (23)

checkpoint/pytorch_model.bin/p156.model.layers.17.self_attn.v_proj.weight +3 -0
checkpoint/pytorch_model.bin/p157.model.layers.17.self_attn.o_proj.weight +3 -0
checkpoint/pytorch_model.bin/p158.model.layers.17.mlp.gate_proj.weight +3 -0
checkpoint/pytorch_model.bin/p159.model.layers.17.mlp.up_proj.weight +3 -0
checkpoint/pytorch_model.bin/p16.model.layers.1.mlp.down_proj.weight +3 -0
checkpoint/pytorch_model.bin/p160.model.layers.17.mlp.down_proj.weight +3 -0
checkpoint/pytorch_model.bin/p161.model.layers.17.input_layernorm.weight +3 -0
checkpoint/pytorch_model.bin/p162.model.layers.17.post_attention_layernorm.weight +3 -0
checkpoint/pytorch_model.bin/p163.model.layers.18.self_attn.q_proj.weight +3 -0
checkpoint/pytorch_model.bin/p164.model.layers.18.self_attn.k_proj.weight +3 -0
checkpoint/pytorch_model.bin/p165.model.layers.18.self_attn.v_proj.weight +3 -0
checkpoint/pytorch_model.bin/p166.model.layers.18.self_attn.o_proj.weight +3 -0
checkpoint/pytorch_model.bin/p167.model.layers.18.mlp.gate_proj.weight +3 -0
checkpoint/pytorch_model.bin/p168.model.layers.18.mlp.up_proj.weight +3 -0
checkpoint/pytorch_model.bin/p169.model.layers.18.mlp.down_proj.weight +3 -0
checkpoint/pytorch_model.bin/p17.model.layers.1.input_layernorm.weight +3 -0
checkpoint/pytorch_model.bin/p170.model.layers.18.input_layernorm.weight +3 -0
checkpoint/pytorch_model.bin/p171.model.layers.18.post_attention_layernorm.weight +3 -0
checkpoint/pytorch_model.bin/p172.model.layers.19.self_attn.q_proj.weight +3 -0
checkpoint/pytorch_model.bin/p173.model.layers.19.self_attn.k_proj.weight +3 -0
checkpoint/pytorch_model.bin/p174.model.layers.19.self_attn.v_proj.weight +3 -0
checkpoint/pytorch_model.bin/p175.model.layers.19.self_attn.o_proj.weight +3 -0
checkpoint/pytorch_model.bin/p176.model.layers.19.mlp.gate_proj.weight +3 -0

checkpoint/pytorch_model.bin/p156.model.layers.17.self_attn.v_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:14a4074cc4cefa0ca33550e048f9032efacb033e1663353ac75dd13efbf8cf76
+size 67109765

checkpoint/pytorch_model.bin/p157.model.layers.17.self_attn.o_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:56c49455eafd4fafb2c0056f0a18de82a69f8c703b785f718bc8d733e6ea3d05
+size 67109765

checkpoint/pytorch_model.bin/p158.model.layers.17.mlp.gate_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:6a37bd3157d8c7da781894a4ebabd590f0883854a690dfc8fcd42f229b9d75a9
+size 180355964

checkpoint/pytorch_model.bin/p159.model.layers.17.mlp.up_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f6d5d5d9f188e53fefe29b5801b746def85fdbefbfbb86c9ee45748240fc9c6a
+size 180355958

checkpoint/pytorch_model.bin/p16.model.layers.1.mlp.down_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:98ced61bc43ac6cd2f94543f14db73b190c165814c60ef798f9a781c8c7b1cdf
+size 180355958

checkpoint/pytorch_model.bin/p160.model.layers.17.mlp.down_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:86190d1a40080403d28eb40a7e96913d9e1e48f4998018b9fa81ddf8288e684e
+size 180355964

checkpoint/pytorch_model.bin/p161.model.layers.17.input_layernorm.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:95f4025af369110095a7942bcba5eec6ab79555813c708fe773af47ee7aaf721
+size 17282

checkpoint/pytorch_model.bin/p162.model.layers.17.post_attention_layernorm.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:83b24c4e3caac37d981fc7d70999bde6adf0dff9c279d9011170109fc3b6bc7e
+size 17309

checkpoint/pytorch_model.bin/p163.model.layers.18.self_attn.q_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c132d946183131160102f2aade2021a6ec4e9186ae05ee85ce332ca598339619
+size 67109765

checkpoint/pytorch_model.bin/p164.model.layers.18.self_attn.k_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:bbef6d1f7784bddc0e2322850d67b4726551e8d40c06898621af7150bc644912
+size 67109765

checkpoint/pytorch_model.bin/p165.model.layers.18.self_attn.v_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:216b73287b5827f8ab7a6cd76cfe502206b1559fd6cee7291eb9ecf2cc7fde1d
+size 67109765

checkpoint/pytorch_model.bin/p166.model.layers.18.self_attn.o_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2c3f1132b3073f58e84b9ebdc1cfb74212a6d44caa67ead1290730cc92691d83
+size 67109765

checkpoint/pytorch_model.bin/p167.model.layers.18.mlp.gate_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:fb2be9716f22fc6f073472f3b5ca12fb10cc50fb7139d3b7bcf023afabd43ffb
+size 180355964

checkpoint/pytorch_model.bin/p168.model.layers.18.mlp.up_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:6e3f78334d704fd94e83aeceb09232442615bdd2193d4d45d1fde9fd61e76240
+size 180355958

checkpoint/pytorch_model.bin/p169.model.layers.18.mlp.down_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:5c2ea4c034bae3e302c117d7255a8eb977eae06480f7e002881553b02e9268aa
+size 180355964

checkpoint/pytorch_model.bin/p17.model.layers.1.input_layernorm.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:425da3e5e4c0ee6e1e2d5d71defd97f01358acf6e7a7b3db949ec938ad0d9868
+size 17276

checkpoint/pytorch_model.bin/p170.model.layers.18.input_layernorm.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:be91d1d8789d56759c1c0d4bccecaaf525b5b2187fece145f3471ee162b2d075
+size 17282

checkpoint/pytorch_model.bin/p171.model.layers.18.post_attention_layernorm.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:fdac8e7793cf8f18300bac0d31ab65815b8604a3b0ff34755293fa7584e957f9
+size 17309

checkpoint/pytorch_model.bin/p172.model.layers.19.self_attn.q_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3c6bf5226a8eeefdf8557bdaf5323c3f61a44985833d705e3014c37f3bb589d5
+size 67109765

checkpoint/pytorch_model.bin/p173.model.layers.19.self_attn.k_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d373d69e5a7f4f80fb2ab07f921858446d78131a120d1b0b150d2c7265fadaab
+size 67109765

checkpoint/pytorch_model.bin/p174.model.layers.19.self_attn.v_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f173fdb2cc791a49a4ed918061d4e0fc8590e27a3863556d597f85a9072fa994
+size 67109765

checkpoint/pytorch_model.bin/p175.model.layers.19.self_attn.o_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ad4f55b8794763516db442b0414881e09f3f6c7677b4c3b24031e4e7c1135220
+size 67109765

checkpoint/pytorch_model.bin/p176.model.layers.19.mlp.gate_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:74aabb9caaa2fd68f349767472cf3ff7d5de514197caf4461ff5cce5de00af66
+size 180355964
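
Each file added above is a three-line Git LFS pointer (`version`, `oid`, `size`) rather than the weight data itself. As a rough sketch of how such a pointer could be read programmatically, the snippet below parses one into its fields. The `LfsPointer` type and `parse_lfs_pointer` function are hypothetical names introduced only for illustration; they are not part of this repository or of any Git LFS tooling, and the leading `+` handling assumes the text was copied from a diff view like this one.

```python
from typing import NamedTuple


class LfsPointer(NamedTuple):
    """Fields of a Git LFS pointer file (hypothetical helper type)."""
    version: str   # spec URL, e.g. https://git-lfs.github.com/spec/v1
    oid_algo: str  # hash algorithm named before the colon, e.g. "sha256"
    oid: str       # hex digest of the real file contents
    size: int      # size of the real file in bytes


def parse_lfs_pointer(text: str) -> LfsPointer:
    """Parse the "key value" lines of a Git LFS pointer file."""
    fields = {}
    for line in text.splitlines():
        # Tolerate the "+" prefix that a diff view adds to each line.
        line = line.lstrip("+").strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return LfsPointer(fields["version"], algo, digest, int(fields["size"]))


# Pointer text copied from the first file in this commit.
pointer = parse_lfs_pointer(
    "+version https://git-lfs.github.com/spec/v1\n"
    "+oid sha256:14a4074cc4cefa0ca33550e048f9032efacb033e1663353ac75dd13efbf8cf76\n"
    "+size 67109765\n"
)
print(pointer.oid_algo, pointer.size)
```

After downloading the real file, its byte length and SHA-256 digest could be compared against `pointer.size` and `pointer.oid` to confirm that the download matches the pointer.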