5ea8ae4b66c2b96996e5272ae2b832b94ce3e60aa5e3a41d71612240888e78be
- checkpoint/pytorch_model.bin/p156.model.layers.17.self_attn.v_proj.weight +3 -0
- checkpoint/pytorch_model.bin/p157.model.layers.17.self_attn.o_proj.weight +3 -0
- checkpoint/pytorch_model.bin/p158.model.layers.17.mlp.gate_proj.weight +3 -0
- checkpoint/pytorch_model.bin/p159.model.layers.17.mlp.up_proj.weight +3 -0
- checkpoint/pytorch_model.bin/p16.model.layers.1.mlp.down_proj.weight +3 -0
- checkpoint/pytorch_model.bin/p160.model.layers.17.mlp.down_proj.weight +3 -0
- checkpoint/pytorch_model.bin/p161.model.layers.17.input_layernorm.weight +3 -0
- checkpoint/pytorch_model.bin/p162.model.layers.17.post_attention_layernorm.weight +3 -0
- checkpoint/pytorch_model.bin/p163.model.layers.18.self_attn.q_proj.weight +3 -0
- checkpoint/pytorch_model.bin/p164.model.layers.18.self_attn.k_proj.weight +3 -0
- checkpoint/pytorch_model.bin/p165.model.layers.18.self_attn.v_proj.weight +3 -0
- checkpoint/pytorch_model.bin/p166.model.layers.18.self_attn.o_proj.weight +3 -0
- checkpoint/pytorch_model.bin/p167.model.layers.18.mlp.gate_proj.weight +3 -0
- checkpoint/pytorch_model.bin/p168.model.layers.18.mlp.up_proj.weight +3 -0
- checkpoint/pytorch_model.bin/p169.model.layers.18.mlp.down_proj.weight +3 -0
- checkpoint/pytorch_model.bin/p17.model.layers.1.input_layernorm.weight +3 -0
- checkpoint/pytorch_model.bin/p170.model.layers.18.input_layernorm.weight +3 -0
- checkpoint/pytorch_model.bin/p171.model.layers.18.post_attention_layernorm.weight +3 -0
- checkpoint/pytorch_model.bin/p172.model.layers.19.self_attn.q_proj.weight +3 -0
- checkpoint/pytorch_model.bin/p173.model.layers.19.self_attn.k_proj.weight +3 -0
- checkpoint/pytorch_model.bin/p174.model.layers.19.self_attn.v_proj.weight +3 -0
- checkpoint/pytorch_model.bin/p175.model.layers.19.self_attn.o_proj.weight +3 -0
- checkpoint/pytorch_model.bin/p176.model.layers.19.mlp.gate_proj.weight +3 -0
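The listing above is in plain lexicographic order, which is why `p16` sits between `p159` and `p160`. A minimal sketch of recovering numeric order, assuming the digits after `p` are a shard index (the listing suggests this but does not state it); `split_shard_name` and `SHARD_RE` are hypothetical names introduced for illustration:

```python
import re

# Assumption: file names look like "p<index>.<parameter name>".
SHARD_RE = re.compile(r"^p(\d+)\.(.+)$")


def split_shard_name(name: str) -> tuple[int, str]:
    """Split a shard file name into (numeric index, parameter name)."""
    match = SHARD_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected shard name: {name!r}")
    return int(match.group(1)), match.group(2)


# Three names taken from the listing, in the order they appear there.
names = [
    "p159.model.layers.17.mlp.up_proj.weight",
    "p16.model.layers.1.mlp.down_proj.weight",
    "p160.model.layers.17.mlp.down_proj.weight",
]

# Sort by the numeric index instead of by string comparison.
ordered = sorted(names, key=lambda n: split_shard_name(n)[0])
```

Sorting on the integer index places `p16` ahead of `p159` and `p160`, which string comparison does not.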
checkpoint/pytorch_model.bin/p156.model.layers.17.self_attn.v_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:14a4074cc4cefa0ca33550e048f9032efacb033e1663353ac75dd13efbf8cf76
+size 67109765
checkpoint/pytorch_model.bin/p157.model.layers.17.self_attn.o_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:56c49455eafd4fafb2c0056f0a18de82a69f8c703b785f718bc8d733e6ea3d05
+size 67109765
checkpoint/pytorch_model.bin/p158.model.layers.17.mlp.gate_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6a37bd3157d8c7da781894a4ebabd590f0883854a690dfc8fcd42f229b9d75a9
+size 180355964
checkpoint/pytorch_model.bin/p159.model.layers.17.mlp.up_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f6d5d5d9f188e53fefe29b5801b746def85fdbefbfbb86c9ee45748240fc9c6a
+size 180355958
checkpoint/pytorch_model.bin/p16.model.layers.1.mlp.down_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:98ced61bc43ac6cd2f94543f14db73b190c165814c60ef798f9a781c8c7b1cdf
+size 180355958
checkpoint/pytorch_model.bin/p160.model.layers.17.mlp.down_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:86190d1a40080403d28eb40a7e96913d9e1e48f4998018b9fa81ddf8288e684e
+size 180355964
checkpoint/pytorch_model.bin/p161.model.layers.17.input_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:95f4025af369110095a7942bcba5eec6ab79555813c708fe773af47ee7aaf721
+size 17282
checkpoint/pytorch_model.bin/p162.model.layers.17.post_attention_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:83b24c4e3caac37d981fc7d70999bde6adf0dff9c279d9011170109fc3b6bc7e
+size 17309
checkpoint/pytorch_model.bin/p163.model.layers.18.self_attn.q_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c132d946183131160102f2aade2021a6ec4e9186ae05ee85ce332ca598339619
+size 67109765
checkpoint/pytorch_model.bin/p164.model.layers.18.self_attn.k_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bbef6d1f7784bddc0e2322850d67b4726551e8d40c06898621af7150bc644912
+size 67109765
checkpoint/pytorch_model.bin/p165.model.layers.18.self_attn.v_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:216b73287b5827f8ab7a6cd76cfe502206b1559fd6cee7291eb9ecf2cc7fde1d
+size 67109765
checkpoint/pytorch_model.bin/p166.model.layers.18.self_attn.o_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2c3f1132b3073f58e84b9ebdc1cfb74212a6d44caa67ead1290730cc92691d83
+size 67109765
checkpoint/pytorch_model.bin/p167.model.layers.18.mlp.gate_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fb2be9716f22fc6f073472f3b5ca12fb10cc50fb7139d3b7bcf023afabd43ffb
+size 180355964
checkpoint/pytorch_model.bin/p168.model.layers.18.mlp.up_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6e3f78334d704fd94e83aeceb09232442615bdd2193d4d45d1fde9fd61e76240
+size 180355958
checkpoint/pytorch_model.bin/p169.model.layers.18.mlp.down_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5c2ea4c034bae3e302c117d7255a8eb977eae06480f7e002881553b02e9268aa
+size 180355964
checkpoint/pytorch_model.bin/p17.model.layers.1.input_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:425da3e5e4c0ee6e1e2d5d71defd97f01358acf6e7a7b3db949ec938ad0d9868
+size 17276
checkpoint/pytorch_model.bin/p170.model.layers.18.input_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:be91d1d8789d56759c1c0d4bccecaaf525b5b2187fece145f3471ee162b2d075
+size 17282
checkpoint/pytorch_model.bin/p171.model.layers.18.post_attention_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fdac8e7793cf8f18300bac0d31ab65815b8604a3b0ff34755293fa7584e957f9
+size 17309
checkpoint/pytorch_model.bin/p172.model.layers.19.self_attn.q_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3c6bf5226a8eeefdf8557bdaf5323c3f61a44985833d705e3014c37f3bb589d5
+size 67109765
checkpoint/pytorch_model.bin/p173.model.layers.19.self_attn.k_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d373d69e5a7f4f80fb2ab07f921858446d78131a120d1b0b150d2c7265fadaab
+size 67109765
checkpoint/pytorch_model.bin/p174.model.layers.19.self_attn.v_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f173fdb2cc791a49a4ed918061d4e0fc8590e27a3863556d597f85a9072fa994
+size 67109765
checkpoint/pytorch_model.bin/p175.model.layers.19.self_attn.o_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ad4f55b8794763516db442b0414881e09f3f6c7677b4c3b24031e4e7c1135220
+size 67109765
checkpoint/pytorch_model.bin/p176.model.layers.19.mlp.gate_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:74aabb9caaa2fd68f349767472cf3ff7d5de514197caf4461ff5cce5de00af66
+size 180355964
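Each file added in this commit is a three-line Git LFS pointer (`version`, `oid`, `size`) rather than the weight data itself. A minimal sketch of reading such a pointer, assuming exactly the three keys shown in the diffs; `LfsPointer` and `parse_lfs_pointer` are hypothetical names, not part of any Git LFS tooling:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LfsPointer:
    version: str   # spec URL from the "version" line
    oid_algo: str  # hash algorithm, e.g. "sha256"
    oid: str       # hex digest of the real file content
    size: int      # size of the real file in bytes


def parse_lfs_pointer(text: str) -> LfsPointer:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>"; split on the first space only.
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return LfsPointer(fields["version"], algo, digest, int(fields["size"]))


# Pointer content copied from the last diff in this commit.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:74aabb9caaa2fd68f349767472cf3ff7d5de514197caf4461ff5cce5de00af66
size 180355964
"""

ptr = parse_lfs_pointer(POINTER)
```

After the real file has been downloaded, comparing its byte length with `ptr.size` and its SHA-256 digest with `ptr.oid` is one way to check that the fetch was complete.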