jburtoft committed on Jan 8, 2024

Commit d33f159 (1 parent: 3f6f069)

48a95f9010fed9e9e60c4f55b88af3feca0a43bed38af895fc32e79d3dd4fe8a

Files changed (22)

checkpoint/pytorch_model.bin/p136.model.layers.15.self_attn.q_proj.weight +3 -0
checkpoint/pytorch_model.bin/p137.model.layers.15.self_attn.k_proj.weight +3 -0
checkpoint/pytorch_model.bin/p138.model.layers.15.self_attn.v_proj.weight +3 -0
checkpoint/pytorch_model.bin/p139.model.layers.15.self_attn.o_proj.weight +3 -0
checkpoint/pytorch_model.bin/p14.model.layers.1.mlp.gate_proj.weight +3 -0
checkpoint/pytorch_model.bin/p140.model.layers.15.mlp.gate_proj.weight +3 -0
checkpoint/pytorch_model.bin/p141.model.layers.15.mlp.up_proj.weight +3 -0
checkpoint/pytorch_model.bin/p142.model.layers.15.mlp.down_proj.weight +3 -0
checkpoint/pytorch_model.bin/p143.model.layers.15.input_layernorm.weight +3 -0
checkpoint/pytorch_model.bin/p144.model.layers.15.post_attention_layernorm.weight +3 -0
checkpoint/pytorch_model.bin/p145.model.layers.16.self_attn.q_proj.weight +3 -0
checkpoint/pytorch_model.bin/p146.model.layers.16.self_attn.k_proj.weight +3 -0
checkpoint/pytorch_model.bin/p147.model.layers.16.self_attn.v_proj.weight +3 -0
checkpoint/pytorch_model.bin/p148.model.layers.16.self_attn.o_proj.weight +3 -0
checkpoint/pytorch_model.bin/p149.model.layers.16.mlp.gate_proj.weight +3 -0
checkpoint/pytorch_model.bin/p15.model.layers.1.mlp.up_proj.weight +3 -0
checkpoint/pytorch_model.bin/p150.model.layers.16.mlp.up_proj.weight +3 -0
checkpoint/pytorch_model.bin/p151.model.layers.16.mlp.down_proj.weight +3 -0
checkpoint/pytorch_model.bin/p152.model.layers.16.input_layernorm.weight +3 -0
checkpoint/pytorch_model.bin/p153.model.layers.16.post_attention_layernorm.weight +3 -0
checkpoint/pytorch_model.bin/p154.model.layers.17.self_attn.q_proj.weight +3 -0
checkpoint/pytorch_model.bin/p155.model.layers.17.self_attn.k_proj.weight +3 -0

checkpoint/pytorch_model.bin/p136.model.layers.15.self_attn.q_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:aa33964150cebf35911cfb3e6142512b0617c92b1538ec7383f8197b65b7e0c7
+size 67109765

checkpoint/pytorch_model.bin/p137.model.layers.15.self_attn.k_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:0d00a0ac0a232c3bb1858586f2c89ec03ab031ce9e42f2e0a7104189a61fb3d5
+size 67109765

checkpoint/pytorch_model.bin/p138.model.layers.15.self_attn.v_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:0cafed58c8527880fcb0cd319b0d6b6fa5d1b6e344b71212a5f18f4a0b8011c1
+size 67109765

checkpoint/pytorch_model.bin/p139.model.layers.15.self_attn.o_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:56f1ddd40a8d33fee1538879e65e3e5b9643a1b855eb2801dd38710de5e5b28b
+size 67109765

checkpoint/pytorch_model.bin/p14.model.layers.1.mlp.gate_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:db908d11f827dddf0ec6f4908cfad39f6d258989e47fc7d6cef6b865c343c8f5
+size 180355958

checkpoint/pytorch_model.bin/p140.model.layers.15.mlp.gate_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c04346e3234381c36d92f079b0a31ed7d5011c09299e8df6233fd9422a70156d
+size 180355964

checkpoint/pytorch_model.bin/p141.model.layers.15.mlp.up_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:04a2b1c88add6862f156d7b4b9863bba5c91f214b93108f00c4b6a203849b37b
+size 180355958

checkpoint/pytorch_model.bin/p142.model.layers.15.mlp.down_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:074b587ff9ce07eefdc6222f48b9fbbee073885bceae453172d3d6bc5c243fc3
+size 180355964

checkpoint/pytorch_model.bin/p143.model.layers.15.input_layernorm.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f32f87b4a0af8af0192973bf55eb7c3d110bf595756abfe4e089ada6abc4d47c
+size 17282

checkpoint/pytorch_model.bin/p144.model.layers.15.post_attention_layernorm.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ea6bda1ec0209a0659444bdab9ea8c8c0dcfea8a06e68d98ead1144ea189efa9
+size 17309

checkpoint/pytorch_model.bin/p145.model.layers.16.self_attn.q_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:fa64b63f850240c3066baa4739655355609ed08d5fa5932596cd8aef223b4e6d
+size 67109765

checkpoint/pytorch_model.bin/p146.model.layers.16.self_attn.k_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f2005804782b0a8014453bec7cfe07c5b7a1c63d26b6c2e2795fbcfacb03f8ca
+size 67109765

checkpoint/pytorch_model.bin/p147.model.layers.16.self_attn.v_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:73e938be0d525f5bc50b65c5915df0e8f704664b4c665ec1a6ae905450b4a3c2
+size 67109765

checkpoint/pytorch_model.bin/p148.model.layers.16.self_attn.o_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:986f446ca3fa34da71881b48a6c1c7bd28fd19643101ae85f984107106b25622
+size 67109765

checkpoint/pytorch_model.bin/p149.model.layers.16.mlp.gate_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:15a60be2340be80a8eba01c9da5f78abd80cb0d5069c6e7937a49bfcfedf8449
+size 180355964

checkpoint/pytorch_model.bin/p15.model.layers.1.mlp.up_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1456d343602db1f9cc9c798df5a1d6cddaceb632d76a386c5a65991eb15daf42
+size 180355952

checkpoint/pytorch_model.bin/p150.model.layers.16.mlp.up_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:34e6b79c39628fa6eff1536b69219fd8ac897efe4d2d40398d1496c8cd4234d1
+size 180355958

checkpoint/pytorch_model.bin/p151.model.layers.16.mlp.down_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:62738b2626ee2f112a31ec4903501c5386a453ef7cfdd2d820bad74e83acae7b
+size 180355964

checkpoint/pytorch_model.bin/p152.model.layers.16.input_layernorm.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b9def4e6df5791138bc20c750e2f5cc8cf9040e5d53a6b3f02cf7b0bce6d4875
+size 17282

checkpoint/pytorch_model.bin/p153.model.layers.16.post_attention_layernorm.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2f46918360a786504c56bb12f5cb08ec56815e2e44a523a8d4542d8e4b541c0e
+size 17309

checkpoint/pytorch_model.bin/p154.model.layers.17.self_attn.q_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:5120649b0a75856a8b4984e388e3c6f05690e50cd541405532a459b6e4d3e5be
+size 67109765

checkpoint/pytorch_model.bin/p155.model.layers.17.self_attn.k_proj.weight ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:29a4de83b6777f0d85a72ce534afd508c7fa81c587991b2e329467c672c035ed
+size 67109765
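
Each file added in this commit is a Git LFS pointer (a `version`, an `oid sha256:…`, and a `size` line), not the weight data itself. The following is a minimal sketch, not part of the commit, of how a downloaded file could be checked against such a pointer. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers introduced for illustration, and the sketch assumes the pointer text is available as a string (leading `+` diff markers are tolerated).

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.lstrip("+").strip()  # tolerate '+' prefixes copied from a diff
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid line has the form '<algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file's byte size and SHA-256 digest match the pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False  # cheap size check before hashing
    sha = hashlib.sha256()
    with path.open("rb") as handle:
        # Hash in 1 MiB chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            sha.update(chunk)
    return sha.hexdigest() == pointer["digest"]
```

For example, passing the three `+` lines shown for `p143.model.layers.15.input_layernorm.weight` to `parse_lfs_pointer` yields a size of 17282 bytes and the listed digest, which `matches_pointer` can then compare against a local copy of that file.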