998b1c4d5092e36a29099a8826cd6679daea4c8294537a4aa669e6e6d887a35a
- pytorch_model.bin/p77.model.layers.8.mlp.gate_proj.weight +3 -0
- pytorch_model.bin/p78.model.layers.8.mlp.up_proj.weight +3 -0
- pytorch_model.bin/p79.model.layers.8.mlp.down_proj.weight +3 -0
- pytorch_model.bin/p8.model.layers.0.input_layernorm.weight +3 -0
- pytorch_model.bin/p80.model.layers.8.input_layernorm.weight +3 -0
- pytorch_model.bin/p81.model.layers.8.post_attention_layernorm.weight +3 -0
- pytorch_model.bin/p82.model.layers.9.self_attn.q_proj.weight +3 -0
- pytorch_model.bin/p83.model.layers.9.self_attn.k_proj.weight +3 -0
- pytorch_model.bin/p84.model.layers.9.self_attn.v_proj.weight +3 -0
- pytorch_model.bin/p85.model.layers.9.self_attn.o_proj.weight +3 -0
- pytorch_model.bin/p86.model.layers.9.mlp.gate_proj.weight +3 -0
- pytorch_model.bin/p87.model.layers.9.mlp.up_proj.weight +3 -0
- pytorch_model.bin/p88.model.layers.9.mlp.down_proj.weight +3 -0
- pytorch_model.bin/p89.model.layers.9.input_layernorm.weight +3 -0
- pytorch_model.bin/p9.model.layers.0.post_attention_layernorm.weight +3 -0
- pytorch_model.bin/p90.model.layers.9.post_attention_layernorm.weight +3 -0
- pytorch_model.bin/p91.model.layers.10.self_attn.q_proj.weight +3 -0
- pytorch_model.bin/p92.model.layers.10.self_attn.k_proj.weight +3 -0
- pytorch_model.bin/p93.model.layers.10.self_attn.v_proj.weight +3 -0
- pytorch_model.bin/p94.model.layers.10.self_attn.o_proj.weight +3 -0
- pytorch_model.bin/p95.model.layers.10.mlp.gate_proj.weight +3 -0
- pytorch_model.bin/p96.model.layers.10.mlp.up_proj.weight +3 -0
pytorch_model.bin/p77.model.layers.8.mlp.gate_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6266be1dae85651bb1dd490d62eacc87e05daea544414d4e68546503e1c37574
+size 180355958
pytorch_model.bin/p78.model.layers.8.mlp.up_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b290000536fcdc58af1e23acf40cb4184bec29b3e0c532b2cbcd281a409a2eda
+size 180355952
pytorch_model.bin/p79.model.layers.8.mlp.down_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a6af41c4f5dc8739c5b0c0dfe9db228d1f9667e9eb223b02614fc95cf104163a
+size 180355958
pytorch_model.bin/p8.model.layers.0.input_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8722a413e29dc0a9314022e0ff4503faf0d791cc171529ef46c17787576bfbe2
+size 17273
pytorch_model.bin/p80.model.layers.8.input_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:578e4cda501a5c3a5285d9fe81a836d48d3cfb99ee4bee2ed57e98dc06fd09c8
+size 17276
pytorch_model.bin/p81.model.layers.8.post_attention_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:17d6a5ec9a916319b8fcfda563b1548d3f99b54786a852e58586c47a5552cb18
+size 17303
pytorch_model.bin/p82.model.layers.9.self_attn.q_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d189480a221d069489a679d69dc677378fe0942fc7ae76bd5db891978b7f9a65
+size 67109759
pytorch_model.bin/p83.model.layers.9.self_attn.k_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0d0efadd85fa1a81325a676e3a1af0b5401c6007655b081b0bbf9be90eb17d6b
+size 67109759
pytorch_model.bin/p84.model.layers.9.self_attn.v_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:caa50f40a6d27e0a4c549d3ef57070b11765a8b4f3852b588b8512e8a93c624e
+size 67109759
pytorch_model.bin/p85.model.layers.9.self_attn.o_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e2a8b3e66ae17c0557b3dbebc31e3d3504c363e538e57b15aef3a4a1cc616ed8
+size 67109759
pytorch_model.bin/p86.model.layers.9.mlp.gate_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fd90c33e7402ecffa111e6866095978476ba8a1e9dcd48553ca6ce5628417254
+size 180355958
pytorch_model.bin/p87.model.layers.9.mlp.up_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9a0a81b5d6f734bcf4dc31475dc99280dfce205d5278f8ec6849136d5557ca4b
+size 180355952
pytorch_model.bin/p88.model.layers.9.mlp.down_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:be77227b34ff6d401bdbbc3749a44ab058c0cc1c0a0a55c8ca9b6e6c8a4074f9
+size 180355958
pytorch_model.bin/p89.model.layers.9.input_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:054edbe807f70d685de965a837a2ebad55447b1c3cb8d39c0f7ecc0ee568feff
+size 17276
pytorch_model.bin/p9.model.layers.0.post_attention_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4805ab6fafe1b0894b91c2fcfcd9b74e6cf2c584b0a292af5c45500f2d5b6f2c
+size 17300
pytorch_model.bin/p90.model.layers.9.post_attention_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:311d4abdc7f61e704f17df857f046b656790b2a55bc3e83e0b977d626894d3f0
+size 17303
pytorch_model.bin/p91.model.layers.10.self_attn.q_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:66a9fe48424b68fd7d8785f6cdfa6c66a03528be6607cd64f7c1e0ca07521e70
+size 67109762
pytorch_model.bin/p92.model.layers.10.self_attn.k_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:78f41ad8ea094065aef9be3ceafc03580c9cf09661962cebca2f1c49ea99e4d8
+size 67109762
pytorch_model.bin/p93.model.layers.10.self_attn.v_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:46341bd103524ceb780a7d7437cea9e6ee1a49e89ea56d409836abdce432d5c1
+size 67109762
pytorch_model.bin/p94.model.layers.10.self_attn.o_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a429d1f860f666362306e048818229b0e7deba6dcdb61eb43294ad8ceddefa9a
+size 67109762
pytorch_model.bin/p95.model.layers.10.mlp.gate_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:998360b6efe47b7700ffa72c51f9887670d96e108ad070aa56bf4f3b0d271ccd
+size 180355961
pytorch_model.bin/p96.model.layers.10.mlp.up_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:46c7439b8bbb66c38f709a7bd017a53580eb186ff7a71cc6d5fcc6e67f7e732f
+size 180355955
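Every file added in this commit is a three-line Git LFS pointer (`version`, `oid`, `size`) rather than the tensor data itself. The following is a minimal sketch of how such a pointer could be parsed and used to check a downloaded object. The names `LfsPointer`, `parse_lfs_pointer`, and `matches_pointer` are hypothetical helpers introduced here for illustration and are not part of Git LFS or any library. The sample pointer text is copied from the `p8.model.layers.0.input_layernorm.weight` entry above.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class LfsPointer:
    """Hypothetical container for the three fields of a Git LFS pointer."""
    version: str
    oid: str   # hex SHA-256 digest, without the "sha256:" prefix
    size: int  # size of the real object in bytes


def parse_lfs_pointer(text: str) -> LfsPointer:
    """Parse the "key value" lines of a Git LFS pointer file."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo!r}")
    return LfsPointer(fields["version"], digest, int(fields["size"]))


def matches_pointer(data: bytes, pointer: LfsPointer) -> bool:
    """Return True if data has the size and SHA-256 digest the pointer records."""
    return (
        len(data) == pointer.size
        and hashlib.sha256(data).hexdigest() == pointer.oid
    )


# Pointer text taken from the p8.model.layers.0.input_layernorm.weight diff.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:8722a413e29dc0a9314022e0ff4503faf0d791cc171529ef46c17787576bfbe2
size 17273
"""

pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer.size)
```

To verify a real download, read the object's bytes (for example with `pathlib.Path(...).read_bytes()`) and pass them to `matches_pointer`. For the larger shards, which are about 180 MB each, hashing in chunks would avoid holding the whole file in memory.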