Instructions for using GrazittiInteractive/llama-2-13b with libraries and local apps.

Libraries

How to use GrazittiInteractive/llama-2-13b with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="GrazittiInteractive/llama-2-13b")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("GrazittiInteractive/llama-2-13b")
model = AutoModelForCausalLM.from_pretrained("GrazittiInteractive/llama-2-13b", device_map="auto")
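
The snippets above load the model but stop before generating any text. A minimal sketch of a generation call follows, assuming the `pipe` object from the pipeline example. The helper name `complete` and the sampling values are illustrative assumptions, not something the model page prescribes.

```python
from typing import Any, Callable


def complete(pipe: Callable[..., Any], prompt: str, max_new_tokens: int = 64) -> str:
    """Run one prompt through a text-generation pipeline and return the text.

    `complete` is a hypothetical helper name. For a plain string prompt, a
    Transformers text-generation pipeline returns a list with one dict per
    generated sequence, holding prompt plus continuation under "generated_text".
    """
    outputs = pipe(
        prompt,
        max_new_tokens=max_new_tokens,  # cap on newly generated tokens
        do_sample=True,                 # sample instead of greedy decoding
        temperature=0.5,                # illustrative value, not a recommendation
    )
    return outputs[0]["generated_text"]


# Usage with the pipeline created above (this downloads the full checkpoint):
# print(complete(pipe, "Once upon a time,"))
```

Passing the pipeline in as an argument keeps the helper independent of how the model was loaded, so it can be exercised with a stub before committing to a multi-gigabyte download.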

Local Apps

vLLM

How to use GrazittiInteractive/llama-2-13b with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "GrazittiInteractive/llama-2-13b"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "GrazittiInteractive/llama-2-13b",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
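
The same request can be issued from Python. The sketch below mirrors the curl call using only the standard library, and assumes the server started by `vllm serve` is listening on `localhost:8000`. The function names `build_completion_request` and `send` are hypothetical helpers introduced for illustration.

```python
import json
import urllib.request


def build_completion_request(
    prompt: str,
    base_url: str = "http://localhost:8000",
    model: str = "GrazittiInteractive/llama-2-13b",
    max_tokens: int = 512,
    temperature: float = 0.5,
) -> urllib.request.Request:
    """Build the same POST request that the curl command sends."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 120.0) -> dict:
    """Send the request and decode the JSON body (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage, once the server is up:
# reply = send(build_completion_request("Once upon a time,"))
```

Separating request construction from sending makes it possible to inspect the payload without a running server. Changing `base_url` to `http://localhost:30000` points the same helper at a server on another port.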

Use Docker

docker model run hf.co/GrazittiInteractive/llama-2-13b

SGLang

How to use GrazittiInteractive/llama-2-13b with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "GrazittiInteractive/llama-2-13b" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "GrazittiInteractive/llama-2-13b",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
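
The curl call prints raw JSON. A small sketch of pulling the generated text out of that body follows, assuming the server replies in the OpenAI completions shape (a `choices` list whose items carry a `text` field). The helper name `extract_texts` and the sample body are illustrative, and the sample is hand-written rather than real model output.

```python
import json


def extract_texts(response_body: str) -> list[str]:
    """Return the generated text of every choice in a completions response."""
    data = json.loads(response_body)
    # A KeyError here means the body was not a normal completions reply.
    return [choice["text"] for choice in data["choices"]]


# Hand-written sample in the OpenAI completions shape, not real model output.
sample = json.dumps({
    "object": "text_completion",
    "model": "GrazittiInteractive/llama-2-13b",
    "choices": [
        {"index": 0, "text": " there was a server.", "finish_reason": "length"}
    ],
})
print(extract_texts(sample))
```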

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "GrazittiInteractive/llama-2-13b" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "GrazittiInteractive/llama-2-13b",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use GrazittiInteractive/llama-2-13b with Docker Model Runner:
```
docker model run hf.co/GrazittiInteractive/llama-2-13b
```

llama-2-13b

Commit History

Update README.md 92d0cf5 (GrazittiInteractive committed on Aug 10, 2023)

Delete checklist.chk 5a8c5ed (GrazittiInteractive committed on Aug 10, 2023)

Update config.json 28a2e7d (GrazittiInteractive committed on Aug 10, 2023)

Update README.md 0408a9c (GrazittiInteractive committed on Aug 3, 2023)

Update README.md f97bb8a (GrazittiInteractive committed on Aug 2, 2023)

Update README.md b3a4785 (GrazittiInteractive committed on Aug 2, 2023)

Update README.md 7c3ae59 (GrazittiInteractive committed on Aug 2, 2023)

13 B ggml 38cbb0f (GrazittiInteractive committed on Aug 2, 2023)

Delete ggml-model-q4_0.bin 4a48871 (GrazittiInteractive committed on Aug 2, 2023)

Update README.md 938bad1 (GrazittiInteractive committed on Aug 2, 2023)

Create config.json 85f53a5 (GrazittiInteractive committed on Aug 2, 2023)

Update README.md c2a8c47 (GrazittiInteractive committed on Aug 2, 2023)

Update README.md 8ff8b5b (R committed on Aug 2, 2023)

Update README.md ef7a823 (R committed on Aug 2, 2023)

Update README.md 0d972dd (R committed on Aug 1, 2023)

Update README.md d27f10e (R committed on Aug 1, 2023)

Update README.md 8466b6a (R committed on Aug 1, 2023)

Upload 4 files 60557ce (R committed on Aug 1, 2023)

initial commit d84e48f (GrazittiInteractive committed on Aug 1, 2023)
