Instructions for using learnanything/llama-7b-huggingface with libraries, inference providers, notebooks, and local apps.

Libraries

How to use learnanything/llama-7b-huggingface with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="learnanything/llama-7b-huggingface")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("learnanything/llama-7b-huggingface")
model = AutoModelForCausalLM.from_pretrained("learnanything/llama-7b-huggingface", device_map="auto")
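
The direct-loading snippet above stops once the tokenizer and model exist. A minimal sketch of turning a prompt into text with them might look like the following. The helper names `generate_text` and `main`, the prompt, and the `max_new_tokens` value are illustrative assumptions and not part of the model card. `main` downloads the full checkpoint, so it is defined here but not called.

```python
def generate_text(model, tokenizer, prompt, max_new_tokens=64):
    """Tokenize a prompt, generate a continuation, and decode it to a string."""
    # Encode the prompt as PyTorch tensors and move them to the model's device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # generate() returns one sequence of token ids per input prompt.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode the first (and only) sequence, dropping special tokens such as BOS/EOS.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


def main():
    # Imported lazily so the helper above can be reused without loading the library.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo_id = "learnanything/llama-7b-huggingface"
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")
    print(generate_text(model, tokenizer, "Once upon a time,"))
```

Calling `main()` runs the whole flow end to end. Keeping `generate_text` separate from loading means the same model and tokenizer can be reused across several prompts.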

Local Apps

vLLM

How to use learnanything/llama-7b-huggingface with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "learnanything/llama-7b-huggingface"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "learnanything/llama-7b-huggingface",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
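
The same request can be sent from Python instead of curl. The sketch below uses only the standard library and assumes the server started above is listening on `localhost:8000`. The function names `build_completion_request` and `complete` are hypothetical helpers, and the response parsing assumes the usual OpenAI-style completions shape, where the generated text sits in `choices[0].text`.

```python
import json
import urllib.request


def build_completion_request(base_url, model, prompt, max_tokens=512, temperature=0.5):
    """Build a POST request for the OpenAI-compatible /v1/completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, model, prompt, timeout_s=120, **kwargs):
    """Send the request and return the text of the first completion choice."""
    request = build_completion_request(base_url, model, prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With the server running, `complete("http://localhost:8000", "learnanything/llama-7b-huggingface", "Once upon a time,")` would mirror the curl call.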

Use Docker

docker model run hf.co/learnanything/llama-7b-huggingface

SGLang

How to use learnanything/llama-7b-huggingface with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "learnanything/llama-7b-huggingface" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "learnanything/llama-7b-huggingface",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "learnanything/llama-7b-huggingface" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "learnanything/llama-7b-huggingface",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
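
A freshly started server typically needs some time to load the weights before it accepts requests, so a curl call issued right after `docker run` may be refused. One way to handle this is to poll until the server answers. The sketch below is an assumption-laden example: it probes `GET /v1/models`, which OpenAI-compatible servers usually expose, and the helper names `server_is_up` and `wait_until_ready` as well as the timeout values are hypothetical.

```python
import time
import urllib.error
import urllib.request


def server_is_up(base_url):
    """Return True if GET {base_url}/v1/models answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/models", timeout=5) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error: not ready yet.
        return False


def wait_until_ready(base_url, timeout_s=300.0, interval_s=2.0, probe=server_is_up):
    """Poll the server until the probe succeeds or timeout_s seconds have passed."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(base_url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

For example, `wait_until_ready("http://localhost:30000")` returns `True` once the probe succeeds and `False` if the timeout expires first. The `probe` parameter is injectable so a different health check can be swapped in.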

Docker Model Runner

How to use learnanything/llama-7b-huggingface with Docker Model Runner:
```
docker model run hf.co/learnanything/llama-7b-huggingface
```

llama-7b-huggingface

Commit History

Update README.md c06ff70 (learnanything committed on Apr 17, 2023)

Update README.md ca5c1b3 (learnanything committed on Apr 17, 2023)

Update README.md 28775d1 (learnanything committed on Apr 17, 2023)

Update README.md 73c6120 (learnanything committed on Apr 17, 2023)

Update README.md 9caa512 (learnanything committed on Apr 17, 2023)

LLaMA-7B for Huggingface (AutoClass Supported) f1d7847 (learnanything committed on Apr 17, 2023)

Update README.md 06e294d (learnanything committed on Apr 17, 2023)

Update README.md b6db05e (learnanything committed on Apr 17, 2023)

Update README.md cc697ba (learnanything committed on Apr 17, 2023)

model config and tokenizer config adapted for transformers 4.28.x ea81e7d (nan committed on Apr 17, 2023)

update generation config 5f98eef (dustydecapod committed on Mar 9, 2023)

add latest conversion results 9c4cbdb (dustydecapod committed on Mar 8, 2023)

remove previous, unreliable conversion 47dc237 (dustydecapod committed on Mar 8, 2023)

update readme 84fd0de (dustydecapod committed on Mar 5, 2023)

Initial import of LLaMA-7B 0a1e291 (dustydecapod committed on Mar 5, 2023)

initial commit a04e603 (dustydecapod committed on Mar 5, 2023)
