Instructions for using LLaMAX/LLaMAX3-8B-Alpaca with libraries, inference providers, notebooks, and local apps.

Libraries

How to use LLaMAX/LLaMAX3-8B-Alpaca with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="LLaMAX/LLaMAX3-8B-Alpaca")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("LLaMAX/LLaMAX3-8B-Alpaca")
model = AutoModelForCausalLM.from_pretrained("LLaMAX/LLaMAX3-8B-Alpaca")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
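The pipeline call above returns its result instead of printing it, so a script has to unpack the reply itself. The sketch below assumes a recent Transformers release in which a chat-style input yields a list whose first item holds the whole conversation under "generated_text"; the helper name last_assistant_reply and the example_result value are hypothetical, used only to show that shape.

```python
def last_assistant_reply(result):
    """Pull the newest assistant message out of a chat-style pipeline result."""
    conversation = result[0]["generated_text"]
    for message in reversed(conversation):
        if message["role"] == "assistant":
            return message["content"]
    raise ValueError("no assistant message in pipeline output")


# Hypothetical output shape (an assumption); real replies depend on the model.
example_result = [{
    "generated_text": [
        {"role": "user", "content": "Who are you?"},
        {"role": "assistant", "content": "<model reply>"},
    ]
}]
print(last_assistant_reply(example_result))
```

In real use, pass the return value of pipe(messages) in place of example_result.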

Local Apps

vLLM

How to use LLaMAX/LLaMAX3-8B-Alpaca with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "LLaMAX/LLaMAX3-8B-Alpaca"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "LLaMAX/LLaMAX3-8B-Alpaca",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
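
The same request can be sent from Python with only the standard library. This is a minimal sketch that mirrors the curl command above; the function names build_chat_request and send are hypothetical, and the default base URL assumes the server is listening on localhost:8000 as in the curl example.

```python
import json
import urllib.request


def build_chat_request(prompt, model="LLaMAX/LLaMAX3-8B-Alpaca",
                       base_url="http://localhost:8000"):
    """Build the same POST request that the curl command sends."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the parsed JSON body (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, send(build_chat_request("What is the capital of France?")) returns the parsed response. Keeping request construction separate from sending makes the payload easy to inspect without a server.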

Use Docker

docker model run hf.co/LLaMAX/LLaMAX3-8B-Alpaca

SGLang

How to use LLaMAX/LLaMAX3-8B-Alpaca with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "LLaMAX/LLaMAX3-8B-Alpaca" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "LLaMAX/LLaMAX3-8B-Alpaca",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
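
The curl call prints the raw JSON response. As a rough sketch, the assistant text can be pulled out of it as follows, assuming the server follows the usual OpenAI-style chat completion layout (a "choices" list whose entries carry a "message"). The helper extract_reply and the sample_body value are hypothetical and only illustrate the fields used.

```python
import json


def extract_reply(response_body):
    """Return the assistant text from an OpenAI-style chat completion body."""
    data = json.loads(response_body)
    choices = data.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Hypothetical response body (an assumption), trimmed to the fields used here.
sample_body = json.dumps({
    "choices": [
        {"index": 0, "message": {"role": "assistant", "content": "Paris."}}
    ]
})
print(extract_reply(sample_body))
```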

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "LLaMAX/LLaMAX3-8B-Alpaca" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "LLaMAX/LLaMAX3-8B-Alpaca",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use LLaMAX/LLaMAX3-8B-Alpaca with Docker Model Runner:
```
docker model run hf.co/LLaMAX/LLaMAX3-8B-Alpaca
```

LLaMAX3-8B-Alpaca

Commit History

update tokenizer_config.json

111f2fa
verified

LLaMAX committed on May 12, 2025

Update tokenizer_config.json

c18f5af
verified

LLaMAX committed on May 6, 2025

Update README.md

2536ecd
verified

LLaMAX committed on Dec 6, 2024

Update README.md

b4fac5f
verified

LLaMAX committed on Jul 26, 2024

Update README.md

c22eb79
verified

LLaMAX committed on Jul 21, 2024

Update README.md

5a1c1b1
verified

LLaMAX committed on Jul 16, 2024

Update README.md

d6111a7
verified

LLaMAX committed on Jul 12, 2024

Update README.md

5378f8f
verified

LLaMAX committed on Jul 12, 2024

update readme

66cfc84

huangtao6 committed on Jul 9, 2024

update readme

81fff73

huangtao6 committed on Jul 9, 2024

update README

4381330

Lego-MT committed on Jul 8, 2024

update README

71650bd

Lego-MT committed on Jul 8, 2024

First model version

49193eb

Lego-MT committed on Jun 25, 2024

initial commit

caf2f41
verified

TransLLaMA committed on Jun 25, 2024
