How to use trendmicro-ailab/Llama-Primus-Merged with libraries and local apps.

Libraries

How to use trendmicro-ailab/Llama-Primus-Merged with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="trendmicro-ailab/Llama-Primus-Merged")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
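
The pipeline call above returns a value rather than printing it. As a minimal sketch, assuming the result has the usual chat-style shape (a list of dicts whose "generated_text" holds the whole conversation, ending with the new assistant turn), a small helper can pull out just the reply. The helper name `last_assistant_reply` and the sample data are hypothetical and not part of the model card; the sample is hand-written and is not real model output.

```python
def last_assistant_reply(pipeline_output):
    """Return the assistant text from a chat-style text-generation result.

    Assumption: the result is a list of dicts, and "generated_text" holds
    the full message list with the newly generated reply as the last turn.
    """
    conversation = pipeline_output[0]["generated_text"]
    # Walk backwards so the most recent assistant turn is found first.
    for message in reversed(conversation):
        if message["role"] == "assistant":
            return message["content"]
    raise ValueError("no assistant message found in pipeline output")


# Hand-written stand-in for what pipe(messages) returns; not real model output.
example_output = [
    {
        "generated_text": [
            {"role": "user", "content": "Who are you?"},
            {"role": "assistant", "content": "<model reply>"},
        ]
    }
]
print(last_assistant_reply(example_output))
```

With a loaded pipeline, the same helper would be called as `last_assistant_reply(pipe(messages))`.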

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("trendmicro-ailab/Llama-Primus-Merged")
model = AutoModelForCausalLM.from_pretrained("trendmicro-ailab/Llama-Primus-Merged")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
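# Decode only the newly generated tokens and drop special tokens such as end-of-turn markers
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))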

Local Apps

vLLM

How to use trendmicro-ailab/Llama-Primus-Merged with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "trendmicro-ailab/Llama-Primus-Merged"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "trendmicro-ailab/Llama-Primus-Merged",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
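
The same request can be sent from Python without extra dependencies. This is a sketch using only the standard library, assuming the server is reachable at the address used in the curl command above; the function names `build_chat_request` and `send_chat_request` are hypothetical helpers introduced here for illustration.

```python
import json
import urllib.request


def build_chat_request(base_url, model, user_content):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_content}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the decoded JSON body (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_chat_request(
    "http://localhost:8000",  # address taken from the curl example
    "trendmicro-ailab/Llama-Primus-Merged",
    "What is the capital of France?",
)
print(request.full_url)
# With the server running, call: reply = send_chat_request(request)
```

Building the request separately from sending it keeps the payload easy to inspect before any network call is made. Pointing `base_url` at a different port should work for any other server that exposes the same route.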

Use Docker

docker model run hf.co/trendmicro-ailab/Llama-Primus-Merged

SGLang

How to use trendmicro-ailab/Llama-Primus-Merged with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "trendmicro-ailab/Llama-Primus-Merged" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "trendmicro-ailab/Llama-Primus-Merged",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "trendmicro-ailab/Llama-Primus-Merged" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "trendmicro-ailab/Llama-Primus-Merged",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
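
The curl calls above print a JSON document rather than plain text. As a rough sketch, assuming the response follows the OpenAI chat completions schema (a "choices" list whose entries carry a "message" with "content"), the generated text can be extracted as shown below. The helper name `extract_completion_text` is hypothetical, and the sample response is hand-written for illustration, not captured from a real server.

```python
import json


def extract_completion_text(response):
    """Return the text of the first choice in a chat completions response.

    Assumption: the response follows the OpenAI-style schema, with the
    generated message under choices[0]["message"]["content"].
    """
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Hand-written sample in the assumed schema; not real server output.
sample_body = """
{
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "<model reply>"},
      "finish_reason": "stop"
    }
  ]
}
"""
sample_response = json.loads(sample_body)
print(extract_completion_text(sample_response))
```

Checking for an empty "choices" list first gives a clearer error than an `IndexError` when the server returns an error payload instead of a completion.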

Docker Model Runner
How to use trendmicro-ailab/Llama-Primus-Merged with Docker Model Runner:
```
docker model run hf.co/trendmicro-ailab/Llama-Primus-Merged
```

Llama-Primus-Merged

Commit History

Update README.md

e998256
verified

youyaoching committed on Mar 4, 2025

Update README.md

c37c153
verified

youyaoching committed on Mar 2, 2025

Update README.md

738a211
verified

youyaoching committed on Feb 26, 2025

Update README.md

33c79b4
verified

youyaoching committed on Feb 26, 2025

Update README.md

037ecd3
verified

youyaoching committed on Feb 22, 2025

Update README.md

b83c25e
verified

youyaoching committed on Feb 21, 2025

Update README.md

05823f1
verified

youyaoching committed on Feb 21, 2025

Update README.md

51359ee
verified

youyaoching committed on Feb 21, 2025

Update README.md

b5e47a5
verified

youyaoching committed on Feb 21, 2025

Update README.md

54f7795
verified

youyaoching committed on Feb 18, 2025

Update README.md

fa7acc5
verified

youyaoching committed on Feb 18, 2025

Update README.md

3ee084d
verified

youyaoching committed on Feb 18, 2025

Update README.md

5a0c6ad
verified

youyaoching committed on Feb 18, 2025

Update README.md

6b285d8
verified

youyaoching committed on Feb 17, 2025

Delete overview.png

71ce888
verified

youyaoching committed on Feb 17, 2025

Update README.md

1103e88
verified

youyaoching committed on Feb 17, 2025

Update README.md

d6da083
verified

youyaoching committed on Feb 17, 2025

Update README.md

9acb264
verified

youyaoching committed on Feb 17, 2025

Update README.md

c655e64
verified

youyaoching committed on Feb 17, 2025

Update README.md

3b44507
verified

youyaoching committed on Feb 17, 2025