Instructions to use NexaAI/Octopus-v2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use NexaAI/Octopus-v2 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="NexaAI/Octopus-v2")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("NexaAI/Octopus-v2")
model = AutoModelForCausalLM.from_pretrained("NexaAI/Octopus-v2")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use NexaAI/Octopus-v2 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "NexaAI/Octopus-v2"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "NexaAI/Octopus-v2",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
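
The same request can also be sent from Python. The block below is a minimal sketch that uses only the standard library and assumes the server started above is listening on localhost:8000. The helper names `build_chat_request` and `ask` are illustrative and are not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(prompt, model="NexaAI/Octopus-v2",
                       base_url="http://localhost:8000"):
    """Build a POST request for the OpenAI-compatible chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, **kwargs):
    """Send the request and return the text of the first choice."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(ask("What is the capital of France?"))` sends the same query as the curl command. Passing `base_url="http://localhost:30000"` should target an SGLang server on its default port in the same way.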

Use Docker

docker model run hf.co/NexaAI/Octopus-v2

SGLang

How to use NexaAI/Octopus-v2 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "NexaAI/Octopus-v2" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "NexaAI/Octopus-v2",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "NexaAI/Octopus-v2" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "NexaAI/Octopus-v2",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use NexaAI/Octopus-v2 with Docker Model Runner:
```
docker model run hf.co/NexaAI/Octopus-v2
```

Info about adding custom functions?

#14

by TeddyB - opened Apr 12, 2024

Discussion

TeddyB

Apr 12, 2024

Hi, first of all, I wanted to congratulate you guys on the research. Really impressive stuff!

I was wondering if you could provide me with some info about how I can start the process of adding my own functions to the vocabulary of the model.

Say I have 20 new functions I would like to teach the model. What would be the steps that you take to get this done?

alexchen4ai

Nexa AI org Apr 13, 2024

Hi, as long as you have a well-defined task, you can formulate the function. You can label the data manually (which is costly) or use synthetic data. Please refer to our paper.

TeddyB

Apr 13, 2024

I read through it and feel like I'm missing information about:

1. How exactly was the vocabulary extended? From what I found online, there are multiple ways to extend the vocabulary, so I was wondering what exactly you did.
2. After extending the vocabulary, do the embedding and lm_head layers need to be retrained? The paper mentions that after extending the vocabulary, you go through a round of fine-tuning. But from my understanding, fine-tuning won't train the lm_head and embedding layers. So what was done to train those two layers?
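
For context on the second question: full fine-tuning updates every weight, including the embedding and lm_head layers, whereas LoRA through the PEFT library trains only the adapter weights unless extra modules are listed in `modules_to_save`. The thread does not say which setup Nexa AI used, so the following is only a sketch of one possible configuration. The hyperparameter values are made up, and the module names (`q_proj`, `embed_tokens`, `lm_head`, and so on) are assumptions that should be checked against the actual base model.

```python
# Hypothetical hyperparameters; the values are illustrative only.
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    # Assumed attention projection names; verify them on your base model.
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
    # Train these modules in full, alongside the low-rank adapters.
    "modules_to_save": ["embed_tokens", "lm_head"],
    "task_type": "CAUSAL_LM",
}


def wrap_with_lora(model):
    """Wrap a causal LM so the adapters and the vocabulary layers are trainable."""
    from peft import LoraConfig, get_peft_model  # requires `pip install peft`

    peft_model = get_peft_model(model, LoraConfig(**LORA_KWARGS))
    # Prints how many parameters will receive gradients.
    peft_model.print_trainable_parameters()
    return peft_model
```

Calling `wrap_with_lora(model)` on a model loaded with `AutoModelForCausalLM.from_pretrained(...)` returns a PEFT model whose trainable-parameter count should include the listed vocabulary layers.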

TeddyB changed discussion status to closed Apr 13, 2024

TeddyB changed discussion status to open Apr 13, 2024

alexchen4ai

Nexa AI org Apr 14, 2024

Please stay tuned. We will open-source the code later. To be notified as early as possible, consider joining our waitlist: https://www.nexa4ai.com/contact

sundar008

Apr 20, 2024

Curious to check out the open-source codebase soon and learn the details!

zackli4ai

Nexa AI org May 6, 2024

Hi @TeddyB

We add functional tokens to the vocabulary; see
https://huggingface.co/NexaAIDev/Octopus-v2/blob/main/tokenizer_config.json
We will soon prepare a training pipeline on AWS / Google Cloud for customized API training requirements.
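
As a rough illustration of what adding functional tokens can look like with Transformers, here is a minimal sketch. It is not the Nexa AI training code, which had not been released at the time of this thread. The token names `<fn_0>` ... `<fn_end>` and the helper names are hypothetical; the real token names are listed in the tokenizer_config.json linked above.

```python
def add_functional_tokens(tokenizer, model, num_functions):
    """Register <fn_0>..<fn_{n-1}> plus <fn_end> and grow the embeddings to match."""
    # Hypothetical token names, one per function plus an end marker.
    new_tokens = [f"<fn_{i}>" for i in range(num_functions)] + ["<fn_end>"]
    # special_tokens=True keeps each token from being split into sub-pieces.
    num_added = tokenizer.add_tokens(new_tokens, special_tokens=True)
    # Resize the embedding matrix (and output head) to the new vocabulary size.
    model.resize_token_embeddings(len(tokenizer))
    return new_tokens, num_added


def demo(base_model_id, num_functions=20):
    """Load a base model from the Hub and extend its vocabulary."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    model = AutoModelForCausalLM.from_pretrained(base_model_id)
    _, added = add_functional_tokens(tokenizer, model, num_functions)
    print(f"added {added} tokens; vocabulary size is now {len(tokenizer)}")
    return tokenizer, model
```

The rows added for the new tokens carry no learned meaning until the model is fine-tuned on examples that use them, so this step only prepares the model for training.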

TeddyB

May 6, 2024

Hi @zackli4ai ,

Thanks for the info. I can see the new special tokens in the tokenizer now.

I have some follow-up questions:

Have you tried your technique of adding new functional tokens to other base models, like MS Phi-3 Mini or Meta Llama 2 8b?
Are you also planning on releasing the dataset you used to train the model?

zackli4ai

Nexa AI org May 6, 2024

@TeddyB

Yes, Octopus-V4 is based on Phi-3: https://huggingface.co/NexaAIDev/Octopus-v4
We are building a training pipeline on AWS / Google Cloud.
Thanks for the questions.

Jiangha1

Mar 13, 2025

> @TeddyB
>
> Yes, Octopus-V4 is based on Phi-3 : https://huggingface.co/NexaAIDev/Octopus-v4
>
> We are building a training pipeline on AWS / Google Cloud
> thanks for questions

Hi, are you still working on that?
