Update README.md
README.md
CHANGED
@@ -16,8 +16,19 @@ language:
 ---
 # Octopus V2: On-device language model for super agent
 
-
+## Octopus V3 Release
+We are excited to announce that Octopus v3 is now available! Check our [technical report](https://lnkd.in/dxk54W5r) and [Octopus V3 tweet](https://twitter.com/nexa4ai/status/1780783383737676236)!
 
+Key Features of Octopus v3:
+- **Efficiency**: **Sub-billion** parameters, making it less than half the size of its predecessor, Octopus v2.
+- **Multi-Modal Capabilities**: Processes both text and image inputs.
+- **Speed and Accuracy**: Incorporates our **patented** functional token technology, achieving function calling accuracy on par with GPT-4V and GPT-4.
+- **Multilingual Support**: Simultaneous support for English and Mandarin.
+
+Check the Octopus V3 demo video for [Android and iOS](https://octopus3.nexa4ai.com/).
+
+
+## Octopus V2
 We are a very small team with a lot of work. Please give us more time to prepare the code, and we will **open source** it. We hope the Octopus v2 model will be helpful to you. Let's democratize AI agents for everyone. We've received many requests from the car industry, health care, the financial sector, and others. The Octopus model can be applied to **any function**, and you can start thinking about it now.
 <p align="center">
 - <a href="https://www.nexa4ai.com/" target="_blank">Nexa AI Product</a>
@@ -29,7 +40,7 @@ We are a very small team with a lot of work. Please give us more time to prepare the
 <a><img src="Octopus-logo.jpeg" alt="nexa-octopus" style="width: 40%; min-width: 300px; display: block; margin: auto;"></a>
 </p>
 
-##
+## Introduction
 
 Octopus-V2-2B is an open-source language model with 2 billion parameters. It represents Nexa AI's research breakthrough in applying large language models (LLMs) to function calling, and it is tailored specifically to Android APIs. Unlike Retrieval-Augmented Generation (RAG) methods, which require detailed descriptions of potential function arguments (sometimes up to tens of thousands of input tokens), Octopus-V2-2B uses a **functional token** strategy in both training and inference. This approach lets it reach performance comparable to GPT-4 while running inference significantly faster than RAG-based methods, which makes it especially well suited to edge computing devices.
 
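The functional token strategy mentioned in the README introduction can be illustrated with a small sketch. This is not Nexa AI's implementation: the token names (`<fn_0>`, `<fn_1>`, `<fn_end>`), the assumed output format, and the two Android-style helper functions are all hypothetical and chosen only for illustration. The idea shown is that the model emits one short token standing for a function plus its arguments, so the application can dispatch the call without placing long function descriptions in the prompt.

```python
import ast
import re


# Hypothetical device functions; real Android API bindings would go here.
def take_a_photo(camera="back"):
    return f"photo taken with {camera} camera"


def set_volume(level):
    return f"volume set to {level}"


# Hypothetical mapping from functional tokens to callables.
# The token names are assumptions, not the model's actual vocabulary.
FUNCTIONAL_TOKENS = {
    "<fn_0>": take_a_photo,
    "<fn_1>": set_volume,
}

# Assumed output shape: <fn_N>(literal, literal, ...)<fn_end>
CALL_PATTERN = re.compile(r"(<fn_\d+>)\((.*?)\)<fn_end>", re.DOTALL)


def dispatch(model_output):
    """Find the first functional-token call in the text and execute it."""
    match = CALL_PATTERN.search(model_output)
    if match is None:
        raise ValueError("no functional token found in model output")
    token, raw_args = match.groups()
    func = FUNCTIONAL_TOKENS[token]
    # Parse positional arguments as Python literals only (no code execution).
    args = ast.literal_eval(f"({raw_args},)") if raw_args.strip() else ()
    return func(*args)


if __name__ == "__main__":
    print(dispatch("<fn_0>('front')<fn_end>"))
    print(dispatch("<fn_1>(7)<fn_end>"))
```

Parsing arguments with `ast.literal_eval` rather than `eval` keeps the dispatcher from executing arbitrary text produced by the model. Keyword arguments are not handled in this sketch.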