Embedding API Reference
Overview
AxonHub provides comprehensive support for text and multimodal embedding generation through OpenAI-compatible and Jina AI-specific APIs.
Key Benefits
- OpenAI Compatibility: Use existing OpenAI SDKs without modification
- Jina AI Support: Native Jina embedding format support for specialized use cases
- Multiple Input Types: Support for single text, text arrays, token arrays, and multiple token arrays
- Flexible Output Formats: Choose between float arrays or base64-encoded embeddings
Supported Endpoints
Endpoints:
POST /v1/embeddings- OpenAI-compatible embedding APIPOST /jina/v1/embeddings- Jina AI-specific embedding API
Request Format
{
"input": "The text to embed",
"model": "text-embedding-3-small",
"encoding_format": "float",
"dimensions": 1536,
"user": "user-id"
}
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
input |
string | string[] | number[] | number[][] | ✅ | The text(s) to embed. Can be a single string, array of strings, token array, or multiple token arrays. |
model |
string | ✅ | The model to use for embedding generation. |
encoding_format |
string | ❌ | Format to return embeddings in. Either float or base64. Default: float. |
dimensions |
integer | ❌ | Number of dimensions for the output embeddings. |
user |
string | ❌ | Unique identifier for the end-user. |
Jina-Specific Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
task |
string | ❌ | Task type for Jina embeddings. Options: text-matching, retrieval.query, retrieval.passage, separation, classification, none. |
Response Format
{
"object": "list",
"data": [
{
"object": "embedding",
"embedding": [0.123, 0.456, ...],
"index": 0
}
],
"model": "text-embedding-3-small",
"usage": {
"prompt_tokens": 4,
"total_tokens": 4
}
}
Examples
OpenAI SDK (Python)
import openai
client = openai.OpenAI(
api_key="your-axonhub-api-key",
base_url="http://localhost:8090/v1"
)
response = client.embeddings.create(
input="Hello, world!",
model="text-embedding-3-small"
)
print(response.data[0].embedding[:5]) # First 5 dimensions
OpenAI SDK (Go)
package main
import (
"context"
"fmt"
"log"
"github.com/openai/openai-go"
"github.com/openai/openai-go/option"
)
func main() {
client := openai.NewClient(
option.WithAPIKey("your-axonhub-api-key"),
option.WithBaseURL("http://localhost:8090/v1"),
)
embedding, err := client.Embeddings.New(context.TODO(), openai.EmbeddingNewParams{
Input: openai.Union[string](openai.String("Hello, world!")),
Model: openai.String("text-embedding-3-small"),
option.WithHeader("AH-Trace-Id", "trace-example-123"),
option.WithHeader("AH-Thread-Id", "thread-example-abc"),
})
if err != nil {
log.Fatal(err)
}
fmt.Printf("Embedding dimensions: %d\n", len(embedding.Data[0].Embedding))
fmt.Printf("First 5 values: %v\n", embedding.Data[0].Embedding[:5])
}
Multiple Texts
response = client.embeddings.create(
input=["Hello, world!", "How are you?"],
model="text-embedding-3-small"
)
for i, data in enumerate(response.data):
print(f"Text {i}: {data.embedding[:3]}...")
Jina-Specific Task
import requests
response = requests.post(
"http://localhost:8090/jina/v1/embeddings",
headers={
"Authorization": "Bearer your-axonhub-api-key",
"Content-Type": "application/json"
},
json={
"input": "What is machine learning?",
"model": "jina-embeddings-v2-base-en",
"task": "retrieval.query"
}
)
result = response.json()
print(result["data"][0]["embedding"][:5])
Authentication
The Embedding API uses Bearer token authentication:
- Header:
Authorization: Bearer <your-api-key>
The API keys are managed through AxonHub's API Key management system.
Best Practices
- Use Tracing Headers: Include
AH-Trace-IdandAH-Thread-Idheaders for better observability - Batch Requests: When embedding multiple texts, send them in a single request for better performance
- Choose Appropriate Dimensions: Use the
dimensionsparameter to reduce embedding size if full dimensionality isn't needed - Select Proper Encoding: Use
base64encoding if you need to transmit embeddings over the network to reduce payload size - Jina Task Types: When using Jina embeddings, select the appropriate
tasktype for your use case to optimize retrieval quality