Text Generation
Transformers
English
qwen2
code-generation
python
fine-tuning
Qwen
tools
agent-framework
multi-agent
conversational
Eval Results (legacy)
Instructions to use my-ai-stack/Stack-2-9-finetuned with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use my-ai-stack/Stack-2-9-finetuned with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("text-generation", model="my-ai-stack/Stack-2-9-finetuned") messages = [ {"role": "user", "content": "Who are you?"}, ] pipe(messages)# Load model directly from transformers import AutoTokenizer, AutoModelForCausalLM tokenizer = AutoTokenizer.from_pretrained("my-ai-stack/Stack-2-9-finetuned") model = AutoModelForCausalLM.from_pretrained("my-ai-stack/Stack-2-9-finetuned") messages = [ {"role": "user", "content": "Who are you?"}, ] inputs = tokenizer.apply_chat_template( messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt", ).to(model.device) outputs = model.generate(**inputs, max_new_tokens=40) print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:])) - Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use my-ai-stack/Stack-2-9-finetuned with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "my-ai-stack/Stack-2-9-finetuned" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "my-ai-stack/Stack-2-9-finetuned", "messages": [ { "role": "user", "content": "What is the capital of France?" } ] }'Use Docker
docker model run hf.co/my-ai-stack/Stack-2-9-finetuned
- SGLang
How to use my-ai-stack/Stack-2-9-finetuned with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "my-ai-stack/Stack-2-9-finetuned" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "my-ai-stack/Stack-2-9-finetuned", "messages": [ { "role": "user", "content": "What is the capital of France?" } ] }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "my-ai-stack/Stack-2-9-finetuned" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "my-ai-stack/Stack-2-9-finetuned", "messages": [ { "role": "user", "content": "What is the capital of France?" } ] }' - Docker Model Runner
How to use my-ai-stack/Stack-2-9-finetuned with Docker Model Runner:
docker model run hf.co/my-ai-stack/Stack-2-9-finetuned
| # Stack 2.9 API Documentation | |
| Complete API reference for integrating Stack 2.9 into your applications. | |
| ## Table of Contents | |
| - [Overview](#overview) | |
| - [Authentication](#authentication) | |
| - [Rate Limits](#rate-limits) | |
| - [REST Endpoints](#rest-endpoints) | |
| - [WebSocket Streaming](#websocket-streaming) | |
| - [Request/Response Formats](#requestresponse-formats) | |
| - [Error Handling](#error-handling) | |
| - [SDKs and Examples](#sdks-and-examples) | |
| --- | |
| ## Overview | |
| Stack 2.9 provides an OpenAI-compatible API for seamless integration with existing tools and workflows. | |
| ### Base URL | |
| ``` | |
| Production: https://api.stack2.9.openclaw.org/v1 | |
| Local: http://localhost:3000/v1 | |
| ``` | |
| ### API Versioning | |
| The current API version is `v1`. Version information is included in all responses. | |
| ```json | |
| { | |
| "api_version": "1.0", | |
| "deprecation_date": null | |
| } | |
| ``` | |
| --- | |
| ## Authentication | |
| ### API Key Authentication | |
| Include your API key in the `Authorization` header: | |
| ```bash | |
| curl -H "Authorization: Bearer YOUR_API_KEY" \ | |
| -H "Content-Type: application/json" \ | |
| https://api.stack2.9.openclaw.org/v1/chat/completions | |
| ``` | |
| ### Obtaining an API Key | |
| 1. **Self-hosted:** Set `API_KEY` environment variable | |
| 2. **Cloud:** Sign up at [stack2.9.openclaw.org](https://stack2.9.openclaw.org) | |
| ### Authentication Errors | |
| | Status | Error Type | Description | | |
| |--------|------------|-------------| | |
| | 401 | `invalid_api_key` | API key is missing or invalid | | |
| | 403 | `account_disabled` | Account has been disabled | | |
| | 429 | `rate_limit_exceeded` | Too many requests | | |
| --- | |
| ## Rate Limits | |
| ### Tier Limits | |
| | Tier | Requests/min | Tokens/day | Concurrent | WebSocket | | |
| |------|-------------|------------|------------|-----------| | |
| | **Free** | 100 | 100,000 | 5 | ✅ | | |
| | **Pro** | 1,000 | 10,000,000 | 20 | ✅ | | |
| | **Enterprise** | Custom | Custom | Custom | ✅ | | |
| ### Rate Limit Headers | |
| Every response includes rate limit information: | |
| ``` | |
| X-RateLimit-Limit: 100 | |
| X-RateLimit-Remaining: 95 | |
| X-RateLimit-Reset: 1640995200 | |
| X-RateLimit-Used: 5 | |
| ``` | |
| ### Handling Rate Limits | |
| ```python | |
| import time | |
| import openai | |
| client = openai.OpenAI(api_key="your-api-key") | |
| for i in range(100): | |
| try: | |
| response = client.chat.completions.create( | |
| model="qwen/qwen2.5-coder-32b", | |
| messages=[{"role": "user", "content": "Hello"}] | |
| ) | |
| except openai.RateLimitError: | |
| time.sleep(60) # Wait 1 minute | |
| continue | |
| ``` | |
| --- | |
| ## REST Endpoints | |
| ### Chat Completions | |
| **Endpoint:** `POST /chat/completions` | |
| Generate chat completions with optional streaming. | |
| #### Request Body | |
| ```json | |
| { | |
| "model": "qwen/qwen2.5-coder-32b", | |
| "messages": [ | |
| { | |
| "role": "system", | |
| "content": "You are a helpful coding assistant." | |
| }, | |
| { | |
| "role": "user", | |
| "content": "Write a Python function to calculate Fibonacci numbers." | |
| } | |
| ], | |
| "temperature": 0.7, | |
| "max_tokens": 1000, | |
| "top_p": 1.0, | |
| "frequency_penalty": 0.0, | |
| "presence_penalty": 0.0, | |
| "stream": false, | |
| "stop": null, | |
| "tools": [], | |
| "tool_choice": "auto", | |
| "user": "user-identifier" | |
| } | |
| ``` | |
| #### Parameters | |
| | Parameter | Type | Required | Default | Description | | |
| |-----------|------|----------|---------|-------------| | |
| | `model` | string | ✅ | - | Model identifier | | |
| | `messages` | array | ✅ | - | Conversation messages | | |
| | `temperature` | number | ❌ | 0.7 | Sampling temperature (0-2) | | |
| | `max_tokens` | integer | ❌ | 1000 | Maximum tokens to generate | | |
| | `top_p` | number | ❌ | 1.0 | Nucleus sampling | | |
| | `frequency_penalty` | number | ❌ | 0.0 | Frequency penalty (-2 to 2) | | |
| | `presence_penalty` | number | ❌ | 0.0 | Presence penalty (-2 to 2) | | |
| | `stream` | boolean | ❌ | false | Enable streaming | | |
| | `stop` | string/array | ❌ | null | Stop sequences | | |
| | `tools` | array | ❌ | [] | Available tools | | |
| | `tool_choice` | string | ❌ | "auto" | Tool selection strategy | | |
| | `user` | string | ❌ | - | User identifier | | |
| #### Response (Non-Streaming) | |
| ```json | |
| { | |
| "id": "chatcmpl-1234567890", | |
| "object": "chat.completion", | |
| "created": 1234567890, | |
| "model": "qwen/qwen2.5-coder-32b", | |
| "choices": [ | |
| { | |
| "index": 0, | |
| "message": { | |
| "role": "assistant", | |
| "content": "def fibonacci(n):\n if n <= 1:\n return n\n return fibonacci(n-1) + fibonacci(n-2)" | |
| }, | |
| "finish_reason": "stop" | |
| } | |
| ], | |
| "usage": { | |
| "prompt_tokens": 25, | |
| "completion_tokens": 35, | |
| "total_tokens": 60 | |
| }, | |
| "system_fingerprint": "fp_1234567890" | |
| } | |
| ``` | |
| #### Streaming Response Format | |
| ```bash | |
| curl -X POST http://localhost:3000/v1/chat/completions \ | |
| -H "Authorization: Bearer YOUR_API_KEY" \ | |
| -H "Content-Type: application/json" \ | |
| -d '{"model": "qwen/qwen2.5-coder-32b", "messages": [{"role": "user", "content": "Hello"}], "stream": true}' | |
| ``` | |
| ```json | |
| {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1234567890,"model":"qwen/qwen2.5-coder-32b","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]} | |
| {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1234567890,"model":"qwen/qwen2.5-coder-32b","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]} | |
| {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1234567890,"model":"qwen/qwen2.5-coder-32b","choices":[{"index":0,"delta":{"content":" How"},"finish_reason":null}]} | |
| {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1234567890,"model":"qwen/qwen2.5-coder-32b","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]} | |
| ``` | |
| --- | |
| ### Models List | |
| **Endpoint:** `GET /models` | |
| List available models. | |
| #### Response | |
| ```json | |
| { | |
| "object": "list", | |
| "data": [ | |
| { | |
| "id": "qwen/qwen2.5-coder-32b", | |
| "object": "model", | |
| "created": 1234567890, | |
| "owned_by": "openclaw", | |
| "permission": [], | |
| "root": "qwen/qwen2.5-coder-32b", | |
| "parent": null, | |
| "context_window": 131072, | |
| "Capabilities": { | |
| "streaming": true, | |
| "tools": true, | |
| "voice": true | |
| } | |
| }, | |
| { | |
| "id": "qwen/qwen2.5-coder-14b", | |
| "object": "model", | |
| "created": 1234567890, | |
| "owned_by": "openclaw", | |
| "permission": [], | |
| "root": "qwen/qwen2.5-coder-14b", | |
| "parent": null, | |
| "context_window": 16384, | |
| "capabilities": { | |
| "streaming": true, | |
| "tools": true, | |
| "voice": false | |
| } | |
| } | |
| ] | |
| } | |
| ``` | |
| ### Get Model | |
| **Endpoint:** `GET /models/{model_id}` | |
| Get details about a specific model. | |
| ```bash | |
| curl http://localhost:3000/v1/models/qwen/qwen2.5-coder-32b | |
| ``` | |
| ```json | |
| { | |
| "id": "qwen/qwen2.5-coder-32b", | |
| "object": "model", | |
| "created": 1234567890, | |
| "owned_by": "openclaw", | |
| "context_window": 131072, | |
| "capabilities": { | |
| "streaming": true, | |
| "tools": true, | |
| "voice": true | |
| } | |
| } | |
| ``` | |
| --- | |
| ## WebSocket Streaming | |
| ### Connection | |
| ```javascript | |
| const ws = new WebSocket('wss://api.stack2.9.openclaw.org/v1/ws/chat'); | |
| ws.onopen = () => { | |
| ws.send(JSON.stringify({ | |
| type: 'start', | |
| model: 'qwen/qwen2.5-coder-32b', | |
| messages: [{role: 'user', content: 'Hello'}], | |
| temperature: 0.7 | |
| })); | |
| }; | |
| ws.onmessage = (event) => { | |
| const data = JSON.parse(event.data); | |
| console.log(data.content); // Streamed content | |
| }; | |
| ws.onerror = (error) => { | |
| console.error('WebSocket error:', error); | |
| }; | |
| ws.onclose = () => { | |
| console.log('Connection closed'); | |
| }; | |
| ``` | |
| ### WebSocket Message Types | |
| #### Client → Server | |
| | Type | Description | | |
| |------|-------------| | |
| | `start` | Start a new chat session | | |
| | `stop` | Stop current generation | | |
| | `ping` | Keep-alive ping | | |
| #### Server → Client | |
| | Type | Description | | |
| |------|-------------| | |
| | `content` | Streamed content chunk | | |
| | `tool_call` | Tool invocation request | | |
| | `tool_result` | Tool execution result | | |
| | `done` | Generation complete | | |
| | `error` | Error occurred | | |
| | `pong` | Keep-alive response | | |
| ### Full WebSocket Example | |
| ```javascript | |
| class Stack29WebSocket { | |
| constructor(apiKey, model = 'qwen/qwen2.5-coder-32b') { | |
| this.apiKey = apiKey; | |
| this.model = model; | |
| this.ws = null; | |
| } | |
| connect() { | |
| return new Promise((resolve, reject) => { | |
| this.ws = new WebSocket('wss://api.stack2.9.openclaw.org/v1/ws/chat'); | |
| this.ws.onopen = () => resolve(); | |
| this.ws.onerror = (e) => reject(e); | |
| }); | |
| } | |
| async sendMessage(messages, onChunk, onComplete) { | |
| await this.connect(); | |
| this.ws.onmessage = (event) => { | |
| const data = JSON.parse(event.data); | |
| if (data.type === 'content') { | |
| onChunk(data.content); | |
| } else if (data.type === 'done') { | |
| onComplete(data); | |
| this.ws.close(); | |
| } else if (data.type === 'error') { | |
| console.error('Error:', data.message); | |
| } | |
| }; | |
| this.ws.send(JSON.stringify({ | |
| type: 'start', | |
| model: this.model, | |
| messages: messages, | |
| temperature: 0.7 | |
| })); | |
| } | |
| } | |
| // Usage | |
| const client = new Stack29WebSocket('your-api-key'); | |
| client.sendMessage( | |
| [{role: 'user', content: 'Write a hello world function'}], | |
| (chunk) => process.stdout.write(chunk), | |
| (final) => console.log('\n\nDone!') | |
| ); | |
| ``` | |
| --- | |
| ## Request/Response Formats | |
| ### Message Format | |
| ```json | |
| { | |
| "role": "user|assistant|system", | |
| "content": "Message content", | |
| "name": "optional-name", | |
| "tool_calls": [], | |
| "tool_call_id": "optional-id" | |
| } | |
| ``` | |
| ### Tool Call Format | |
| ```json | |
| { | |
| "type": "function", | |
| "id": "call_abc123", | |
| "function": { | |
| "name": "execute_code", | |
| "description": "Execute code in a sandboxed environment", | |
| "parameters": { | |
| "type": "object", | |
| "properties": { | |
| "code": { | |
| "type": "string", | |
| "description": "The code to execute" | |
| }, | |
| "language": { | |
| "type": "string", | |
| "description": "Programming language" | |
| } | |
| }, | |
| "required": ["code", "language"] | |
| } | |
| } | |
| } | |
| ``` | |
| ### Tool Call Response Format | |
| ```json | |
| { | |
| "tool_call_id": "call_abc123", | |
| "output": "Hello, World!", | |
| "error": null, | |
| "execution_time_ms": 150 | |
| } | |
| ``` | |
| ### Vision Support (Image Input) | |
| ```json | |
| { | |
| "role": "user", | |
| "content": [ | |
| { | |
| "type": "text", | |
| "text": "What does this code do?" | |
| }, | |
| { | |
| "type": "image_url", | |
| "image_url": { | |
| "url": "https://example.com/screenshot.png", | |
| "detail": "low" | |
| } | |
| } | |
| ] | |
| } | |
| ``` | |
| --- | |
| ## Error Handling | |
| ### Error Response Format | |
| ```json | |
| { | |
| "error": { | |
| "message": "Invalid API key", | |
| "type": "invalid_request_error", | |
| "param": "authorization", | |
| "code": 401 | |
| } | |
| } | |
| ``` | |
| ### Error Codes | |
| | Code | Type | HTTP Status | Description | | |
| |------|------|-------------|-------------| | |
| | `invalid_api_key` | Authentication | 401 | API key is invalid | | |
| | `account_disabled` | Authentication | 403 | Account disabled | | |
| | `rate_limit_exceeded` | Rate Limit | 429 | Too many requests | | |
| | `context_length_exceeded` | Invalid Request | 400 | Context too long | | |
| | `invalid_request` | Invalid Request | 400 | Malformed request | | |
| | `model_not_found` | Invalid Request | 404 | Model doesn't exist | | |
| | `tool_error` | Tool Error | 422 | Tool execution failed | | |
| | `internal_error` | Server Error | 500 | Server-side error | | |
| | `service_unavailable` | Server Error | 503 | Service temporarily down | | |
| ### Handling Errors in Code | |
| ```python | |
| import openai | |
| from openai import APIError, RateLimitError | |
| client = openai.OpenAI(api_key="your-api-key") | |
| try: | |
| response = client.chat.completions.create( | |
| model="qwen/qwen2.5-coder-32b", | |
| messages=[{"role": "user", "content": "Hello"}] | |
| ) | |
| except RateLimitError: | |
| print("Rate limit exceeded. Please wait.") | |
| # Implement backoff logic | |
| except APIError as e: | |
| print(f"API error: {e}") | |
| # Handle error appropriately | |
| ``` | |
| ```javascript | |
| try { | |
| const response = await client.chat.completions.create({ | |
| model: 'qwen/qwen2.5-coder-32b', | |
| messages: [{role: 'user', content: 'Hello'}] | |
| }); | |
| } catch (error) { | |
| if (error.status === 429) { | |
| console.log('Rate limit exceeded'); | |
| } else if (error.status === 401) { | |
| console.log('Invalid API key'); | |
| } else { | |
| console.error('API error:', error.message); | |
| } | |
| } | |
| ``` | |
| --- | |
| ## SDKs and Examples | |
| ### Python SDK | |
| ```bash | |
| pip install openai | |
| ``` | |
| ```python | |
| from openai import OpenAI | |
| client = OpenAI( | |
| api_key="your-api-key", | |
| base_url="https://api.stack2.9.openclaw.org/v1" | |
| ) | |
| # Non-streaming | |
| response = client.chat.completions.create( | |
| model="qwen/qwen2.5-coder-32b", | |
| messages=[ | |
| {"role": "system", "content": "You are a coding assistant."}, | |
| {"role": "user", "content": "Write a Python class for a stack data structure."} | |
| ], | |
| temperature=0.7, | |
| max_tokens=1000 | |
| ) | |
| print(response.choices[0].message.content) | |
| # Streaming | |
| stream = client.chat.completions.create( | |
| model="qwen/qwen2.5-coder-32b", | |
| messages=[{"role": "user", "content": "Explain recursion"}], | |
| stream=True | |
| ) | |
| for chunk in stream: | |
| if chunk.choices[0].delta.content: | |
| print(chunk.choices[0].delta.content, end="", flush=True) | |
| ``` | |
| ### JavaScript/Node.js SDK | |
| ```bash | |
| npm install openai | |
| ``` | |
| ```javascript | |
| import OpenAI from 'openai'; | |
| const client = new OpenAI({ | |
| apiKey: 'your-api-key', | |
| baseURL: 'https://api.stack2.9.openclaw.org/v1' | |
| }); | |
| // Non-streaming | |
| const response = await client.chat.completions.create({ | |
| model: 'qwen/qwen2.5-coder-32b', | |
| messages: [ | |
| {role: 'system', content: 'You are a coding assistant.'}, | |
| {role: 'user', content: 'Write a Python class for a stack data structure.'} | |
| ], | |
| temperature: 0.7, | |
| max_tokens: 1000 | |
| }); | |
| console.log(response.choices[0].message.content); | |
| // Streaming | |
| const stream = await client.chat.completions.create({ | |
| model: 'qwen/qwen2.5-coder-32b', | |
| messages: [{role: 'user', content: 'Explain recursion'}], | |
| stream: true | |
| }); | |
| for await (const chunk of stream) { | |
| process.stdout.write(chunk.choices[0].delta.content || ''); | |
| } | |
| ``` | |
| ### cURL Examples | |
| ```bash | |
| # Basic chat completion | |
| curl -X POST https://api.stack2.9.openclaw.org/v1/chat/completions \ | |
| -H "Authorization: Bearer YOUR_API_KEY" \ | |
| -H "Content-Type: application/json" \ | |
| -d '{ | |
| "model": "qwen/qwen2.5-coder-32b", | |
| "messages": [{"role": "user", "content": "Hello"}] | |
| }' | |
| # Streaming completion | |
| curl -X POST https://api.stack2.9.openclaw.org/v1/chat/completions \ | |
| -H "Authorization: Bearer YOUR_API_KEY" \ | |
| -H "Content-Type: application/json" \ | |
| -d '{"model": "qwen/qwen2.5-coder-32b", "messages": [{"role": "user", "content": "Hello"}], "stream": true}' | |
| # With system prompt | |
| curl -X POST https://api.stack2.9.openclaw.org/v1/chat/completions \ | |
| -H "Authorization: Bearer YOUR_API_KEY" \ | |
| -H "Content-Type: application/json" \ | |
| -d '{ | |
| "model": "qwen/qwen2.5-coder-32b", | |
| "messages": [ | |
| {"role": "system", "content": "You are an expert Python programmer."}, | |
| {"role": "user", "content": "Write a decorator that caches function results."} | |
| ] | |
| }' | |
| # With tools | |
| curl -X POST https://api.stack2.9.openclaw.org/v1/chat/completions \ | |
| -H "Authorization: Bearer YOUR_API_KEY" \ | |
| -H "Content-Type: application/json" \ | |
| -d '{ | |
| "model": "qwen/qwen2.5-coder-32b", | |
| "messages": [{"role": "user", "content": "Create a file called hello.py"}], | |
| "tools": [ | |
| { | |
| "type": "function", | |
| "function": { | |
| "name": "write_file", | |
| "description": "Write content to a file", | |
| "parameters": { | |
| "type": "object", | |
| "properties": { | |
| "path": {"type": "string"}, | |
| "content": {"type": "string"} | |
| } | |
| } | |
| } | |
| } | |
| ] | |
| }' | |
| ``` | |
| ### OpenAI-Compatible Client Usage | |
| Stack 2.9 is compatible with the OpenAI client library: | |
| ```python | |
| # Works with LangChain | |
| from langchain.chat_models import ChatOpenAI | |
| from langchain.schema import HumanMessage | |
| chat = ChatOpenAI( | |
| model="qwen/qwen2.5-coder-32b", | |
| openai_api_base="https://api.stack2.9.openclaw.org/v1", | |
| openai_api_key="your-api-key" | |
| ) | |
| response = chat([HumanMessage(content="Hello!")]) | |
| ``` | |
| --- | |
| ## Webhooks | |
| Configure webhooks for asynchronous events: | |
| ```json | |
| { | |
| "webhook_url": "https://your-server.com/webhook", | |
| "events": [ | |
| "tool_call.started", | |
| "tool_call.completed", | |
| "tool_call.failed", | |
| "generation.done" | |
| ] | |
| } | |
| ``` | |
| ### Webhook Payload | |
| ```json | |
| { | |
| "event": "tool_call.completed", | |
| "timestamp": "2026-04-01T12:00:00Z", | |
| "data": { | |
| "tool_call_id": "call_abc123", | |
| "tool_name": "execute_code", | |
| "execution_time_ms": 150, | |
| "result": "Hello, World!" | |
| } | |
| } | |
| ``` | |
| --- | |
| ## Best Practices | |
| ### 1. Use Streaming for Better UX | |
| Always use streaming for long-form content to provide real-time feedback to users. | |
| ### 2. Implement Proper Error Handling | |
| Always handle rate limits and authentication errors gracefully with exponential backoff. | |
| ### 3. Cache Responses | |
| Cache frequent queries to reduce API calls and improve response times. | |
| ```python | |
| from functools import lru_cache | |
| @lru_cache(maxsize=100) | |
| def cached_completion(prompt_hash): | |
| # Implement caching logic | |
| pass | |
| ``` | |
| ### 4. Use Appropriate Temperature | |
| | Task | Temperature | | |
| |------|-------------| | |
| | Code generation | 0.0 - 0.3 | | |
| | Factual Q&A | 0.0 - 0.2 | | |
| | Creative writing | 0.7 - 1.0 | | |
| | brainstorming | 0.8 - 1.2 | | |
| ### 5. Monitor Token Usage | |
| Track usage to stay within rate limits: | |
| ```python | |
| response = client.chat.completions.create(...) | |
| print(f"Tokens used: {response.usage.total_tokens}") | |
| ``` | |
| --- | |
| ## Support | |
| - **Documentation**: [docs/index.html](docs/index.html) | |
| - **API Status**: [status.stack2.9.openclaw.org](https://status.stack2.9.openclaw.org) | |
| - **Issues**: [GitHub Issues](https://github.com/openclaw/stack-2.9/issues) | |
| - **Email**: api@stack2.9.openclaw.org | |
| --- | |
| **API Version**: 1.0 | |
| **Last Updated**: 2026-04-01 | |
| **Status**: Active | |