Rox-Turbo committed · verified
Commit eb8b637 · Parent(s): 1e057a5

Create AGENT_USAGE.md
## AI Agent Integration Guide

This document explains how any AI agent (or app) should use the NVIDIA proxy API exposed by this service. The proxy keeps the real `NVIDIA_API_KEY` on the backend so that the agent never needs to handle or see the key directly.

---

### 1. Base URL

- **Local development**: `http://localhost:8000`
- **Hugging Face Space** (production): `https://Rox-Turbo-API.hf.space`

All endpoints below are relative to this base URL.

---

### 2. Chat endpoint (recommended for applications)

- **HTTP method**: `POST`
- **Path**: `/chat`
- **Description**: General chat/completions endpoint, similar to OpenAI Chat Completions.

**Request JSON:**

```json
{
  "messages": [
    { "role": "user", "content": "Your question here" }
  ],
  "temperature": 1.0,
  "top_p": 1.0,
  "max_tokens": 1024
}
```

- `messages`:
  - Array of objects with `role` (`"user"`, `"assistant"`, or `"system"`) and `content` (string).
  - The agent should include conversation history if it wants the model to be aware of context.
- `temperature`, `top_p`, and `max_tokens` are optional; if omitted, defaults are used.

**Response JSON:**

```json
{
  "content": "Model reply text..."
}
```

- `content`: the full generated reply from the model as a single string.

**Notes for agents:**

- No API key or auth header is required; the proxy handles credentials.
- Handle HTTP error codes:
  - `400–499`: client-side issues (invalid body, etc.).
  - `500`: internal error talking to upstream.
  - `502`: bad response from upstream provider.

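The error-code guidance above can be turned into a small retry helper. This is a minimal sketch using only the Python standard library; the names `should_retry` and `chat_with_retry` and the linear backoff policy are illustrative choices, not part of the service's API:

```python
import json
import time
import urllib.error
import urllib.request

BASE_URL = "https://Rox-Turbo-API.hf.space"  # from the Base URL section


def should_retry(status: int) -> bool:
    # Retry only on the upstream-related errors listed above; a 4xx status
    # means the request itself is wrong, so retrying will not help.
    return status in (500, 502)


def chat_with_retry(messages, retries=3, backoff=2.0):
    payload = json.dumps({"messages": messages}).encode("utf-8")
    req = urllib.request.Request(
        f"{BASE_URL}/chat",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    for attempt in range(retries):
        try:
            with urllib.request.urlopen(req, timeout=60) as res:
                return json.loads(res.read())["content"]
        except urllib.error.HTTPError as err:
            if not should_retry(err.code) or attempt == retries - 1:
                raise
            time.sleep(backoff * (attempt + 1))  # simple linear backoff
```

Because the proxy needs no auth header, the only failure modes the agent has to plan for are these HTTP statuses and ordinary network timeouts.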
---

### 3. Hugging Face–style endpoint (for HF tools)

- **HTTP method**: `POST`
- **Path**: `/hf/generate`
- **Description**: Hugging Face text-generation–style interface (`inputs` + `parameters`).

**Request JSON:**

```json
{
  "inputs": "Prompt text here",
  "parameters": {
    "temperature": 0.7,
    "top_p": 0.95,
    "max_new_tokens": 256
  }
}
```

- `parameters` is optional; unspecified values fall back to sensible defaults.

**Response JSON:**

```json
[
  {
    "generated_text": "Model reply text..."
  }
]
```

- This shape matches what many Hugging Face clients expect from a text-generation endpoint.

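To make the request and response shapes above concrete, here is a minimal Python sketch. The helper names `build_hf_request` and `extract_generated_text` are illustrative, not part of the service; the resulting dict can be POSTed to `/hf/generate` the same way section 5 posts to `/chat`:

```python
def build_hf_request(prompt, temperature=0.7, top_p=0.95, max_new_tokens=256):
    # Mirrors the request JSON above; "parameters" is optional server-side,
    # so any of these values could be omitted entirely.
    return {
        "inputs": prompt,
        "parameters": {
            "temperature": temperature,
            "top_p": top_p,
            "max_new_tokens": max_new_tokens,
        },
    }


def extract_generated_text(response_json):
    # The endpoint returns a list containing one {"generated_text": ...}
    # object, matching the HF text-generation convention.
    return response_json[0]["generated_text"]
```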
---

### 4. Example usage from a browser (static frontend)

**JavaScript (fetch) example using `/chat`:**

```js
const API_URL = "https://Rox-Turbo-API.hf.space/chat";

async function sendMessage(messageText) {
  const body = {
    messages: [{ role: "user", content: messageText }],
    temperature: 1,
    top_p: 1,
    max_tokens: 1024
  };

  const res = await fetch(API_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body)
  });

  if (!res.ok) {
    throw new Error(`API error: ${res.status} ${await res.text()}`);
  }

  const data = await res.json();
  return data.content; // model reply text
}
```

---

### 5. Example usage from a Python agent

**Python example using `requests` and `/chat`:**

```python
import requests

BASE_URL = "https://Rox-Turbo-API.hf.space"

def ask_model(messages, temperature=1.0, top_p=1.0, max_tokens=1024):
    url = f"{BASE_URL}/chat"
    payload = {
        "messages": messages,
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
    }

    response = requests.post(url, json=payload, timeout=60)
    response.raise_for_status()
    data = response.json()
    return data["content"]


if __name__ == "__main__":
    reply = ask_model([{"role": "user", "content": "Hello, who are you?"}])
    print("Model reply:", reply)
```

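Since `/chat` expects the caller to supply conversation history (see section 2), a multi-turn exchange can be sketched as below. `append_turn` is an illustrative helper, not part of the API, and the commented-out lines stand in for the network call made by `ask_model` in the example above:

```python
def append_turn(history, role, content):
    # Returns a new list so earlier turns stay unchanged between calls.
    return history + [{"role": role, "content": content}]


history = []
history = append_turn(history, "system", "You are a helpful assistant.")
history = append_turn(history, "user", "Hello, who are you?")
# reply = ask_model(history)                        # network call to /chat
# history = append_turn(history, "assistant", reply)  # keep context for turn 2
```

Resending the accumulated `history` on every request is what gives the model awareness of earlier turns; the proxy itself stores no conversation state.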
---

### 6. Responsibilities and guarantees

- The proxy:
  - Maintains and protects the `NVIDIA_API_KEY` on the server side.
  - Handles communication with the NVIDIA OpenAI-compatible endpoint.
  - Normalizes responses into simple JSON formats (`content` and `generated_text`).
- The agent:
  - Only needs HTTPS access to the proxy.
  - Must handle standard HTTP errors and implement retries or fallback behavior as needed.

Use this document as the single source of truth for how your AI agent should call the NVIDIA proxy API in this project.