Rox-Turbo committed on
Commit 8816398 · verified · 1 parent(s): 26b4c91

Delete AGENT_USAGE.md

Files changed (1):
  1. AGENT_USAGE.md +0 -177
AGENT_USAGE.md DELETED
@@ -1,177 +0,0 @@
## AI Agent Integration Guide

This document explains how any AI agent (or app) should use the NVIDIA proxy API exposed by this service. The proxy keeps the real `NVIDIA_API_KEY` on the backend so that the agent never needs to handle or see the key directly.

---

### 1. Base URL

- **Local development**: `http://localhost:8000`
- **Hugging Face Space** (production): `https://Rox-Turbo-API.hf.space`

All endpoints below are relative to this base URL.

---
### 2. Chat endpoint (recommended for applications)

- **HTTP method**: `POST`
- **Path**: `/chat`
- **Description**: General chat/completions endpoint, similar to OpenAI Chat Completions.

**Request JSON (with optional system prompt):**

```json
{
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant that answers briefly."
    },
    {
      "role": "user",
      "content": "Your question here"
    }
  ],
  "temperature": 1.0,
  "top_p": 1.0,
  "max_tokens": 1024
}
```

- `messages`:
  - Array of objects with `role` (`"user"`, `"assistant"`, or `"system"`) and `content` (string).
  - Include one or more **`system` messages at the start** to control behavior (system prompting).
  - Append `user` and `assistant` messages to maintain conversation history.
- `temperature`, `top_p`, and `max_tokens` are optional; if omitted, defaults are used.

**Response JSON:**

```json
{
  "content": "Model reply text..."
}
```

- `content`: the full generated reply from the model as a single string.

**Notes for agents:**

- No API key or auth header is required; the proxy handles credentials.
- Handle HTTP error codes:
  - `400–499`: client-side issues (invalid body, etc.).
  - `500`: internal error talking to upstream.
  - `502`: bad response from the upstream provider.
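The conversation-history rule above (seed with a `system` message, then append alternating `user` and `assistant` turns) can be sketched as a small Python helper. This is an illustrative sketch, not part of the proxy itself; the helper names (`build_chat_payload`, `chat_turn`) are made up here, and only the request/response shapes come from this document:

```python
import requests

BASE_URL = "https://Rox-Turbo-API.hf.space"

def build_chat_payload(history, temperature=1.0, top_p=1.0, max_tokens=1024):
    """Assemble the /chat request body from the running conversation."""
    return {
        "messages": history,
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
    }

def chat_turn(history, user_text):
    """Append a user turn, call /chat, append the assistant reply, return it."""
    history.append({"role": "user", "content": user_text})
    resp = requests.post(f"{BASE_URL}/chat", json=build_chat_payload(history), timeout=60)
    resp.raise_for_status()
    reply = resp.json()["content"]
    history.append({"role": "assistant", "content": reply})
    return reply

if __name__ == "__main__":
    # Seed with a system prompt; every later turn reuses the same list,
    # so the model sees the full context each time.
    history = [{"role": "system", "content": "You are a helpful assistant that answers briefly."}]
    print(chat_turn(history, "Hello, who are you?"))
    print(chat_turn(history, "Summarize your last answer in one line."))
```

Because the whole `history` list is re-sent on every call, the agent (not the proxy) owns conversation state.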
---

### 3. Hugging Face–style endpoint (for HF tools)

- **HTTP method**: `POST`
- **Path**: `/hf/generate`
- **Description**: Hugging Face text-generation–style interface (`inputs` + `parameters`).

**Request JSON:**

```json
{
  "inputs": "Prompt text here",
  "parameters": {
    "temperature": 0.7,
    "top_p": 0.95,
    "max_new_tokens": 256
  }
}
```

- `parameters` is optional; unspecified values fall back to sensible defaults.

**Response JSON:**

```json
[
  {
    "generated_text": "Model reply text..."
  }
]
```

- This shape matches what many Hugging Face clients expect from a text-generation endpoint.
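The worked examples later in this guide cover only `/chat`; a parallel Python sketch for `/hf/generate`, assuming exactly the request and response shapes shown above (the helper names are illustrative, not part of the service):

```python
import requests

BASE_URL = "https://Rox-Turbo-API.hf.space"

def build_hf_payload(prompt, **parameters):
    """Build the HF-style body; omit "parameters" entirely when none are given."""
    payload = {"inputs": prompt}
    if parameters:
        payload["parameters"] = parameters  # e.g. temperature, top_p, max_new_tokens
    return payload

def hf_generate(prompt, **parameters):
    """POST to /hf/generate and unwrap the one-element response list."""
    resp = requests.post(
        f"{BASE_URL}/hf/generate",
        json=build_hf_payload(prompt, **parameters),
        timeout=60,
    )
    resp.raise_for_status()
    # Response is [{"generated_text": "..."}], so take the first element.
    return resp.json()[0]["generated_text"]

if __name__ == "__main__":
    print(hf_generate("Prompt text here", temperature=0.7, top_p=0.95, max_new_tokens=256))
```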
---

### 4. Example usage from a browser (static frontend)

**JavaScript (fetch) example using `/chat`:**

```js
const API_URL = "https://Rox-Turbo-API.hf.space/chat";

async function sendMessage(messageText) {
  const body = {
    messages: [{ role: "user", content: messageText }],
    temperature: 1,
    top_p: 1,
    max_tokens: 1024
  };

  const res = await fetch(API_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body)
  });

  if (!res.ok) {
    throw new Error(`API error: ${res.status} ${await res.text()}`);
  }

  const data = await res.json();
  return data.content; // model reply text
}
```
---

### 5. Example usage from a Python agent

**Python example using `requests` and `/chat`:**

```python
import requests

BASE_URL = "https://Rox-Turbo-API.hf.space"

def ask_model(messages, temperature=1.0, top_p=1.0, max_tokens=1024):
    url = f"{BASE_URL}/chat"
    payload = {
        "messages": messages,
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
    }

    response = requests.post(url, json=payload, timeout=60)
    response.raise_for_status()
    data = response.json()
    return data["content"]


if __name__ == "__main__":
    reply = ask_model([{"role": "user", "content": "Hello, who are you?"}])
    print("Model reply:", reply)
```
---

### 6. Responsibilities and guarantees

- The proxy:
  - Maintains and protects the `NVIDIA_API_KEY` on the server side.
  - Handles communication with the NVIDIA OpenAI-compatible endpoint.
  - Normalizes responses into simple JSON formats (`content` and `generated_text`).
- The agent:
  - Only needs HTTPS access to the proxy.
  - Must handle standard HTTP errors and implement retries or fallback behavior as needed.
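The agent-side retry requirement above can be sketched as follows. The backoff schedule and the decision to retry only on `500`/`502` (the server-side codes this guide documents) are assumptions made for illustration, not guarantees of the proxy:

```python
import time

import requests

BASE_URL = "https://Rox-Turbo-API.hf.space"

def should_retry(status_code):
    """Retry only the documented server-side failures; 4xx means fix the request."""
    return status_code in (500, 502)

def ask_with_retries(messages, attempts=3, base_delay=2.0):
    """Call /chat, retrying transient upstream failures with exponential backoff."""
    payload = {"messages": messages}
    for attempt in range(attempts):
        resp = requests.post(f"{BASE_URL}/chat", json=payload, timeout=60)
        if resp.ok:
            return resp.json()["content"]
        if not should_retry(resp.status_code) or attempt == attempts - 1:
            resp.raise_for_status()  # surface the error to the caller
        time.sleep(base_delay * (2 ** attempt))  # 2s, 4s, 8s, ...
```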

Use this document as the single source of truth for how your AI agent should call the NVIDIA proxy API in this project.