Rox-Turbo committed · verified
Commit eb8b637 · Parent(s): 1e057a5

Create AGENT_USAGE.md
## AI Agent Integration Guide

This document explains how any AI agent (or app) should use the NVIDIA proxy API exposed by this service. The proxy keeps the real `NVIDIA_API_KEY` on the backend so that the agent never needs to handle or see the key directly.

---

### 1. Base URL

- **Local development**: `http://localhost:8000`
- **Hugging Face Space** (production): `https://Rox-Turbo-API.hf.space`

All endpoints below are relative to this base URL.

---

### 2. Chat endpoint (recommended for applications)

- **HTTP method**: `POST`
- **Path**: `/chat`
- **Description**: General chat/completions endpoint, similar to OpenAI Chat Completions.

**Request JSON:**

```json
{
  "messages": [
    { "role": "user", "content": "Your question here" }
  ],
  "temperature": 1.0,
  "top_p": 1.0,
  "max_tokens": 1024
}
```

- `messages`:
  - Array of objects with `role` (`"user"`, `"assistant"`, or `"system"`) and `content` (string).
  - The agent should include conversation history if it wants the model to be aware of context.
- `temperature`, `top_p`, and `max_tokens` are optional; if omitted, defaults are used.

**Response JSON:**

```json
{
  "content": "Model reply text..."
}
```

- `content`: the full generated reply from the model as a single string.

**Notes for agents:**

- No API key or auth header is required; the proxy handles credentials.
- Handle HTTP error codes:
  - `400–499`: client-side issues (invalid body, etc.).
  - `500`: internal error talking to upstream.
  - `502`: bad response from upstream provider.

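The error-code guidance above can be turned into a small retry helper. This is a minimal sketch using only the Python standard library; the names `should_retry` and `chat_with_retry` and the linear backoff policy are illustrative choices, not part of the service's API:

```python
import json
import time
import urllib.error
import urllib.request

BASE_URL = "https://Rox-Turbo-API.hf.space"  # from the Base URL section


def should_retry(status: int) -> bool:
    # Retry only on the upstream-related errors listed above; a 4xx status
    # means the request itself is wrong, so retrying will not help.
    return status in (500, 502)


def chat_with_retry(messages, retries=3, backoff=2.0):
    payload = json.dumps({"messages": messages}).encode("utf-8")
    req = urllib.request.Request(
        f"{BASE_URL}/chat",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    for attempt in range(retries):
        try:
            with urllib.request.urlopen(req, timeout=60) as res:
                return json.loads(res.read())["content"]
        except urllib.error.HTTPError as err:
            if not should_retry(err.code) or attempt == retries - 1:
                raise
            time.sleep(backoff * (attempt + 1))  # simple linear backoff
```

Because the proxy needs no auth header, the only failure modes the agent has to plan for are these HTTP statuses and ordinary network timeouts.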
---

### 3. Hugging Face–style endpoint (for HF tools)

- **HTTP method**: `POST`
- **Path**: `/hf/generate`
- **Description**: Hugging Face text-generation–style interface (`inputs` + `parameters`).

**Request JSON:**

```json
{
  "inputs": "Prompt text here",
  "parameters": {
    "temperature": 0.7,
    "top_p": 0.95,
    "max_new_tokens": 256
  }
}
```

- `parameters` is optional; unspecified values fall back to sensible defaults.

**Response JSON:**

```json
[
  {
    "generated_text": "Model reply text..."
  }
]
```

- This shape matches what many Hugging Face clients expect from a text-generation endpoint.

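To make the request and response shapes above concrete, here is a minimal Python sketch. The helper names `build_hf_request` and `extract_generated_text` are illustrative, not part of the service; the resulting dict can be POSTed to `/hf/generate` the same way section 5 posts to `/chat`:

```python
def build_hf_request(prompt, temperature=0.7, top_p=0.95, max_new_tokens=256):
    # Mirrors the request JSON above; "parameters" is optional server-side,
    # so any of these values could be omitted entirely.
    return {
        "inputs": prompt,
        "parameters": {
            "temperature": temperature,
            "top_p": top_p,
            "max_new_tokens": max_new_tokens,
        },
    }


def extract_generated_text(response_json):
    # The endpoint returns a list containing one {"generated_text": ...}
    # object, matching the HF text-generation convention.
    return response_json[0]["generated_text"]
```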
---

### 4. Example usage from a browser (static frontend)

**JavaScript (fetch) example using `/chat`:**

```js
const API_URL = "https://Rox-Turbo-API.hf.space/chat";

async function sendMessage(messageText) {
  const body = {
    messages: [{ role: "user", content: messageText }],
    temperature: 1,
    top_p: 1,
    max_tokens: 1024
  };

  const res = await fetch(API_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body)
  });

  if (!res.ok) {
    throw new Error(`API error: ${res.status} ${await res.text()}`);
  }

  const data = await res.json();
  return data.content; // model reply text
}
```

---

### 5. Example usage from a Python agent

**Python example using `requests` and `/chat`:**

```python
import requests

BASE_URL = "https://Rox-Turbo-API.hf.space"

def ask_model(messages, temperature=1.0, top_p=1.0, max_tokens=1024):
    url = f"{BASE_URL}/chat"
    payload = {
        "messages": messages,
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
    }

    response = requests.post(url, json=payload, timeout=60)
    response.raise_for_status()
    data = response.json()
    return data["content"]


if __name__ == "__main__":
    reply = ask_model([{"role": "user", "content": "Hello, who are you?"}])
    print("Model reply:", reply)
```

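Since `/chat` expects the caller to supply conversation history (see section 2), a multi-turn exchange can be sketched as below. `append_turn` is an illustrative helper, not part of the API, and the commented-out lines stand in for the network call made by `ask_model` in the example above:

```python
def append_turn(history, role, content):
    # Returns a new list so earlier turns stay unchanged between calls.
    return history + [{"role": role, "content": content}]


history = []
history = append_turn(history, "system", "You are a helpful assistant.")
history = append_turn(history, "user", "Hello, who are you?")
# reply = ask_model(history)                        # network call to /chat
# history = append_turn(history, "assistant", reply)  # keep context for turn 2
```

Resending the accumulated `history` on every request is what gives the model awareness of earlier turns; the proxy itself stores no conversation state.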
---

### 6. Responsibilities and guarantees

- The proxy:
  - Maintains and protects the `NVIDIA_API_KEY` on the server side.
  - Handles communication with the NVIDIA OpenAI-compatible endpoint.
  - Normalizes responses into simple JSON formats (`content` and `generated_text`).
- The agent:
  - Only needs HTTPS access to the proxy.
  - Must handle standard HTTP errors and implement retries or fallback behavior as needed.

Use this document as the single source of truth for how your AI agent should call the NVIDIA proxy API in this project.