Rox-Turbo committed on
Commit 8816398 · verified · 1 parent(s): 26b4c91

Delete AGENT_USAGE.md

Files changed (1):
  1. AGENT_USAGE.md +0 -177
AGENT_USAGE.md DELETED
@@ -1,177 +0,0 @@
## AI Agent Integration Guide

This document explains how any AI agent (or app) should use the NVIDIA proxy API exposed by this service. The proxy keeps the real `NVIDIA_API_KEY` on the backend so that the agent never needs to handle or see the key directly.

---

### 1. Base URL

- **Local development**: `http://localhost:8000`
- **Hugging Face Space** (production): `https://Rox-Turbo-API.hf.space`

All endpoints below are relative to this base URL.

---
### 2. Chat endpoint (recommended for applications)

- **HTTP method**: `POST`
- **Path**: `/chat`
- **Description**: General chat/completions endpoint, similar to OpenAI Chat Completions.

**Request JSON (with optional system prompt):**

```json
{
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant that answers briefly."
    },
    {
      "role": "user",
      "content": "Your question here"
    }
  ],
  "temperature": 1.0,
  "top_p": 1.0,
  "max_tokens": 1024
}
```

- `messages`:
  - Array of objects with `role` (`"user"`, `"assistant"`, or `"system"`) and `content` (string).
  - Include one or more **`system` messages at the start** to control behavior (system prompting).
  - Append `user` and `assistant` messages to maintain conversation history.
- `temperature`, `top_p`, and `max_tokens` are optional; if omitted, defaults are used.

**Response JSON:**

```json
{
  "content": "Model reply text..."
}
```

- `content`: the full generated reply from the model as a single string.

**Notes for agents:**

- No API key or auth header is required; the proxy handles credentials.
- Handle HTTP error codes:
  - `400–499`: client-side issues (invalid body, etc.).
  - `500`: internal error talking to upstream.
  - `502`: bad response from the upstream provider.
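The conversation-history rule above (seed with a `system` message, then append alternating `user` and `assistant` turns) can be sketched as a small Python helper. This is an illustrative sketch, not part of the proxy itself; the helper names (`build_chat_payload`, `chat_turn`) are made up here, and only the request/response shapes come from this document:

```python
import requests

BASE_URL = "https://Rox-Turbo-API.hf.space"

def build_chat_payload(history, temperature=1.0, top_p=1.0, max_tokens=1024):
    """Assemble the /chat request body from the running conversation."""
    return {
        "messages": history,
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
    }

def chat_turn(history, user_text):
    """Append a user turn, call /chat, append the assistant reply, return it."""
    history.append({"role": "user", "content": user_text})
    resp = requests.post(f"{BASE_URL}/chat", json=build_chat_payload(history), timeout=60)
    resp.raise_for_status()
    reply = resp.json()["content"]
    history.append({"role": "assistant", "content": reply})
    return reply

if __name__ == "__main__":
    # Seed with a system prompt; every later turn reuses the same list,
    # so the model sees the full context each time.
    history = [{"role": "system", "content": "You are a helpful assistant that answers briefly."}]
    print(chat_turn(history, "Hello, who are you?"))
    print(chat_turn(history, "Summarize your last answer in one line."))
```

Because the whole `history` list is re-sent on every call, the agent (not the proxy) owns conversation state.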
---

### 3. Hugging Face–style endpoint (for HF tools)

- **HTTP method**: `POST`
- **Path**: `/hf/generate`
- **Description**: Hugging Face text-generation–style interface (`inputs` + `parameters`).

**Request JSON:**

```json
{
  "inputs": "Prompt text here",
  "parameters": {
    "temperature": 0.7,
    "top_p": 0.95,
    "max_new_tokens": 256
  }
}
```

- `parameters` is optional; unspecified values fall back to sensible defaults.

**Response JSON:**

```json
[
  {
    "generated_text": "Model reply text..."
  }
]
```

- This shape matches what many Hugging Face clients expect from a text-generation endpoint.
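The worked examples later in this guide cover only `/chat`; a parallel Python sketch for `/hf/generate`, assuming exactly the request and response shapes shown above (the helper names are illustrative, not part of the service):

```python
import requests

BASE_URL = "https://Rox-Turbo-API.hf.space"

def build_hf_payload(prompt, **parameters):
    """Build the HF-style body; omit "parameters" entirely when none are given."""
    payload = {"inputs": prompt}
    if parameters:
        payload["parameters"] = parameters  # e.g. temperature, top_p, max_new_tokens
    return payload

def hf_generate(prompt, **parameters):
    """POST to /hf/generate and unwrap the one-element response list."""
    resp = requests.post(
        f"{BASE_URL}/hf/generate",
        json=build_hf_payload(prompt, **parameters),
        timeout=60,
    )
    resp.raise_for_status()
    # Response is [{"generated_text": "..."}], so take the first element.
    return resp.json()[0]["generated_text"]

if __name__ == "__main__":
    print(hf_generate("Prompt text here", temperature=0.7, top_p=0.95, max_new_tokens=256))
```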
---

### 4. Example usage from a browser (static frontend)

**JavaScript (fetch) example using `/chat`:**

```js
const API_URL = "https://Rox-Turbo-API.hf.space/chat";

async function sendMessage(messageText) {
  const body = {
    messages: [{ role: "user", content: messageText }],
    temperature: 1,
    top_p: 1,
    max_tokens: 1024
  };

  const res = await fetch(API_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body)
  });

  if (!res.ok) {
    throw new Error(`API error: ${res.status} ${await res.text()}`);
  }

  const data = await res.json();
  return data.content; // model reply text
}
```
---

### 5. Example usage from a Python agent

**Python example using `requests` and `/chat`:**

```python
import requests

BASE_URL = "https://Rox-Turbo-API.hf.space"

def ask_model(messages, temperature=1.0, top_p=1.0, max_tokens=1024):
    url = f"{BASE_URL}/chat"
    payload = {
        "messages": messages,
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
    }

    response = requests.post(url, json=payload, timeout=60)
    response.raise_for_status()
    data = response.json()
    return data["content"]


if __name__ == "__main__":
    reply = ask_model([{"role": "user", "content": "Hello, who are you?"}])
    print("Model reply:", reply)
```
---

### 6. Responsibilities and guarantees

- The proxy:
  - Maintains and protects the `NVIDIA_API_KEY` on the server side.
  - Handles communication with the NVIDIA OpenAI-compatible endpoint.
  - Normalizes responses into simple JSON formats (`content` and `generated_text`).
- The agent:
  - Only needs HTTPS access to the proxy.
  - Must handle standard HTTP errors and implement retries or fallback behavior as needed.
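The agent-side retry requirement above can be sketched as follows. The backoff schedule and the decision to retry only on `500`/`502` (the server-side codes this guide documents) are assumptions made for illustration, not guarantees of the proxy:

```python
import time

import requests

BASE_URL = "https://Rox-Turbo-API.hf.space"

def should_retry(status_code):
    """Retry only the documented server-side failures; 4xx means fix the request."""
    return status_code in (500, 502)

def ask_with_retries(messages, attempts=3, base_delay=2.0):
    """Call /chat, retrying transient upstream failures with exponential backoff."""
    payload = {"messages": messages}
    for attempt in range(attempts):
        resp = requests.post(f"{BASE_URL}/chat", json=payload, timeout=60)
        if resp.ok:
            return resp.json()["content"]
        if not should_retry(resp.status_code) or attempt == attempts - 1:
            resp.raise_for_status()  # surface the error to the caller
        time.sleep(base_delay * (2 ** attempt))  # 2s, 4s, 8s, ...
```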

Use this document as the single source of truth for how your AI agent should call the NVIDIA proxy API in this project.