Update AMD inference endpoint and token to 165.245.137.80
README.md
CHANGED
@@ -26,8 +26,8 @@ tags:
 ### ⚡ Live Status (Hackathon Mode)
 - **Primary Inference**: AMD Instinct MI300X (192GB VRAM)
 - **Backend**: FastAPI + vLLM on ROCm
-- **Current Server**: `165.245.143.46` (vLLM via Token Auth)
 - **Status**: ✅ **ONLINE** (Live Inference Active)
+- **Current Server**: `165.245.137.80` (vLLM via Token Auth)
 
 > **AMD + lablab.ai Hackathon** — Track 2 (AMD Developer Cloud) · Track 1 (AI Agents) · Track 3 (Vision & Multimodal AI)
 
agents.py
CHANGED
@@ -19,13 +19,13 @@ import httpx # async HTTP — lightweight, no extra deps beyond requirements
 # Or use the Jupyter proxy route: http://165.245.143.46/proxy/8000
 AMD_INFERENCE_URL = os.environ.get(
     "AMD_INFERENCE_URL",
-    "http://165.245.
+    "http://165.245.137.80"
 ).rstrip("/")
 
 # Token for the AMD inference server (if required)
 AMD_INFERENCE_TOKEN = os.environ.get(
     "AMD_INFERENCE_TOKEN",
-    "
+    "DiPipPSZoxb96rcrP7X+B0N5mTTEzxU/ziesgI/Z2NPo9xPKM"
 )
 
 # The model name vLLM is serving (used in the chat/completions request).
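The change above keeps the token as a literal fallback in `agents.py`. As a rough sketch of an alternative, the endpoint and token could be read from the environment with no secret default, and the request headers built only when a token is present. The names `InferenceConfig`, `load_inference_config`, and `chat_completions_url` are hypothetical and not part of this repository. The `Bearer` header scheme and the `/v1/chat/completions` path are assumptions about how the vLLM server is exposed, so adjust them if the deployment differs.

```python
import os
from typing import Mapping, NamedTuple, Optional


class InferenceConfig(NamedTuple):
    """Hypothetical container for the resolved endpoint and request headers."""
    base_url: str
    headers: dict


def load_inference_config(env: Optional[Mapping[str, str]] = None) -> InferenceConfig:
    """Read the endpoint and token from the environment, with no secret default."""
    env = os.environ if env is None else env
    # The URL fallback is the address from this commit; the token has no fallback.
    base_url = env.get("AMD_INFERENCE_URL", "http://165.245.137.80").rstrip("/")
    token = env.get("AMD_INFERENCE_TOKEN", "").strip()
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumption: the server accepts a standard Bearer token header.
        headers["Authorization"] = f"Bearer {token}"
    return InferenceConfig(base_url, headers)


def chat_completions_url(config: InferenceConfig) -> str:
    # Assumption: the OpenAI-compatible route is mounted at /v1/chat/completions.
    return f"{config.base_url}/v1/chat/completions"
```

With this shape, the token would be supplied through the deployment's secret settings as `AMD_INFERENCE_TOKEN`, and a server move like the one in this commit would only need the `AMD_INFERENCE_URL` variable (or the single fallback string) updated.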