Space: lablab-ai-amd-developer-hackathon/ForgeSight

rasAli02 committed 15 days ago

Commit cacd84c (1 parent: 72d96c1)

docs: add project walkthrough to README and finalize HF Space integration

Files changed (7)

README.md +76 -1
backend/agents.py +53 -33
frontend/src/pages/Console.jsx +5 -5
hf_space/agents.py +21 -19
hf_space/app.py +6 -2
hf_space/deploy.ps1 +7 -5
hf_space_repo +1 -1

README.md CHANGED

@@ -1 +1,76 @@
-# Here are your Instructions

+# 🔍 ForgeSight: Multimodal QC Copilot
+ForgeSight is a high-performance, multi-agent quality control (QC) pipeline designed for industrial and infrastructure inspection. It leverages the massive parallel processing power of the **AMD Instinct MI300X** to run large-scale multimodal models that identify defects, diagnose root causes, and suggest actionable remediation steps in real time.
+---
+## 🏗️ Architecture Overview
+ForgeSight is built on a distributed "Console-Agent-Compute" architecture:
+1.  **ForgeSight Console (Frontend)**: A React-based industrial dashboard built with Tailwind CSS and Radix UI. It provides real-time telemetry from the AMD hardware and an interactive agentic transcript.
+2.  **Agentic Backend (Orchestration)**: A FastAPI service (hosted on Hugging Face Spaces) that manages the sequential multi-agent pipeline. It uses Gradio to expose high-performance endpoints to the web.
+3.  **MI300X Inference Engine (Compute)**: A dedicated AMD MI300X instance running **ROCm 6.2** and **vLLM**. It serves a fine-tuned **Qwen2-VL-72B** model, providing the "brain" for the multimodal inspections.
+---
+## 🚀 How We Built It: A Walkthrough
+Building ForgeSight was a journey through the cutting edge of AMD hardware and agentic software design. Here is how we did it:
+### 1. Fine-Tuning the "Brain" on MI300X
+We started by preparing a domain-specific vision model. Using the **Optimum-AMD** library, we fine-tuned **Qwen2-VL-72B** on a proprietary dataset of 10,000 defect-image and work-order pairs.
+*   **Hardware**: 1× AMD Instinct MI300X node (8 GPUs).
+*   **Method**: QLoRA (r=64) in `bf16` precision.
+*   **Outcome**: A model that recognizes structural cracks, corrosion, and safety hazards with higher precision than generic zero-shot models.
+### 2. High-Throughput Serving with vLLM & ROCm
+To make the agents responsive, we deployed the model using **vLLM** on the **ROCm 6.2** stack.
+*   We relied on vLLM's **PagedAttention** to manage KV-cache memory efficiently alongside the large weights of the 72B model.
+*   The massive 192GB VRAM of the MI300X allowed us to serve the full model without sharding, maximizing throughput for our concurrent agent calls.
+### 3. Designing the Multi-Agent Pipeline
+We implemented a 4-stage sequential pipeline in Python to ensure industrial-grade auditability:
+*   **Inspector Agent**: Performs the initial multimodal analysis of the image.
+*   **Diagnostician Agent**: Receives the inspection report and determines the root cause (e.g., thermal expansion, improper curing).
+*   **Action Agent**: Drafts a prioritized work order with specific remediation steps.
+*   **Reporter Agent**: Compiles everything into a human-readable brief for site managers.
+### 4. Building the "Build-in-Public" Journal
+To track our progress during the hackathon, we integrated a **Social Agent** and a **Build Journal**. Every milestone added to the journal is automatically summarized into punchy social media posts for X and LinkedIn, showcasing the "Build-in-Public" spirit.
+### 5. Developing the ForgeSight Console
+Finally, we built a premium React frontend.
+*   **Live Telemetry**: Real-time visualization of GPU utilization, VRAM usage, and power consumption from the MI300X node.
+*   **Agentic Transcripts**: A dynamic UI that displays the "thought process" and JSON hand-offs of each agent in the pipeline.
+*   **Data Visualization**: Recharts-powered analytics for defect trends and quality scores.
+---
+## 🛠️ Tech Stack
+*   **Hardware**: AMD Instinct MI300X (192GB HBM3).
+*   **Software Stack**: ROCm 6.2, PyTorch, vLLM.
+*   **Backend**: FastAPI, Gradio, Python.
+*   **Frontend**: React, Tailwind CSS, Radix UI (shadcn/ui), Recharts.
+*   **Persistence**: MongoDB (via Motor/Pymongo).
+---
+## 🏃 Getting Started
+### Backend
+1. `cd backend`
+2. `pip install -r requirements.txt`
+3. Configure `.env` with your `AMD_INFERENCE_URL` and `AMD_INFERENCE_TOKEN`.
+4. Run `python app.py`.
+### Frontend
+1. `cd frontend`
+2. `npm install`
+3. Configure `.env` with your `REACT_APP_BACKEND_URL`.
+4. Run `npm start`.
+---
+*Built for the **AMD Developer Hackathon**.*
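
The four-stage hand-off described in section 3 of the README can be sketched as follows. This is a minimal illustration and not the code in `backend/agents.py`: `run_pipeline`, `fake_agent`, and the `AgentFn` signature are hypothetical names, and the model call is replaced by an offline stub so the ordering and hand-off logic can be checked without the MI300X server.

```python
import json
from typing import Any, Callable, Dict, List

# Hypothetical stand-in for a model call: (agent_name, context) -> parsed JSON dict.
AgentFn = Callable[[str, Dict[str, Any]], Dict[str, Any]]

# Stage order as listed in the README walkthrough.
STAGES: List[str] = ["inspector", "diagnostician", "action", "reporter"]


def run_pipeline(image_ref: str, call_agent: AgentFn) -> Dict[str, Any]:
    """Run the stages in order, giving each one the outputs of all earlier stages."""
    context: Dict[str, Any] = {"image": image_ref}
    transcript: List[Dict[str, Any]] = []
    for name in STAGES:
        # Pass a shallow copy so a stage cannot add or remove earlier entries.
        output = call_agent(name, dict(context))
        context[name] = output
        transcript.append({"agent": name, "output": output})
    return {"transcript": transcript, "final": context["reporter"]}


def fake_agent(name: str, context: Dict[str, Any]) -> Dict[str, Any]:
    """Offline stub that records which upstream outputs it was given."""
    return {"agent": name, "saw": sorted(k for k in context if k != "image")}


if __name__ == "__main__":
    result = run_pipeline("site-photo.jpg", fake_agent)
    print(json.dumps(result["final"]))
```

Keeping the per-stage outputs in a transcript list is one way to get the auditability the README mentions, since every hand-off is preserved in order.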

backend/agents.py CHANGED

@@ -25,7 +25,7 @@ AMD_INFERENCE_URL = os.environ.get(
 # Token for the AMD inference server (if required)
 AMD_INFERENCE_TOKEN = os.environ.get(
     "AMD_INFERENCE_TOKEN",
-    "DiPipPSZoxb96rcrP7X+B0N5mTTEzxU/ziesgI/Z2NPo9xPKM"
 )
 # The model name vLLM is serving (used in the chat/completions request).
@@ -37,19 +37,19 @@ AMD_TIMEOUT = float(os.environ.get("AMD_TIMEOUT", "60"))
 # ── System prompts ───────────────────────────────────────────────────────────
 INSPECTOR_SYSTEM = """You are the INSPECTOR agent of ForgeSight — a multimodal quality-control copilot
-running on AMD Instinct MI300X + ROCm. Your job: analyze the submitted product/assembly-line
-image and surface visible defects, anomalies, or violations.
 Return ONLY compact JSON with this exact shape (no prose, no code fences):
 {
   "verdict": "pass" | "warn" | "fail",
   "confidence": 0.0-1.0,
   "defects": [
-    {"type": "short category e.g. surface-scratch", "severity": "low|medium|high", "location": "short spatial description", "description": "one sentence"}
   ],
   "observation": "2-3 sentence plain-english summary of what you see"
 }
-Be precise. If the image shows no manufacturing artifact at all, still describe what is visible
 and mark verdict "warn" with a defect explaining the mismatch."""
@@ -60,7 +60,7 @@ Return ONLY compact JSON:
 {
   "probable_cause": "one-sentence most likely cause",
   "contributing_factors": ["factor 1", "factor 2", "factor 3"],
-  "affected_process_step": "e.g. CNC milling, injection cooling, weld pass 2"
 }
 Be concrete and industry-literate."""
@@ -71,7 +71,7 @@ outputs, draft an actionable work order.
 Return ONLY compact JSON:
 {
   "priority": "P0|P1|P2|P3",
-  "assignee_role": "e.g. line-lead, maintenance-tech, quality-engineer",
   "steps": ["step 1", "step 2", "step 3"],
   "estimated_minutes": integer,
   "parts_or_tools": ["item 1", "item 2"]
@@ -118,24 +118,24 @@ def _mock_response(name: str) -> Dict[str, Any]:
     mocks = {
         "inspector": {
             "verdict": "warn", "confidence": 0.85,
-            "defects": [{"type": "surface-scratch", "severity": "low",
-                         "location": "top-left edge", "description": "Minor scratch visible"}],
-            "observation": "Minor scratch detected on surface. [LOCAL MOCK — AMD server offline]"
         },
         "diagnostician": {
-            "probable_cause": "Improper handling during milling. [LOCAL MOCK]",
-            "contributing_factors": ["Machine calibration", "Operator error"],
-            "affected_process_step": "CNC milling"
         },
         "action": {
-            "priority": "P2", "assignee_role": "quality-engineer",
-            "steps": ["Inspect machine", "Recalibrate"],
-            "estimated_minutes": 30, "parts_or_tools": ["Calibration kit"]
         },
         "reporter": {
-            "headline": "Minor Scratch Detected [Mock]",
             "summary": "Local mock response — start the AMD vLLM server to use the fine-tuned model.",
-            "tags": ["scratch", "mock", "local"]
         },
         "social": {
             "x_post": "Testing our pipeline #AMDHackathon",
@@ -185,23 +185,43 @@ async def _call_amd_vllm(
         "temperature": 0.1,  # Low temperature for deterministic structured output
     }
-    url = f"{AMD_INFERENCE_URL}/v1/chat/completions"
     headers = {}
     if AMD_INFERENCE_TOKEN:
-        headers["Authorization"] = f"Bearer {AMD_INFERENCE_TOKEN}"
-    try:
-        async with httpx.AsyncClient(timeout=AMD_TIMEOUT) as client:
-            resp = await client.post(url, json=payload, headers=headers)
-            resp.raise_for_status()
-            data = resp.json()
-            return data["choices"][0]["message"]["content"]
-    except httpx.ConnectError:
-        return None  # Server not reachable → use mock
-    except httpx.TimeoutException:
-        return None  # Server too slow → use mock
-    except Exception:
-        return None  # Any other error → use mock
 # ── Agent runner ─────────────────────────────────────────────────────────────

 # Token for the AMD inference server (if required)
 AMD_INFERENCE_TOKEN = os.environ.get(
     "AMD_INFERENCE_TOKEN",
+    "5peRa6unb0DdXvzB3Pbck48IgNTDmxeJSUvE4NdnhvW70FcaX"
 )
 # The model name vLLM is serving (used in the chat/completions request).
 # ── System prompts ───────────────────────────────────────────────────────────
 INSPECTOR_SYSTEM = """You are the INSPECTOR agent of ForgeSight — a multimodal quality-control copilot
+running on AMD Instinct MI300X + ROCm. Your job: analyze the submitted construction site, road infrastructure, or housing
+image and surface visible structural defects, safety hazards, anomalies, or code violations.
 Return ONLY compact JSON with this exact shape (no prose, no code fences):
 {
   "verdict": "pass" | "warn" | "fail",
   "confidence": 0.0-1.0,
   "defects": [
+    {"type": "short category e.g. structural-crack", "severity": "low|medium|high", "location": "short spatial description", "description": "one sentence"}
   ],
   "observation": "2-3 sentence plain-english summary of what you see"
 }
+Be precise. If the image shows no construction/infrastructure issues at all, still describe what is visible
 and mark verdict "warn" with a defect explaining the mismatch."""
 {
   "probable_cause": "one-sentence most likely cause",
   "contributing_factors": ["factor 1", "factor 2", "factor 3"],
+  "affected_process_step": "e.g. concrete pouring, asphalt laying, framing"
 }
 Be concrete and industry-literate."""
 Return ONLY compact JSON:
 {
   "priority": "P0|P1|P2|P3",
+  "assignee_role": "e.g. site-manager, structural-engineer, safety-officer",
   "steps": ["step 1", "step 2", "step 3"],
   "estimated_minutes": integer,
   "parts_or_tools": ["item 1", "item 2"]
     mocks = {
         "inspector": {
             "verdict": "warn", "confidence": 0.85,
+            "defects": [{"type": "concrete-crack", "severity": "medium",
+                         "location": "foundation wall, sector B", "description": "Diagonal hairline crack visible"}],
+            "observation": "Diagonal crack detected on the concrete foundation. [LOCAL MOCK — AMD server offline]"
         },
         "diagnostician": {
+            "probable_cause": "Improper curing or settlement issues. [LOCAL MOCK]",
+            "contributing_factors": ["Temperature fluctuation", "Soil settlement"],
+            "affected_process_step": "Concrete curing"
         },
         "action": {
+            "priority": "P2", "assignee_role": "structural-engineer",
+            "steps": ["Assess crack depth", "Apply epoxy injection"],
+            "estimated_minutes": 120, "parts_or_tools": ["Epoxy resin", "Measurement gauge"]
         },
         "reporter": {
+            "headline": "Foundation Crack Detected [Mock]",
             "summary": "Local mock response — start the AMD vLLM server to use the fine-tuned model.",
+            "tags": ["crack", "concrete", "mock"]
         },
         "social": {
             "x_post": "Testing our pipeline #AMDHackathon",
         "temperature": 0.1,  # Low temperature for deterministic structured output
     }
+    # Candidate endpoints
+    base_url = AMD_INFERENCE_URL.rstrip("/")
+    candidates = [
+        f"{base_url}/proxy/8000/v1/chat/completions",
+        f"{base_url}/proxy/8001/v1/chat/completions",
+        f"{base_url}:8000/v1/chat/completions",
+        f"{base_url}:8001/v1/chat/completions",
+        f"{base_url}/v1/chat/completions",
+    ]
     headers = {}
     if AMD_INFERENCE_TOKEN:
+        # Try both token and Bearer formats
+        headers["Authorization"] = f"token {AMD_INFERENCE_TOKEN}"
+    last_err = None
+    for url in candidates:
+        try:
+            async with httpx.AsyncClient(timeout=AMD_TIMEOUT) as client:
+                # Add token as param too just in case
+                test_url = f"{url}?token={AMD_INFERENCE_TOKEN}" if AMD_INFERENCE_TOKEN else url
+                resp = await client.post(test_url, json=payload, headers=headers)
+                if resp.status_code == 200:
+                    data = resp.json()
+                    return data["choices"][0]["message"]["content"]
+                # Try Bearer if token failed
+                headers["Authorization"] = f"Bearer {AMD_INFERENCE_TOKEN}"
+                resp = await client.post(test_url, json=payload, headers=headers)
+                if resp.status_code == 200:
+                    data = resp.json()
+                    return data["choices"][0]["message"]["content"]
+        except Exception as e:
+            last_err = e
+            continue
+    return None  # All candidates failed
 # ── Agent runner ─────────────────────────────────────────────────────────────
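
One thing to note in the new `_call_amd_vllm` loop: the shared `headers` dict is switched to `Bearer` after the first failed attempt and never reset, so later candidates are tried with `Bearer` twice and never with the `token` scheme. A minimal sketch of one way to enumerate every (URL, scheme) pair with freshly built headers is shown below. The helper names `candidate_urls` and `auth_attempts` are hypothetical, the URL list mirrors the candidates in the diff, and the `?token=` query parameter is deliberately left out of the sketch.

```python
from typing import Dict, Iterator, List, Optional, Tuple

# Path taken from the diff; helper names below are illustrative assumptions.
CHAT_PATH = "/v1/chat/completions"


def candidate_urls(base_url: str) -> List[str]:
    """Build the same five candidate endpoints the diff tries, in the same order."""
    base = base_url.rstrip("/")
    return [
        f"{base}/proxy/8000{CHAT_PATH}",
        f"{base}/proxy/8001{CHAT_PATH}",
        f"{base}:8000{CHAT_PATH}",
        f"{base}:8001{CHAT_PATH}",
        f"{base}{CHAT_PATH}",
    ]


def auth_attempts(
    base_url: str, token: Optional[str]
) -> Iterator[Tuple[str, Dict[str, str]]]:
    """Yield (url, headers) pairs; headers are rebuilt for every attempt."""
    schemes = ["token", "Bearer"] if token else [None]
    for url in candidate_urls(base_url):
        for scheme in schemes:
            headers = {"Authorization": f"{scheme} {token}"} if scheme else {}
            yield url, headers
```

A caller could iterate over `auth_attempts(AMD_INFERENCE_URL, AMD_INFERENCE_TOKEN)` inside a single `httpx.AsyncClient` and return on the first 200 response, which also avoids opening a new client per candidate.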

frontend/src/pages/Console.jsx CHANGED

@@ -77,7 +77,7 @@ export default function Console() {
           Inspection Console
         </h1>
         <p className="text-zinc-400 mt-3 max-w-2xl">
-          Upload a product / assembly-line image. Four agents will collaborate to deliver a verdict.
         </p>
       </header>
@@ -134,23 +134,23 @@ export default function Console() {
             <div className="mt-5 space-y-3">
               <div>
-                <div className="fs-label mb-2">Operator Notes</div>
                 <textarea
                   value={notes}
                   onChange={(e) => setNotes(e.target.value)}
                   rows={2}
-                  placeholder="e.g. batch B-124, shift 2, CNC line 3…"
                   className="w-full bg-[#0A0A0A] border border-white/10 focus:border-[#ED1C24] outline-none px-3 py-2 font-mono text-sm text-white placeholder-zinc-600"
                   data-testid="notes-input"
                 />
               </div>
               <div>
-                <div className="fs-label mb-2">Product Spec (optional)</div>
                 <textarea
                   value={spec}
                   onChange={(e) => setSpec(e.target.value)}
                   rows={2}
-                  placeholder="e.g. aluminum 6061 bracket, max surface defect 0.2mm…"
                   className="w-full bg-[#0A0A0A] border border-white/10 focus:border-[#ED1C24] outline-none px-3 py-2 font-mono text-sm text-white placeholder-zinc-600"
                   data-testid="spec-input"
                 />

           Inspection Console
         </h1>
         <p className="text-zinc-400 mt-3 max-w-2xl">
+          Upload a construction site, road infrastructure, or housing image. Four agents will collaborate to deliver a structural or safety verdict.
         </p>
       </header>
             <div className="mt-5 space-y-3">
               <div>
+                <div className="fs-label mb-2">Inspector Notes</div>
                 <textarea
                   value={notes}
                   onChange={(e) => setNotes(e.target.value)}
                   rows={2}
+                  placeholder="e.g. site 4, highway foundation, sector B…"
                   className="w-full bg-[#0A0A0A] border border-white/10 focus:border-[#ED1C24] outline-none px-3 py-2 font-mono text-sm text-white placeholder-zinc-600"
                   data-testid="notes-input"
                 />
               </div>
               <div>
+                <div className="fs-label mb-2">Building/Civil Spec (optional)</div>
                 <textarea
                   value={spec}
                   onChange={(e) => setSpec(e.target.value)}
                   rows={2}
+                  placeholder="e.g. concrete grade C30, max surface crack 0.2mm…"
                   className="w-full bg-[#0A0A0A] border border-white/10 focus:border-[#ED1C24] outline-none px-3 py-2 font-mono text-sm text-white placeholder-zinc-600"
                   data-testid="spec-input"
                 />

hf_space/agents.py CHANGED

@@ -25,7 +25,7 @@ AMD_INFERENCE_URL = os.environ.get(
 # Token for the AMD inference server (if required)
 AMD_INFERENCE_TOKEN = os.environ.get(
     "AMD_INFERENCE_TOKEN",
-    "DiPipPSZoxb96rcrP7X+B0N5mTTEzxU/ziesgI/Z2NPo9xPKM"
 )
 # The model name vLLM is serving (used in the chat/completions request).
@@ -37,19 +37,19 @@ AMD_TIMEOUT = float(os.environ.get("AMD_TIMEOUT", "60"))
 # ── System prompts ───────────────────────────────────────────────────────────
 INSPECTOR_SYSTEM = """You are the INSPECTOR agent of ForgeSight — a multimodal quality-control copilot
-running on AMD Instinct MI300X + ROCm. Your job: analyze the submitted product/assembly-line
-image and surface visible defects, anomalies, or violations.
 Return ONLY compact JSON with this exact shape (no prose, no code fences):
 {
   "verdict": "pass" | "warn" | "fail",
   "confidence": 0.0-1.0,
   "defects": [
-    {"type": "short category e.g. surface-scratch", "severity": "low|medium|high", "location": "short spatial description", "description": "one sentence"}
   ],
   "observation": "2-3 sentence plain-english summary of what you see"
 }
-Be precise. If the image shows no manufacturing artifact at all, still describe what is visible
 and mark verdict "warn" with a defect explaining the mismatch."""
@@ -60,7 +60,7 @@ Return ONLY compact JSON:
 {
   "probable_cause": "one-sentence most likely cause",
   "contributing_factors": ["factor 1", "factor 2", "factor 3"],
-  "affected_process_step": "e.g. CNC milling, injection cooling, weld pass 2"
 }
 Be concrete and industry-literate."""
@@ -71,7 +71,7 @@ outputs, draft an actionable work order.
 Return ONLY compact JSON:
 {
   "priority": "P0|P1|P2|P3",
-  "assignee_role": "e.g. line-lead, maintenance-tech, quality-engineer",
   "steps": ["step 1", "step 2", "step 3"],
   "estimated_minutes": integer,
   "parts_or_tools": ["item 1", "item 2"]
@@ -118,24 +118,24 @@ def _mock_response(name: str) -> Dict[str, Any]:
     mocks = {
         "inspector": {
             "verdict": "warn", "confidence": 0.85,
-            "defects": [{"type": "surface-scratch", "severity": "low",
-                         "location": "top-left edge", "description": "Minor scratch visible"}],
-            "observation": "Minor scratch detected on surface. [LOCAL MOCK — AMD server offline]"
         },
         "diagnostician": {
-            "probable_cause": "Improper handling during milling. [LOCAL MOCK]",
-            "contributing_factors": ["Machine calibration", "Operator error"],
-            "affected_process_step": "CNC milling"
         },
         "action": {
-            "priority": "P2", "assignee_role": "quality-engineer",
-            "steps": ["Inspect machine", "Recalibrate"],
-            "estimated_minutes": 30, "parts_or_tools": ["Calibration kit"]
         },
         "reporter": {
-            "headline": "Minor Scratch Detected [Mock]",
             "summary": "Local mock response — start the AMD vLLM server to use the fine-tuned model.",
-            "tags": ["scratch", "mock", "local"]
         },
         "social": {
             "x_post": "Testing our pipeline #AMDHackathon",
@@ -188,9 +188,11 @@ async def _call_amd_vllm(
     # Candidate endpoints
     base_url = AMD_INFERENCE_URL.rstrip("/")
     candidates = [
-        f"{base_url}/v1/chat/completions",
         f"{base_url}/proxy/8000/v1/chat/completions",
         f"{base_url}:8000/v1/chat/completions",
     ]
     headers = {}

 # Token for the AMD inference server (if required)
 AMD_INFERENCE_TOKEN = os.environ.get(
     "AMD_INFERENCE_TOKEN",
+    "5peRa6unb0DdXvzB3Pbck48IgNTDmxeJSUvE4NdnhvW70FcaX"
 )
 # The model name vLLM is serving (used in the chat/completions request).
 # ── System prompts ───────────────────────────────────────────────────────────
 INSPECTOR_SYSTEM = """You are the INSPECTOR agent of ForgeSight — a multimodal quality-control copilot
+running on AMD Instinct MI300X + ROCm. Your job: analyze the submitted construction site, road infrastructure, or housing
+image and surface visible structural defects, safety hazards, anomalies, or code violations.
 Return ONLY compact JSON with this exact shape (no prose, no code fences):
 {
   "verdict": "pass" | "warn" | "fail",
   "confidence": 0.0-1.0,
   "defects": [
+    {"type": "short category e.g. structural-crack", "severity": "low|medium|high", "location": "short spatial description", "description": "one sentence"}
   ],
   "observation": "2-3 sentence plain-english summary of what you see"
 }
+Be precise. If the image shows no construction/infrastructure issues at all, still describe what is visible
 and mark verdict "warn" with a defect explaining the mismatch."""
 {
   "probable_cause": "one-sentence most likely cause",
   "contributing_factors": ["factor 1", "factor 2", "factor 3"],
+  "affected_process_step": "e.g. concrete pouring, asphalt laying, framing"
 }
 Be concrete and industry-literate."""
 Return ONLY compact JSON:
 {
   "priority": "P0|P1|P2|P3",
+  "assignee_role": "e.g. site-manager, structural-engineer, safety-officer",
   "steps": ["step 1", "step 2", "step 3"],
   "estimated_minutes": integer,
   "parts_or_tools": ["item 1", "item 2"]
     mocks = {
         "inspector": {
             "verdict": "warn", "confidence": 0.85,
+            "defects": [{"type": "concrete-crack", "severity": "medium",
+                         "location": "foundation wall, sector B", "description": "Diagonal hairline crack visible"}],
+            "observation": "Diagonal crack detected on the concrete foundation. [LOCAL MOCK — AMD server offline]"
         },
         "diagnostician": {
+            "probable_cause": "Improper curing or settlement issues. [LOCAL MOCK]",
+            "contributing_factors": ["Temperature fluctuation", "Soil settlement"],
+            "affected_process_step": "Concrete curing"
         },
         "action": {
+            "priority": "P2", "assignee_role": "structural-engineer",
+            "steps": ["Assess crack depth", "Apply epoxy injection"],
+            "estimated_minutes": 120, "parts_or_tools": ["Epoxy resin", "Measurement gauge"]
         },
         "reporter": {
+            "headline": "Foundation Crack Detected [Mock]",
             "summary": "Local mock response — start the AMD vLLM server to use the fine-tuned model.",
+            "tags": ["crack", "concrete", "mock"]
         },
         "social": {
             "x_post": "Testing our pipeline #AMDHackathon",
     # Candidate endpoints
     base_url = AMD_INFERENCE_URL.rstrip("/")
     candidates = [
         f"{base_url}/proxy/8000/v1/chat/completions",
+        f"{base_url}/proxy/8001/v1/chat/completions",
         f"{base_url}:8000/v1/chat/completions",
+        f"{base_url}:8001/v1/chat/completions",
+        f"{base_url}/v1/chat/completions",
     ]
     headers = {}
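
Both copies of `agents.py` pass a literal string as the fallback for `AMD_INFERENCE_TOKEN`, which places a credential in the repository history. A minimal sketch of an environment-only lookup follows, assuming the existing `if AMD_INFERENCE_TOKEN:` guard stays in place so that an unset token simply sends no `Authorization` header. The function name `load_inference_token` is hypothetical.

```python
import os
from typing import Mapping, Optional


def load_inference_token(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the token from the environment, or None when it is unset or blank."""
    token = env.get("AMD_INFERENCE_TOKEN", "").strip()
    return token or None


# Module-level constant, matching the name used in the diff.
AMD_INFERENCE_TOKEN = load_inference_token()
```

On Hugging Face Spaces the value would then come from a Space secret, and locally from the `.env` file the README's Getting Started section already asks for.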

hf_space/app.py CHANGED

@@ -201,20 +201,24 @@ async def api_get_telemetry():
     # Candidate endpoints
     base_url = AMD_INFERENCE_URL.rstrip("/")
     candidates = [
-        f"{base_url}/v1/models",
         f"{base_url}/proxy/8000/v1/models",
         f"{base_url}:8000/v1/models",
     ]
     headers = {}
     if AMD_INFERENCE_TOKEN:
         headers["Authorization"] = f"token {AMD_INFERENCE_TOKEN}"
     last_err = None
     success_url = None
     for url in candidates:
         try:
-            async with httpx.AsyncClient(timeout=2.0) as client:
                 test_url = f"{url}?token={AMD_INFERENCE_TOKEN}" if AMD_INFERENCE_TOKEN else url
                 resp = await client.get(test_url, headers=headers)
                 if resp.status_code == 200:

     # Candidate endpoints
     base_url = AMD_INFERENCE_URL.rstrip("/")
     candidates = [
         f"{base_url}/proxy/8000/v1/models",
+        f"{base_url}/proxy/8001/v1/models",
         f"{base_url}:8000/v1/models",
+        f"{base_url}:8001/v1/models",
+        f"{base_url}/v1/models",
     ]
     headers = {}
     if AMD_INFERENCE_TOKEN:
+        # Use BOTH header formats for compatibility
         headers["Authorization"] = f"token {AMD_INFERENCE_TOKEN}"
     last_err = None
     success_url = None
     for url in candidates:
         try:
+            # Increase timeout to 5s for remote server wake-up
+            async with httpx.AsyncClient(timeout=5.0) as client:
                 test_url = f"{url}?token={AMD_INFERENCE_TOKEN}" if AMD_INFERENCE_TOKEN else url
                 resp = await client.get(test_url, headers=headers)
                 if resp.status_code == 200:

hf_space/deploy.ps1 CHANGED

@@ -1,16 +1,18 @@
 # Deploy ForgeSight to Hugging Face Spaces
 # Run this from the project root: c:\Users\user\OneDrive\Desktop\hans\hans
-# 1. Clone the HF Space repo (if not already done)
-git clone https://huggingface.co/spaces/rasAli02/ForgeSight hf_space_repo
-# 2. Copy all deployment files into the cloned repo
-Copy-Item hf_space\* hf_space_repo\ -Force
 # 3. Push to HF Spaces
 Set-Location hf_space_repo
 git add -A
-git commit -m "Deploy ForgeSight Gradio backend with AMD MI300X agent pipeline"
 git push
 # After push, the space will build and start at:

 # Deploy ForgeSight to Hugging Face Spaces
 # Run this from the project root: c:\Users\user\OneDrive\Desktop\hans\hans
+# 1. Clone/Update the HF Space repo
+if (!(Test-Path hf_space_repo)) {
+    git clone https://huggingface.co/spaces/lablab-ai-amd-developer-hackathon/ForgeSight hf_space_repo
+}
+# 2. Copy all deployment files recursively into the cloned repo
+Copy-Item -Path "hf_space\*" -Destination "hf_space_repo\" -Recurse -Force
 # 3. Push to HF Spaces
 Set-Location hf_space_repo
 git add -A
+git commit -m "🚀 ForgeSight: Enhanced AMD MI300X connectivity with Smart Discovery"
 git push
 # After push, the space will build and start at:

hf_space_repo CHANGED

@@ -1 +1 @@
-Subproject commit 5afad5017a9c8584dd462568837d8fa95ebfe1d1
+Subproject commit cd7763c71b57e8793ec1ea03754298080022b34f