metadata
title: ForgeSight
emoji: ๐
colorFrom: red
colorTo: gray
sdk: gradio
sdk_version: 5.29.1
app_file: app.py
pinned: true
license: mit
short_description: Multimodal QC Copilot on AMD MI300X + ROCm
tags:
- amd
- rocm
- mi300x
- qwen
- vllm
- quality-control
- agents
๐ ForgeSight โ Multimodal QC Copilot on AMD Instinctโข MI300X
ForgeSight is a production-ready Agentic Quality Control (QC) Pipeline designed for high-throughput manufacturing environments. Built exclusively for the AMD + lablab.ai Developer Hackathon, it leverages the massive 192GB VRAM of the AMD Instinct MI300X to run a state-of-the-art multimodal multi-agent workflow.
๐ Key Features
- Multimodal Reasoning: Uses Qwen2-VL-7B to "see" and understand complex assembly line defects in a single forward pass.
- 4-Agent Pipeline: Chained reasoning workflow:
- Inspector โ Identifies surface defects, anomalies, and violations.
- Diagnostician โ Performs industry-literate root-cause analysis.
- Action โ Generates prioritized work orders and tool checklists.
- Reporter โ Summarizes findings into human-readable executive reports.
- MI300X Optimized: Served via vLLM on ROCm, utilizing continuous batching and paged attention for near-instant inference.
- Audit-Ready: Generates downloadable PDF QC Audit Reports for every inspection.
- Persistent Data: Integrated with MongoDB Atlas for long-term defect tracking and telemetry history.
๐๏ธ Technical Architecture
graph TD
A[React Dashboard] --> B[FastAPI Gateway]
B --> C[Gradio Admin Console]
B --> D[4-Agent Pipeline]
D --> E[AMD MI300X Inference Server]
E --> F[vLLM / ROCm]
F --> G[Qwen2-VL-7B-Instruct]
B --> H[MongoDB Atlas]
B --> I[PDF Generator]
Stack
- Hardware: AMD Instinct MI300X (192GB HBM3)
- Software: ROCm 6.2, PyTorch 2.4, vLLM
- Frontend: React 18, Tailwind CSS, Recharts
- Backend: FastAPI, Gradio, Python 3.10
๐ ๏ธ Installation & Setup
- Clone the Repo:
git clone https://github.com/rasali535/hans.git - Install Deps:
pip install -r requirements.txt - Configure Environment: Set
AMD_INFERENCE_URLandAMD_INFERENCE_TOKENin your.env. - Launch:
python hf_space/app.py
๐ Performance on AMD
The MI300X's 5.3 TB/s bandwidth allows ForgeSight to maintain >2500 tokens/sec throughput, enabling real-time visual inspection of high-speed manufacturing lines without the latency typical of cloud-based VLM APIs.
Built by Hans for the AMD Developer Hackathon.