Upload README.md with huggingface_hub
Browse files
README.md
CHANGED
|
@@ -19,29 +19,51 @@ tags:
|
|
| 19 |
- agents
|
| 20 |
---
|
| 21 |
|
| 22 |
-
# π ForgeSight β Multimodal
|
| 23 |
|
| 24 |
-
ForgeSight
|
| 25 |
-
diagnoses root cause, drafts work orders, and publishes reports β fine-tuned
|
| 26 |
-
on **Qwen2-VL** and served on **AMD Instinct MI300X** via ROCm + vLLM.
|
| 27 |
|
| 28 |
-
##
|
| 29 |
|
| 30 |
-
|
| 31 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 32 |
```
|
| 33 |
|
| 34 |
-
###
|
|
|
|
|
|
|
|
|
|
|
|
|
| 35 |
|
| 36 |
-
|
| 37 |
-
2. **Diagnostician** β Root-cause analysis
|
| 38 |
-
3. **Action** β Work order generation
|
| 39 |
-
4. **Reporter** β Human-readable summary
|
| 40 |
|
| 41 |
-
|
|
|
|
|
|
|
|
|
|
| 42 |
|
| 43 |
-
|
| 44 |
-
|
| 45 |
-
- **Track 3**: Multimodal vision (Qwen2-VL)
|
| 46 |
|
| 47 |
-
|
|
|
|
|
|
| 19 |
- agents
|
| 20 |
---
|
| 21 |
|
| 22 |
+
# π ForgeSight β Multimodal QC Copilot on AMD Instinctβ’ MI300X
|
| 23 |
|
| 24 |
+
ForgeSight is a production-ready **Agentic Quality Control (QC) Pipeline** designed for high-throughput manufacturing environments. Built exclusively for the **AMD + lablab.ai Developer Hackathon**, it leverages the massive 192GB VRAM of the **AMD Instinct MI300X** to run a state-of-the-art multimodal multi-agent workflow.
|
|
|
|
|
|
|
| 25 |
|
| 26 |
+
## π Key Features
|
| 27 |
|
| 28 |
+
* **Multimodal Reasoning**: Uses **Qwen2-VL-7B** to "see" and understand complex assembly line defects in a single forward pass.
|
| 29 |
+
* **4-Agent Pipeline**: Chained reasoning workflow:
|
| 30 |
+
1. **Inspector** β Identifies surface defects, anomalies, and violations.
|
| 31 |
+
2. **Diagnostician** β Performs industry-literate root-cause analysis.
|
| 32 |
+
3. **Action** β Generates prioritized work orders and tool checklists.
|
| 33 |
+
4. **Reporter** β Summarizes findings into human-readable executive reports.
|
| 34 |
+
* **MI300X Optimized**: Served via **vLLM on ROCm**, utilizing continuous batching and paged attention for near-instant inference.
|
| 35 |
+
* **Audit-Ready**: Generates downloadable **PDF QC Audit Reports** for every inspection.
|
| 36 |
+
* **Persistent Data**: Integrated with **MongoDB Atlas** for long-term defect tracking and telemetry history.
|
| 37 |
+
|
| 38 |
+
## ποΈ Technical Architecture
|
| 39 |
+
|
| 40 |
+
```mermaid
|
| 41 |
+
graph TD
|
| 42 |
+
A[React Dashboard] --> B[FastAPI Gateway]
|
| 43 |
+
B --> C[Gradio Admin Console]
|
| 44 |
+
B --> D[4-Agent Pipeline]
|
| 45 |
+
D --> E[AMD MI300X Inference Server]
|
| 46 |
+
E --> F[vLLM / ROCm]
|
| 47 |
+
F --> G[Qwen2-VL-7B-Instruct]
|
| 48 |
+
B --> H[MongoDB Atlas]
|
| 49 |
+
B --> I[PDF Generator]
|
| 50 |
```
|
| 51 |
|
| 52 |
+
### Stack
|
| 53 |
+
- **Hardware**: AMD Instinct MI300X (192GB HBM3)
|
| 54 |
+
- **Software**: ROCm 6.2, PyTorch 2.4, vLLM
|
| 55 |
+
- **Frontend**: React 18, Tailwind CSS, Recharts
|
| 56 |
+
- **Backend**: FastAPI, Gradio, Python 3.10
|
| 57 |
|
| 58 |
+
## π οΈ Installation & Setup
|
|
|
|
|
|
|
|
|
|
| 59 |
|
| 60 |
+
1. **Clone the Repo**: `git clone https://github.com/rasali535/hans.git`
|
| 61 |
+
2. **Install Deps**: `pip install -r requirements.txt`
|
| 62 |
+
3. **Configure Environment**: Set `AMD_INFERENCE_URL` and `AMD_INFERENCE_TOKEN` in your `.env`.
|
| 63 |
+
4. **Launch**: `python hf_space/app.py`
|
| 64 |
|
| 65 |
+
## π Performance on AMD
|
| 66 |
+
The MI300X's 5.3 TB/s bandwidth allows ForgeSight to maintain **>2500 tokens/sec** throughput, enabling real-time visual inspection of high-speed manufacturing lines without the latency typical of cloud-based VLM APIs.
|
|
|
|
| 67 |
|
| 68 |
+
---
|
| 69 |
+
Built by **Hans** for the **AMD Developer Hackathon**.
|