Upload folder using huggingface_hub
- .gitattributes +2 -0
- .gitignore +1 -1
- README.md +72 -36
- TB-Guard-XAI.png +3 -0
- qdrant_db/.lock +1 -0
- qdrant_db/collection/tb_medical_knowledge/storage.sqlite +3 -0
- qdrant_db/meta.json +1 -0
.gitattributes
CHANGED
@@ -5,5 +5,7 @@ docs/WHO_Diagnosis_Guidelines.pdf filter=lfs diff=lfs merge=lfs -text
 docs/WHO_Guidelines.pdf filter=lfs diff=lfs merge=lfs -text
 docs/WHO_TB_2025.pdf filter=lfs diff=lfs merge=lfs -text
 docs/WHO_TB_Screening_Module2_2021.pdf filter=lfs diff=lfs merge=lfs -text
+qdrant_db/collection/tb_medical_knowledge/storage.sqlite filter=lfs diff=lfs merge=lfs -text
 static/demo/complex.png filter=lfs diff=lfs merge=lfs -text
 static/demo/healthy.png filter=lfs diff=lfs merge=lfs -text
+TB-Guard-XAI.png filter=lfs diff=lfs merge=lfs -text
.gitignore
CHANGED
@@ -22,7 +22,7 @@ temp_uploads/
 archive/

 # Qdrant DB
-qdrant_db/
+# qdrant_db/ (Commented out so it uploads to Hugging Face)

 # IDE
 .vscode/
README.md
CHANGED
@@ -1,57 +1,86 @@
----
-title: TB Guard XAI
-emoji: ๐ฅ
-colorFrom: blue
-colorTo: indigo
-sdk: docker
-pinned: false
----
 # 🫁 TB-Guard-XAI: Explainable AI Triage for Mass Tuberculosis Screening

-**Mistral AI Worldwide Hackathon 2026

-TB-Guard-XAI is an

-

-![TB-Guard-XAI Architecture](https://via.placeholder.com/800x400.png?text=Upload+Architecture+Diagram+Here) screening to bridge the massive gap in healthcare personnel.

-**

 ---

-## 🧠

-

-
-
-
-
-

 ---

-##

-
-* **PyTorch CNN Ensemble:** Combines DenseNet121, EfficientNet-B4, and ResNet50 trained across 6 global datasets (Shenzhen, Montgomery, etc.).
-* **Monte Carlo Dropout (Bayesian Uncertainty):** The neural network runs predictions 20 times per image. If it hallucinates or guesses, the mathematical variance flags the result as **"Unreliable - Review Required"**, overriding the base percentage.
-* **Grad-CAM Heatmaps:** Generates a topological color map precisely showing the physician exactly *where* the AI found the infection in the lung.

 ---

-## 🛠️

-### 1.
 Ensure you have Python 3.10+ installed.
 ```bash
-git clone https://github.com/
 cd TB-Guard-XAI
 python -m venv venv
 source venv/bin/activate # On Windows: venv\Scripts\activate
@@ -59,20 +88,27 @@ pip install -r requirements.txt
 ```

 ### 2. Environment Variables
-You need your Mistral API key to run the
 ```env
 MISTRAL_API_KEY=your_mistral_key_here
 ```

-### 3. Run the Server
 ```bash
 python backend.py
 ```
-*Open your browser to `http://127.0.0.1:8000` to access the

 ---

-## Clinical Disclaimer
-**Not for self-diagnosis.** TB-Guard-XAI is an experimental clinical decision-support tool built for

-> Built with ❤️ for

@@ -1,57 +1,86 @@
 # 🫁 TB-Guard-XAI: Explainable AI Triage for Mass Tuberculosis Screening

+**Built for the Mistral AI Worldwide Hackathon 2026**

+> TB-Guard-XAI is an explainable, multimodal clinical triage engine. It combines PyTorch deep learning with Mistral models: Bayesian uncertainty estimation detects when the classifier is guessing, and Grad-CAM heatmaps highlight suspected infection sites. Mistral Vision adds a second opinion, Voxtral transcribes voice notes, and RAG produces WHO-backed clinical reports that MedGemma checks for safety.

+[![Hugging Face Space](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Space-blue)](https://huggingface.co/spaces/mistral-hackaton-2026/TB-Guard-XAI)
+[![YouTube Demo](https://img.shields.io/badge/YouTube-Demo-red)](https://youtu.be/UyxZCp2q7TM)
+[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

+![TB-Guard-XAI Architecture](TB-Guard-XAI.png)

 ---

+## The Clinical Problem
+Tuberculosis kills 1.3 million people annually, with 87% of cases occurring in low-resource settings. The WHO explicitly endorses AI-assisted chest X-ray (CXR) screening to bridge the massive gap in healthcare personnel.

+**The Flaw in Current AI:** Existing medical AI models are *"black boxes"*. They output a single probability (e.g., "95% TB") from a standard softmax function, which leads to **false overconfidence**: shown an unfamiliar anomaly, a conventional classifier will still return a confident diagnosis because it has no way to say, *"I don't know."* They also give no explanation of *why* they made the decision, which makes them unsafe for autonomous triage.
+
+**Our Mission:** Build an AI system that shows *why* it made a decision, calculates *when* it is out of its depth, and orchestrates the Mistral AI ecosystem to explain its reasoning the way a human doctor would.

 ---

+## 🧠 The Architecture & Tech Stack Justification
+TB-Guard-XAI is not a simple wrapper around an LLM. It is a multi-agent pipeline that bridges deterministic deep learning with non-deterministic generative AI.
+
+### 1. The Mistral AI Ecosystem (The Brains)
+We used almost the entire suite of Mistral's latest models, assigning each a specialized agentic role:
+
+* **👁️ Mistral Vision (`mistral-large-latest`): The Second Opinion.**
+    * *Why this?* Instead of relying solely on our PyTorch CNN, we pass the compressed X-ray directly to Mistral Large. It acts as an independent radiologist, cross-checking the regions flagged by the PyTorch model and looking for contextual findings such as lymphadenopathy or cavitation.
+* **🎙️ Voxtral Audio (`voxtral-mini-latest`): Acoustic Context.**
+    * *Why this?* Rural clinics are chaotic, and technicians don't have time to type. Voxtral ingests spoken symptoms ("Patient has night sweats") and transcribes them.
+* **🛡️ Mistral Router (`mistral-small-latest`): The Safety Gatekeeper.**
+    * *Why this?* We use Mistral Small for low-latency, low-cost intent classification. It intercepts the transcribed voice notes: if a patient describes a broken ankle, Mistral Small blocks the query as outside the respiratory domain, keeping the tool within its clinical scope.
+* **Mistral RAG Reasoner (`mistral-large-latest`): Clinical Synthesis.**
+    * *Why this?* Mistral Large has strong native tool-calling. It queries our Qdrant vector database (loaded with WHO TB guidelines) and fuses the RAG evidence, Mistral Vision's visual assessment, and PyTorch's probabilities into a cohesive, structured medical report.
+* **MedGemma: End-of-Line Validation.**
+    * *Why this?* Used as a secondary open-weight safety validator to ensure the final generated advice does not give a definitive medical diagnosis, keeping the tool strictly "decision support."
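
The router's gatekeeping step described above can be sketched as a small function that takes the intent classifier as an injected callable. This is an illustrative sketch only: `respiratory_gate`, `GateResult`, and the label `"respiratory"` are hypothetical names, and `keyword_stub` is an offline stand-in used here in place of a real call to `mistral-small-latest`, which the README does not show.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GateResult:
    allowed: bool
    reason: str


def respiratory_gate(transcript: str, classify: Callable[[str], str]) -> GateResult:
    """Block transcripts that the intent classifier labels as off-domain."""
    label = classify(transcript).strip().lower()
    if label == "respiratory":
        return GateResult(True, "in scope")
    return GateResult(False, f"blocked: classified as '{label}'")


def keyword_stub(text: str) -> str:
    """Offline stand-in for the LLM classifier (assumption, demo only)."""
    terms = ("cough", "sweat", "sputum", "breath", "chest")
    return "respiratory" if any(t in text.lower() for t in terms) else "other"


in_scope = respiratory_gate("Patient has night sweats", keyword_stub)
off_topic = respiratory_gate("Patient has a broken ankle", keyword_stub)
print(in_scope, off_topic)
```

Passing the classifier in as a callable keeps the gate testable offline; the production classifier would presumably wrap a chat-completion request and return a single label.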
+
+### 2. The Deep Learning Engine (The Eyes & The Math)
+Beneath the LLMs lies a computer vision pipeline designed for explainability.
+
+* **Convolutional Neural Network (CNN) Ensemble**
+    * *What is it?* A parallel architecture combining DenseNet121, EfficientNet-B4, and ResNet50.
+    * *Why average them?* A single model inherits the biases of its training data. Ensembling three different architectures reduces the blind spots of any one of them. The models are also trained on 6 global CXR datasets (Shenzhen, Montgomery, etc.) to improve generalization across populations and anatomy.
+* **Bayesian Deep Learning: Monte Carlo (MC) Dropout**
+    * *What is it?* The core of our safety mechanism. A standard classifier evaluates an image once. MC Dropout makes our network evaluate the same X-ray **20 times**, randomly turning off ("dropping out") different neurons on each pass.
+    * *Why use it?* If the model is recognizing true TB features, the 20 predictions will be nearly identical (low variance). If it is guessing on an anomalous image, the 20 predictions will disagree widely (high variance). When high variance is detected, the system overrides the probability and flags **"Unreliable - Human Review Required,"** so the clinic is not misled by false AI confidence.
+* **Explainable AI: Grad-CAM (Gradient-weighted Class Activation Mapping)**
+    * *What is it?* An algorithm that propagates the class score backwards through the CNN to find which image regions most strongly drove the "Tuberculosis" prediction.
+    * *Why use it?* It generates a heatmap overlaid on the X-ray. Doctors don't have to trust the AI blindly; they can see what the AI is looking at.
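
The MC Dropout rule described above can be sketched without PyTorch. The following is a minimal illustration, not the project's implementation: a single logistic unit stands in for the CNN ensemble, and the function names, the 0.3 dropout rate, and the 0.15 standard-deviation threshold are assumptions chosen for the example. Only the 20 passes and the "unreliable" override come from the README.

```python
import math
import random
import statistics


def stochastic_forward(features, weights, bias, drop_p, rng):
    """One forward pass of a logistic unit with dropout left ON at inference."""
    keep = 1.0 - drop_p
    z = bias
    for x, w in zip(features, weights):
        if rng.random() < keep:       # this input survives the pass
            z += (x * w) / keep       # inverted-dropout rescaling
    return 1.0 / (1.0 + math.exp(-z))


def summarize_passes(probs, std_threshold=0.15):
    """Turn repeated predictions into a mean, a spread, and a reliability flag."""
    std = statistics.pstdev(probs)
    status = ("Unreliable - Human Review Required"
              if std > std_threshold else "Reliable")
    return {"mean": statistics.fmean(probs), "std": std, "status": status}


def mc_dropout_predict(features, weights, bias, passes=20, drop_p=0.3,
                       std_threshold=0.15, seed=0):
    """Run `passes` stochastic forward passes and summarize their agreement."""
    rng = random.Random(seed)
    probs = [stochastic_forward(features, weights, bias, drop_p, rng)
             for _ in range(passes)]
    return summarize_passes(probs, std_threshold)


# Every surviving input pushes the same way, so all passes agree.
confident = mc_dropout_predict([3.0] * 4, [2.0] * 4, bias=4.0)
print(confident["status"], round(confident["mean"], 3))
```

The flag depends only on the spread of the passes, not on their mean, which is what lets a high but unstable probability be overridden.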
+
+### 3. The Infrastructure Pipeline
+* **FastAPI (Backend):** Chosen over Flask/Django for its asynchronous request handling, which matters when PyTorch inference, Mistral tool-calling, and audio processing run concurrently.
+* **Qdrant (Vector Database):** Chosen over Pinecone/Milvus for its local-deployment mode and fast dense vector search, which serves our WHO RAG context.
+* **Vanilla HTML/JS + Tailwind (Frontend):** We deliberately avoided heavy React/Next.js frameworks so the UI can run on low-end, low-RAM hospital registry computers with minimal dependencies.
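
The concurrency argument for an async backend can be illustrated with the standard library alone. This is a sketch under stated assumptions: `run_cnn_ensemble`, `transcribe_voice`, and `triage` are hypothetical stand-ins (the sleeps simulate work, and the 0.5 score is a placeholder), not functions from `backend.py`.

```python
import asyncio
import time


def run_cnn_ensemble(image_id: str) -> dict:
    """Stand-in for blocking, CPU-bound model inference."""
    time.sleep(0.05)
    return {"image": image_id, "tb_probability": 0.5}  # placeholder value


async def transcribe_voice(note_id: str) -> str:
    """Stand-in for an awaitable network call to a speech-to-text service."""
    await asyncio.sleep(0.05)
    return f"transcript for {note_id}"


async def triage(image_id: str, note_id: str) -> dict:
    # Blocking inference runs in a worker thread so the event loop stays free;
    # the network-bound call is awaited alongside it.
    scores, transcript = await asyncio.gather(
        asyncio.to_thread(run_cnn_ensemble, image_id),
        transcribe_voice(note_id),
    )
    return {"scores": scores, "transcript": transcript}


result = asyncio.run(triage("cxr-001", "note-001"))
print(result)
```

The design point is that an `async` endpoint only helps if blocking work is moved off the event loop; calling model inference directly inside a coroutine would stall every other request.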

+---

+## 💡 Key Features at a Glance
+* **Drag-and-Drop X-Ray Analysis** with Bayesian uncertainty bounds.
+* **Mistral Vision Multimodal Verification** embedded in the UI.
+* **Voice-Activated Clinical Context** powered by Voxtral.
+* **Grad-CAM Heatmap Visualizations.**
+* **Built-in AI Respiratory Chatbot.**
+* **One-Click Printable PDF Triage Reports** for lab handover.
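
The Grad-CAM feature listed above reduces to a small amount of arithmetic once the last convolutional layer's activations and the gradients of the class score are available. A minimal sketch of that combination step in plain Python follows; the function name and the toy 2x2 inputs are illustrative assumptions, and in the real pipeline both arrays would presumably be captured from the PyTorch models with hooks.

```python
def grad_cam(activations, gradients):
    """Combine K feature maps (K x H x W) into one normalized H x W heatmap."""
    height, width = len(activations[0]), len(activations[0][0])
    # Channel weight = global average of the class-score gradient over that map.
    weights = [sum(map(sum, g)) / (height * width) for g in gradients]
    cam = [[0.0] * width for _ in range(height)]
    for w, fmap in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                cam[i][j] += w * fmap[i][j]
    # ReLU keeps only the regions that push the class score up.
    cam = [[max(v, 0.0) for v in row] for row in cam]
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam


# Two toy channels: the first supports the class, the second opposes it.
activations = [[[1.0, 0.0], [0.0, 0.0]],
               [[0.0, 0.0], [0.0, 2.0]]]
gradients = [[[1.0, 1.0], [1.0, 1.0]],
             [[-1.0, -1.0], [-1.0, -1.0]]]
heatmap = grad_cam(activations, gradients)
print(heatmap)
```

The resulting map has the resolution of the convolutional layer, so it is upsampled to the X-ray's size before being drawn as an overlay; it marks influential regions rather than individual pixels.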

 ---

+## Live Deployment
+TB-Guard-XAI is packaged and deployed on **Hugging Face Spaces**. You can run the live demo, upload X-rays, record voice notes, and test clinical queries directly in the cloud.

+**[Launch TB-Guard-XAI on Hugging Face](https://huggingface.co/spaces/mistral-hackaton-2026/TB-Guard-XAI)**

 ---

+## 🛠️ Run It Locally

+### 1. Setup & Install
 Ensure you have Python 3.10+ installed.
 ```bash
+git clone https://github.com/vignesh19032005/TB-Guard-XAI.git
 cd TB-Guard-XAI
 python -m venv venv
 source venv/bin/activate # On Windows: venv\Scripts\activate
@@ -59,20 +88,27 @@ pip install -r requirements.txt
 ```

 ### 2. Environment Variables
+You need a Mistral API key to run the inference pipelines. Create a `.env` file in the root directory:
 ```env
 MISTRAL_API_KEY=your_mistral_key_here
 ```
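
Before starting the server it can help to confirm that the key was actually filled in. The sketch below parses `KEY=VALUE` lines with the standard library and rejects the placeholder; `parse_env` and `require_key` are hypothetical helpers, and how `backend.py` itself loads the `.env` file is not shown in this commit.

```python
def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_key(env: dict, name: str = "MISTRAL_API_KEY") -> str:
    """Return the key, or fail loudly if it is absent or still the placeholder."""
    value = env.get(name, "")
    if not value or value == "your_mistral_key_here":
        raise RuntimeError(f"{name} is missing or still set to the placeholder")
    return value


sample = "# local secrets\nMISTRAL_API_KEY=abc123\n"
print(require_key(parse_env(sample)))
```

In practice the text would come from reading the `.env` file in the project root, for example with `pathlib.Path(".env").read_text()`.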

+### 3. Run the Local FastAPI Server
 ```bash
 python backend.py
 ```
+*Open your browser to `http://127.0.0.1:8000` to access the full UI.*
+
+---
+
+## 🔮 What's Next for TB-Guard-XAI?
+- **Automated PACS Watcher:** We are building an offline background folder-watcher that automatically ingests and triages batches of X-rays exported to hospital local drives.
+- **Continuous Learning Loop:** Human-in-the-loop validation where physicians can correct Mistral's output via the UI, feeding the verified data back into the underlying ensemble.
+- **DICOM Support:** Moving from PNG parsing to native DICOM/HL7 support for true hospital system interoperability.

 ---

+### Clinical Disclaimer
+**Not for self-diagnosis.** TB-Guard-XAI is an experimental clinical decision-support tool built specifically for the **Mistral AI Worldwide Hackathon 2026** demonstration. It is designed to assist trained medical technicians as a primary triage filter. All positive and uncertain results must be followed by confirmatory sputum Xpert MTB/RIF or culture tests in accordance with WHO and local guidelines.

+> *Built with ❤️ for Mistral AI. Code by Vignesh.*

TB-Guard-XAI.png
ADDED
qdrant_db/.lock
ADDED
@@ -0,0 +1 @@
+tmp lock file
qdrant_db/collection/tb_medical_knowledge/storage.sqlite
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5b21de16e725482e2ebacdae28a487df38a4ad66f63a8ce2baa7e083ebbc12e0
+size 8118272
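
The three added lines are a Git LFS pointer, not the SQLite file itself: the repository stores the spec version, a SHA-256 object ID, and the byte size, while the 8,118,272-byte database lives in LFS storage. A minimal sketch of reading such a pointer and checking a downloaded blob against it follows; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Split 'key value' lines of a Git LFS pointer into typed fields."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    return {"version": fields["version"], "algo": algo,
            "digest": digest, "size": int(fields["size"])}


def matches_pointer(blob: bytes, pointer: dict) -> bool:
    """True if the blob has the size and SHA-256 digest the pointer records."""
    return (len(blob) == pointer["size"]
            and hashlib.sha256(blob).hexdigest() == pointer["digest"])


POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:5b21de16e725482e2ebacdae28a487df38a4ad66f63a8ce2baa7e083ebbc12e0
size 8118272
"""
info = parse_lfs_pointer(POINTER)
print(info["algo"], info["size"])
```

A clone made without Git LFS installed leaves this pointer text in place of `storage.sqlite`, so a size check like the one above is a quick way to tell whether the real database was fetched.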
qdrant_db/meta.json
ADDED
@@ -0,0 +1 @@
+{"collections": {"tb_medical_knowledge": {"vectors": {"size": 1024, "distance": "Cosine", "hnsw_config": null, "quantization_config": null, "on_disk": null, "datatype": null, "multivector_config": null}, "shard_number": null, "sharding_method": null, "replication_factor": null, "write_consistency_factor": null, "on_disk_payload": null, "hnsw_config": null, "wal_config": null, "optimizers_config": null, "quantization_config": null, "sparse_vectors": null, "strict_mode_config": null, "metadata": null}}, "aliases": {}}
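
The metadata above declares one collection, `tb_medical_knowledge`, holding 1024-dimensional vectors compared by cosine distance. The sketch below reads those two settings and shows what cosine similarity measures; `META` is a trimmed copy of the file, the helper names are hypothetical, and the toy vectors are not real embeddings (the embedding model is not named in this commit).

```python
import json
import math

# Trimmed copy of qdrant_db/meta.json, keeping only the fields used here.
META = ('{"collections": {"tb_medical_knowledge": '
        '{"vectors": {"size": 1024, "distance": "Cosine"}}}}')


def vector_params(meta_text: str, collection: str) -> dict:
    """Return the vector settings declared for one collection."""
    return json.loads(meta_text)["collections"][collection]["vectors"]


def cosine_similarity(a, b) -> float:
    """Dot product divided by the product of the two vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


params = vector_params(META, "tb_medical_knowledge")
size = params["size"]
query = [1.0] + [0.0] * (size - 1)
same_direction = [2.0] + [0.0] * (size - 1)
orthogonal = [0.0, 1.0] + [0.0] * (size - 2)
print(cosine_similarity(query, same_direction), cosine_similarity(query, orthogonal))
```

Because cosine similarity ignores vector length, any query embedded for this collection must come from a model that outputs 1024 dimensions, but its magnitude does not affect ranking.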