Vignesh-19 committed d60430e (verified) · 1 parent: 01c4b90

Upload folder using huggingface_hub

.gitattributes CHANGED
@@ -5,5 +5,7 @@ docs/WHO_Diagnosis_Guidelines.pdf filter=lfs diff=lfs merge=lfs -text
  docs/WHO_Guidelines.pdf filter=lfs diff=lfs merge=lfs -text
  docs/WHO_TB_2025.pdf filter=lfs diff=lfs merge=lfs -text
  docs/WHO_TB_Screening_Module2_2021.pdf filter=lfs diff=lfs merge=lfs -text
+ qdrant_db/collection/tb_medical_knowledge/storage.sqlite filter=lfs diff=lfs merge=lfs -text
  static/demo/complex.png filter=lfs diff=lfs merge=lfs -text
  static/demo/healthy.png filter=lfs diff=lfs merge=lfs -text
+ TB-Guard-XAI.png filter=lfs diff=lfs merge=lfs -text
.gitignore CHANGED
@@ -22,7 +22,7 @@ temp_uploads/
  archive/
 
  # Qdrant DB
- qdrant_db/
+ # qdrant_db/ (Commented out so it uploads to Hugging Face)
 
  # IDE
  .vscode/
README.md CHANGED
@@ -1,57 +1,86 @@
- ---
- title: TB Guard XAI
- emoji: 🏥
- colorFrom: blue
- colorTo: indigo
- sdk: docker
- pinned: false
- ---
  # 🫁 TB-Guard-XAI: Explainable AI Triage for Mass Tuberculosis Screening
 
- **Mistral AI Worldwide Hackathon 2026 Submission**
 
- TB-Guard-XAI is an Explainable AI triage and clinical decision-support pipeline designed to automate mass Tuberculosis (TB) screening in low-resource, high-burden environments where trained radiologists are scarce.
 
- Instead of acting as a "black box" that outputs a simple percentage, TB-Guard orchestrates a multi-modal ensemble of **Mistral's Latest AI Models** and **PyTorch Deep Learning** to provide robust, explainable, and localized clinical screening.
 
- ![TB-Guard-XAI Dashboard](demo_dashboard_placeholder.png) <!-- *Replace with actual screenshot of your beautiful UI before submitting!* -->
 
  ---
 
- ## 🚀 The Problem & The Mission
- Tuberculosis kills 1.3 million people annually, with 87% of cases occurring in low-resource settings. The WHO explicitly endorses AI-assisted Chest X-Ray (CXR) screening to bridge the massive gap in healthcare personnel. However, existing AI models are often "black boxes" that clinicians cannot trust.
 
- **Our mission:** Build an AI system that knows *why* it made a decision, knows *when* it is uncertain, and uses Mistral's advanced reasoning to explain its thoughts just like a human doctor would.
 
  ---
 
- ## 🧠 Mistral AI Integration (The Core Stack)
 
- TB-Guard-XAI is deeply integrated into the Mistral ecosystem, utilizing distinct models for specialized clinical agents:
 
- 1. **Mistral Vision (mistral-large-latest)**: Acts as the "Second Opinion Expert." It physically looks at the raw patient X-Ray, compares its findings against the PyTorch Deep Learning coordinates, and verifies abnormalities (Cavities, Opacities).
- 2. **Mistral Audio (voxtral-mini-latest)**: Powers the chaotic-environment input. Technicians can dictate symptoms via voice recording, and Voxtral instantly transcribes the clinical context without a keyboard.
- 3. **Mistral Triage Agent (mistral-small-latest)**: An ultra-fast safety validator that intercepts Voxtral voice input and instantly rejects queries/symptoms that are unrelated to respiratory illnesses.
- 4. **Mistral RAG Reasoner (mistral-large-latest)**: Utilizes native tool-calling to query a **Qdrant Vector Database** containing indexed WHO TB Guidelines, generating an evidence-backed, structured Clinical Explainer Report.
- 5. **MedGemma Safety Guardrails**: Validates the end chatbot responses to ensure clinical safety compliance.
 
  ---
 
- ## 🔬 Mathematical Architecture (Beyond the LLM)
 
- To ensure true clinical safety, we built a bespoke Machine Learning pipeline beneath Mistral:
- * **PyTorch CNN Ensemble:** Combines DenseNet121, EfficientNet-B4, and ResNet50 trained across 6 global datasets (Shenzhen, Montgomery, etc.).
- * **Monte Carlo Dropout (Bayesian Uncertainty):** The neural network runs predictions 20 times per image. If it hallucinates or guesses, the mathematical variance flags the result as **"Unreliable — Review Required"**, overriding the base percentage.
- * **Grad-CAM Heatmaps:** Generates a topological color map precisely showing the physician exactly *where* the AI found the infection in the lung.
 
  ---
 
- ## 🛠️ How to Run Locally
 
- ### 1. Requirements
  Ensure you have Python 3.10+ installed.
  ```bash
- git clone https://github.com/your-username/TB-Guard-XAI.git
  cd TB-Guard-XAI
  python -m venv venv
  source venv/bin/activate # On Windows: venv\Scripts\activate
@@ -59,20 +88,27 @@ pip install -r requirements.txt
  ```
 
  ### 2. Environment Variables
- You need your Mistral API key to run the clinical reasoning layers. Create a `.env` file in the root directory:
  ```env
  MISTRAL_API_KEY=your_mistral_key_here
  ```
 
- ### 3. Run the Server
  ```bash
  python backend.py
  ```
- *Open your browser to `http://127.0.0.1:8000` to access the Clinical Dashboard.*
 
  ---
 
- ## 📑 Clinical Disclaimer
- **Not for self-diagnosis.** TB-Guard-XAI is an experimental clinical decision-support tool built for hackathon demonstration. It is designed to assist trained medical technicians as a primary triage filter. All positive results must be confirmed via Sputum Xpert MTB/RIF or culture tests in accordance with WHO guidelines.
 
- > Built with ❤️ for the Mistral AI Hackathon 2026.
  # 🫁 TB-Guard-XAI: Explainable AI Triage for Mass Tuberculosis Screening
 
+ **Built for the Mistral AI Worldwide Hackathon 2026**
 
+ > TB-Guard-XAI is an explainable, multimodal clinical triage engine that unites PyTorch deep learning with Mistral models. Bayesian uncertainty estimation flags when the AI is "guessing," while Grad-CAM heatmaps highlight the regions that drove the prediction. Mistral Vision adds a second opinion, Voxtral transcribes voice notes, and a RAG pipeline produces WHO-backed clinical reports that are validated by MedGemma.
 
+ [![Hugging Face Space](https://img.shields.io/badge/🤗_Space-Live_Demo-blue)](https://huggingface.co/spaces/mistral-hackaton-2026/TB-Guard-XAI)
+ [![Demo Video](https://img.shields.io/badge/🎬_Video-Watch_Pitch-red)](https://youtu.be/UyxZCp2q7TM)
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 
+ ![TB-Guard-XAI Dashboard](https://github.com/vignesh19032005/TB-Guard-XAI/blob/de74fe2548342ca0c66d0f3771885d07c112c042/TB-Guard-XAI.png)
 
  ---
 
+ ## 🚀 The Clinical Problem
+ Tuberculosis kills 1.3 million people annually, with 87% of cases occurring in low-resource settings. The WHO explicitly endorses AI-assisted Chest X-Ray (CXR) screening to bridge the massive gap in healthcare personnel.
 
+ **The Flaw in Current AI:** Many existing medical AI models are *"black boxes"*. They output a single softmax probability (e.g., "95% TB"), which is often **overconfident**. Shown an unfamiliar anomaly, such a model can still return a confident diagnosis because it has no mechanism for saying, *"I don't know."* It also gives no explanation of *why* it made the decision, which makes it unsafe for autonomous triage.
+
+ **Our Mission:** Build an AI system that knows *why* it made a decision, quantifies *when* it is out of its depth, and orchestrates the Mistral AI ecosystem to explain its reasoning the way a human doctor would.
 
  ---
 
+ ## 🧠 The Architecture & Tech Stack Justification
+ TB-Guard-XAI is not a simple wrapper around an LLM. It is a multi-agent pipeline that bridges deterministic Deep Learning with non-deterministic Generative AI.
+
+ ### 1. The Mistral AI Ecosystem (The Brains)
+ We use several of Mistral's latest models, assigning each a specialized agentic role:
+
+ * **👁️ Mistral Vision (`mistral-large-latest`): The Second Opinion.**
+   * *Why this?* Instead of relying solely on our PyTorch CNN, we pass the compressed X-Ray directly to Mistral Large. It acts as an independent reader, cross-checking the regions flagged by PyTorch and looking for contextual clues such as lymphadenopathy or cavitation.
+ * **🎙️ Voxtral Audio (`voxtral-mini-latest`): Acoustic Context.**
+   * *Why this?* Rural clinics are chaotic, and technicians don't have time to type. Voxtral ingests spoken symptoms ("Patient has night sweats") and transcribes them.
+ * **🛡️ Mistral Router (`mistral-small-latest`): The Safety Gatekeeper.**
+   * *Why this?* We use Mistral Small for low-latency, low-cost intent classification. It intercepts the transcribed voice notes. If a patient describes a broken ankle, Mistral Small blocks the query as outside the respiratory domain, keeping the tool within its clinical scope.
+ * **📚 Mistral RAG Reasoner (`mistral-large-latest`): Clinical Synthesis.**
+   * *Why this?* Mistral Large supports native tool-calling. It queries our Qdrant Vector Database (loaded with WHO TB Guidelines) and fuses the RAG evidence, Mistral Vision's visual assessment, and PyTorch's probabilities into a cohesive, structured Medical Report.
+ * **⚖️ MedGemma: End-of-Line Validation.**
+   * *Why this?* Used as a secondary open-weight safety validator to ensure the final generated advice does not state a definitive medical diagnosis, keeping the tool strictly "Decision Support."
+
+ ### 2. The Deep Learning Engine (The Eyes & The Math)
+ Beneath the LLMs lies a computer vision pipeline designed for explainability.
+
+ * **Convolutional Neural Network (CNN) Ensemble**
+   * *What is it?* A parallel architecture fusing DenseNet121, EfficientNet-B4, and ResNet50.
+   * *Why average them?* A single model inherits the biases of its training data. Ensembling three distinct architectures reduces the blind spots of any one of them. The models are also trained on 6 distinct global CXR datasets (Shenzhen, Montgomery, etc.) to improve generalization across populations and anatomy.
+ * **Bayesian Deep Learning: Monte Carlo (MC) Dropout**
+   * *What is it?* The crown jewel of our safety mechanism. Standard inference evaluates an image once. MC Dropout keeps dropout active at inference time and evaluates the same X-Ray **20 times**, randomly turning off ("dropping out") different neurons on each pass.
+   * *Why use it?* If the model is recognizing true TB features, the 20 predictions will be nearly identical (Low Variance). If the model is guessing on an anomalous image, the 20 predictions will disagree widely (High Variance). When high variance is detected, the system overrides the probability and flags **"Unreliable — Human Review Required,"** helping protect the clinic from false AI confidence.
+ * **Explainable AI: Grad-CAM (Gradient-weighted Class Activation Mapping)**
+   * *What is it?* An algorithm that uses the gradients of the "Tuberculosis" class score with respect to the last convolutional feature maps to find the image regions that most influenced the prediction.
+   * *Why use it?* It generates a heatmap overlaid on the X-Ray. Doctors don't have to trust the AI blindly; they can see what the AI is looking at.
+
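Annotation on the MC Dropout bullet above: the override rule can be sketched in a few lines of plain Python. This is a minimal sketch rather than the project's code. It assumes the 20 stochastic forward passes have already produced one TB probability each; `STD_THRESHOLD`, the 0.5 decision cutoff, and the two non-flagged labels are hypothetical values chosen for illustration, since the README does not state them.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Sequence

# Hypothetical cutoff: the README does not say how much spread counts as "high variance".
STD_THRESHOLD = 0.10

# Flag wording taken from the README (\u2014 is the dash used in that label).
UNRELIABLE_LABEL = "Unreliable \u2014 Human Review Required"


@dataclass
class TriageResult:
    mean_probability: float  # average TB probability over all stochastic passes
    std_dev: float           # spread of the passes, used as the uncertainty signal
    label: str               # final triage label after the uncertainty check


def triage_from_mc_passes(probs: Sequence[float],
                          std_threshold: float = STD_THRESHOLD) -> TriageResult:
    """Turn per-pass TB probabilities from MC Dropout into a triage label."""
    if len(probs) < 2:
        raise ValueError("need at least two stochastic passes to measure spread")
    mu = mean(probs)
    sigma = pstdev(probs)
    if sigma > std_threshold:
        # The passes disagree: override the probability instead of reporting it.
        label = UNRELIABLE_LABEL
    elif mu >= 0.5:  # hypothetical decision cutoff
        label = "TB suspected"
    else:
        label = "TB not suspected"
    return TriageResult(mu, sigma, label)


# Twenty passes that agree closely, then twenty that disagree.
stable = triage_from_mc_passes([0.91, 0.93, 0.92, 0.90] * 5)
noisy = triage_from_mc_passes([0.10, 0.90] * 10)
print(stable.label)
print(noisy.label)
```

In a PyTorch model the per-pass probabilities would typically come from running the forward pass repeatedly with the dropout layers left in training mode; that part is omitted here so the sketch stays dependency-free.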
+ ### 3. The Infrastructure Pipeline
+ * **FastAPI (Backend):** Chosen over Flask/Django for its asynchronous request handling, which matters when PyTorch inference, Mistral tool-calling, and audio processing run concurrently.
+ * **Qdrant (Vector Database):** Chosen over Pinecone/Milvus for its straightforward local deployment and fast dense vector search, which serves our WHO RAG context.
+ * **Vanilla HTML/JS + Tailwind (Frontend):** We deliberately avoided heavy React/Next.js frameworks so the UI can run on low-end, low-RAM hospital registry computers without dependency bloat.
 
+ ---
 
+ ## 💡 Key Features at a Glance
+ * **Drag-and-Drop X-Ray Analysis** with Native Bayesian Uncertainty bounds.
+ * **Mistral Vision Multimodal Verification** natively embedded in the UI.
+ * **Voice-Activated Clinical Context** powered by Voxtral.
+ * **Grad-CAM Heatmap Visualizations.**
+ * **Built-in AI Respiratory Chatbot.**
+ * **One-Click Printable PDF Triage Reports** for lab handover.
 
  ---
 
+ ## 🌐 Live Deployment
+ TB-Guard-XAI is packaged and deployed on **Hugging Face Spaces**. You can run the live demo, upload X-rays, record voice notes, and test clinical queries directly via the cloud.
 
+ 🔗 **[Launch TB-Guard-XAI on Hugging Face](https://huggingface.co/spaces/mistral-hackaton-2026/TB-Guard-XAI)**
 
  ---
 
+ ## 🛠️ Run It Locally
 
+ ### 1. Setup & Install
  Ensure you have Python 3.10+ installed.
  ```bash
+ git clone https://github.com/vignesh19032005/TB-Guard-XAI.git
  cd TB-Guard-XAI
  python -m venv venv
  source venv/bin/activate # On Windows: venv\Scripts\activate
@@ -59,20 +88,27 @@ pip install -r requirements.txt
  ```
 
  ### 2. Environment Variables
+ You need a Mistral API key to run the inference pipelines. Create a `.env` file in the root directory:
  ```env
  MISTRAL_API_KEY=your_mistral_key_here
  ```
 
+ ### 3. Run the Local FastAPI Server
  ```bash
  python backend.py
  ```
+ *Open your browser to `http://127.0.0.1:8000` to access the full UI.*
+
+ ---
+
+ ## 🔮 What's Next for TB-Guard-XAI?
+ - **Automated PACS Watcher:** We are building an offline background folder-watcher to automatically ingest and triage batches of X-Rays exported to hospital local drives.
+ - **Continuous Learning Loop:** Implementing human-in-the-loop validation where physicians can correct Mistral via the UI, feeding the verified data back into the underlying ensemble.
+ - **DICOM Support:** Moving from PNG parsing to native DICOM image support, with HL7 integration, for interoperability with hospital systems.
 
  ---
 
+ ### 📑 Clinical Disclaimer
+ **Not for self-diagnosis.** TB-Guard-XAI is an experimental clinical decision-support tool built specifically for the **Mistral AI Worldwide Hackathon 2026** demonstration. It is designed to assist trained medical technicians as a primary triage filter. All positive or uncertain results must be followed by confirmatory sputum Xpert MTB/RIF or culture tests in accordance with local and WHO guidelines.
 
+ > *Built with ❤️ for Mistral AI. Code by Vignesh.*
TB-Guard-XAI.png ADDED

Git LFS Details

  • SHA256: 72daf71c48f53b932bec67ffdb4ea1f49261de6edec4a60be964316303b828c9
  • Pointer size: 131 Bytes
  • Size of remote file: 369 kB
qdrant_db/.lock ADDED
@@ -0,0 +1 @@
+ tmp lock file
qdrant_db/collection/tb_medical_knowledge/storage.sqlite ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5b21de16e725482e2ebacdae28a487df38a4ad66f63a8ce2baa7e083ebbc12e0
+ size 8118272
qdrant_db/meta.json ADDED
@@ -0,0 +1 @@
+ {"collections": {"tb_medical_knowledge": {"vectors": {"size": 1024, "distance": "Cosine", "hnsw_config": null, "quantization_config": null, "on_disk": null, "datatype": null, "multivector_config": null}, "shard_number": null, "sharding_method": null, "replication_factor": null, "write_consistency_factor": null, "on_disk_payload": null, "hnsw_config": null, "wal_config": null, "optimizers_config": null, "quantization_config": null, "sparse_vectors": null, "strict_mode_config": null, "metadata": null}}, "aliases": {}}
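Annotation on `qdrant_db/meta.json`: the file declares one collection, `tb_medical_knowledge`, holding 1024-dimensional vectors compared by cosine distance. The sketch below, which uses only the standard library, reads those two settings from a trimmed copy of the JSON and shows what a cosine comparison computes. The helper names `vector_config` and `cosine_similarity` are hypothetical and are not part of Qdrant or this repository, and the commit does not say which embedding model produced the stored vectors.

```python
import json
import math
from typing import Sequence, Tuple

# Trimmed copy of the collection entry from qdrant_db/meta.json in this commit.
META_JSON = (
    '{"collections": {"tb_medical_knowledge": '
    '{"vectors": {"size": 1024, "distance": "Cosine"}}}, "aliases": {}}'
)


def vector_config(meta_text: str, collection: str) -> Tuple[int, str]:
    """Return (vector size, distance metric) declared for a collection."""
    vectors = json.loads(meta_text)["collections"][collection]["vectors"]
    return vectors["size"], vectors["distance"]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


size, distance = vector_config(META_JSON, "tb_medical_knowledge")
print(size, distance)
```

Any query embedding sent to this collection has to have the same dimension as `size`; a mismatch between the embedding model and the stored collection is a common cause of search failures after re-indexing.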