Summary: Available UI Options for Medical Q&A Bot

🎯 Three Versions Created

1. app_demo.py ⚡ RECOMMENDED FOR DEMOS

Port: 7863
Speed: Instant (<1 second)
Features:

  • ✅ Real-time classification (medical vs administrative)
  • ✅ Confidence scores with visualization
  • ✅ Action recommendations
  • ✅ Uses your group's trained models
  • ❌ No document retrieval (for speed)
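As a rough illustration of how a confidence score could be turned into a visual bar and an action recommendation, here is a minimal sketch. The function names, the 0.75 threshold, and the routing messages are all hypothetical; app_demo.py may handle this differently.

```python
def recommend_action(label, confidence, threshold=0.75):
    """Map a predicted label and its confidence to a suggested next step.

    The threshold and the messages are illustrative assumptions,
    not values taken from the project.
    """
    if confidence < threshold:
        return "Low confidence: send to a human for review"
    if label == "medical":
        return "Route to clinical staff"
    return "Route to front-desk staff"


def confidence_bar(confidence, width=20):
    """Render a confidence value in [0, 1] as a simple text bar."""
    filled = round(confidence * width)
    return "#" * filled + "-" * (width - filled) + f" {confidence:.0%}"


print(confidence_bar(0.92), recommend_action("medical", 0.92))
```

Sending low-confidence predictions to a person instead of trusting the label is one common design choice for triage-style classifiers.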

Best for:

  • Class presentations
  • Quick demonstrations
  • Testing classification accuracy
  • When time is limited

Run: python app_demo.py


2. app_full.py 🔬 COMPLETE SYSTEM

Port: 7864
Speed: 6-10 minutes for the first medical query, 2-5 seconds for subsequent queries
Features:

  • ✅ Real-time classification
  • ✅ Full document retrieval from PubMed & Miriad
  • ✅ BM25 + Dense search + RRF fusion
  • ✅ Optional cross-encoder reranking
  • ⚠️ Very slow first initialization

Best for:

  • Showing full system capabilities
  • When you have 10+ minutes to wait
  • Detailed technical demonstrations
  • Proving retrieval works

Run: python app_full.py

⚠️ WARNING: The first medical query takes 6-10 minutes because the app:

  • Loads roughly 200 MB or more of medical corpus data
  • Builds BM25 keyword index
  • Generates embeddings for ALL documents (this is the slow part)
  • Builds FAISS vector index
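The pattern described above, where the first query pays the full setup cost and later queries reuse the result, is usually called lazy initialization. The sketch below shows the idea with a stand-in index (a plain list of documents and a substring match). The class name, the `build_index` callable, and the matching logic are assumptions for illustration only, not the code in app_full.py.

```python
class LazyRetriever:
    """Builds its index on the first search, then reuses it."""

    def __init__(self, build_index):
        # build_index is an expensive callable (load corpus, embed, index);
        # it runs at most once, on the first call to search().
        self._build_index = build_index
        self._index = None
        self.build_count = 0

    def search(self, query):
        if self._index is None:
            # Slow path: only the first query reaches this branch.
            self._index = self._build_index()
            self.build_count += 1
        # Fast path: a toy keyword match standing in for real retrieval.
        words = query.lower().split()
        return [doc for doc in self._index
                if any(word in doc.lower() for word in words)]


def build_toy_index():
    # Stand-in for the real loading, embedding, and indexing work.
    return ["Aspirin reduces fever", "Clinic billing hours"]


retriever = LazyRetriever(build_toy_index)
print(retriever.search("fever"))    # triggers the one-time build
print(retriever.search("billing"))  # reuses the existing index
```

This is why making one medical query ahead of time is enough to keep every later query fast, as long as the process keeps running.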

3. app.py / app_safe.py / app_lightweight.py 🔧 EXPERIMENTAL

These are intermediate versions created while troubleshooting and are not recommended for use.


🎬 Recommendation for Your Group Presentation

Strategy 1: Fast Demo (5 minutes)

Use app_demo.py only:

  1. Show classification working instantly
  2. Test medical vs administrative queries
  3. Highlight confidence scores
  4. Explain that retrieval is available but disabled for demo speed
  5. Show the codebase that supports retrieval (team/candidates.py)

Advantage: Reliable, professional, no waiting


Strategy 2: Split Demo (15+ minutes)

Use BOTH versions:

Part 1: Use app_demo.py for quick classification demos (5 min)

  • Show multiple queries rapidly
  • Demonstrate accuracy

Part 2: Switch to an app_full.py instance that you pre-initialized (10 min)

  • Before the presentation: Run app_full.py and make ONE medical query to initialize it
  • Wait 6-10 minutes for it to build the indexes
  • Keep it running
  • During the presentation: Show actual document retrieval working fast (2-5 seconds per query)

Advantage: Shows both speed AND capabilities


Strategy 3: Video Backup

  1. Use app_demo.py for live demo
  2. Record a video of app_full.py working with retrieval
  3. Show video during presentation if needed

📊 Technical Details to Mention

Your Group's Implementation:

  • Classification Model: Fine-tuned sentence-transformers model (embeddinggemma-300m-medical)
  • Hybrid Retrieval: BM25 (sparse) + Dense embeddings (semantic)
  • Fusion Algorithm: Reciprocal Rank Fusion (RRF)
  • Data Sources: PubMed Medical Q&A + Miriad corpus
  • Optional Enhancement: Cross-encoder reranker for accuracy
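If someone asks how Reciprocal Rank Fusion combines the BM25 and dense rankings, the following minimal sketch may help. Each document's fused score is the sum of 1 / (k + rank) over every ranked list it appears in. The function name, the example document IDs, and k = 60 (a commonly used default) are assumptions here; the values used in team/candidates.py may differ.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list, best first.

    rankings: list of lists, each ordered from most to least relevant.
    k: smoothing constant; 60 is a common default (an assumption here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs from the two retrievers
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
dense_ranking = ["doc_b", "doc_d", "doc_a"]

print(reciprocal_rank_fusion([bm25_ranking, dense_ranking]))
# → ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Because RRF uses only rank positions, it needs no score normalization between the keyword and embedding retrievers, which is one reason it is popular for hybrid search.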

Why Retrieval is Slow:

  • Real ML systems need to index large datasets
  • Your corpus has thousands of medical documents
  • CPU-only inference (no GPU acceleration available)
  • This is a REAL implementation, not a toy demo

Production Solutions:

  • Pre-build and save indexes (don't rebuild each time)
  • Use GPU for faster embedding
  • Implement caching
  • Deploy on cloud with more resources
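The first suggestion, pre-building and saving indexes, can be sketched with the standard library alone. The helper names, the cache file naming scheme, and the use of pickle are illustrative assumptions; a FAISS index would more typically be saved with `faiss.write_index` and reloaded with `faiss.read_index`.

```python
import hashlib
import pickle
from pathlib import Path


def corpus_fingerprint(docs):
    """Short hash of the corpus so a changed corpus gets a fresh cache file."""
    digest = hashlib.sha256()
    for doc in docs:
        digest.update(doc.encode("utf-8"))
        digest.update(b"\x00")  # separator so ["ab", "c"] != ["a", "bc"]
    return digest.hexdigest()[:16]


def load_or_build_index(docs, build_index, cache_dir):
    """Return a cached index if one exists for this corpus, else build and save it."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file = cache_dir / f"index_{corpus_fingerprint(docs)}.pkl"
    if cache_file.exists():
        with cache_file.open("rb") as f:
            return pickle.load(f)  # fast path: no rebuild
    index = build_index(docs)      # slow path: runs once per corpus version
    with cache_file.open("wb") as f:
        pickle.dump(index, f)
    return index
```

With something like this in place, only the very first run after a corpus change pays the full indexing cost. Only load pickle files that your own code wrote, since unpickling untrusted data is unsafe.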

💡 Demo Script Suggestions

Opening (30 seconds):

"We built an AI system that automatically classifies patient queries and retrieves relevant medical research. Let me show you how it works..."

Classification Demo (2-3 minutes):

"First, our classification system determines if a query is medical or administrative..." [Use app_demo.py, try 3-4 different queries]

Technical Explanation (2 minutes):

"Under the hood, we use:

  • A fine-tuned 300-million-parameter transformer model
  • Hybrid search combining keyword matching and semantic similarity
  • Reciprocal Rank Fusion to combine results
  • Medical corpora from PubMed and Miriad databases"

Show Retrieval (Optional, if pre-initialized):

"Now let me show you actual document retrieval..." [Use app_full.py if you pre-initialized it]

Closing (30 seconds):

"This demonstrates how AI can improve healthcare triage, reduce response times, and provide evidence-based information to both patients and providers."


🚀 Quick Start Commands

# For demos and presentations
python app_demo.py
# Access at: http://127.0.0.1:7863

# For the full system (the first medical query takes 6-10 minutes)
python app_full.py
# Access at: http://127.0.0.1:7864

✅ What You Successfully Built

  1. ✅ Working web UI with professional design
  2. ✅ Real-time classification using your trained model
  3. ✅ Full retrieval system integrated
  4. ✅ Two versions: fast demo + complete system
  5. ✅ Comprehensive documentation
  6. ✅ Example queries
  7. ✅ Clear visualization of results

You have everything you need for a successful presentation!


🎯 Final Recommendation

For your presentation, use app_demo.py

It shows your ML work instantly and professionally. You can explain:

  • "The classification happens in real-time"
  • "The full system includes retrieval which we can show separately"
  • "This demonstrates the core AI capability"

If anyone asks about retrieval, you can:

  • Show the code in team/candidates.py
  • Explain the hybrid search architecture
  • Mention that it is fully implemented but slow on the first query because of index building

This is the smart approach for a live demo!