| # Medical Q&A Bot - System Architecture | |
| ## Visual Overview | |
| ``` | |
| ┌─────────────────────────────────────────────────────────────────┐ | |
| │                         USER INTERFACE                          │ | |
| │                                                                 │ | |
| │   ┌──────────────────────┐           ┌──────────────────────┐   │ | |
| │   │    Gradio Web UI     │           │   Streamlit Web UI   │   │ | |
| │   │       (app.py)       │    OR     │  (app_streamlit.py)  │   │ | |
| │   │      Port: 7860      │           │      Port: 8501      │   │ | |
| │   └──────────┬───────────┘           └──────────┬───────────┘   │ | |
| └──────────────┼──────────────────────────────────┼───────────────┘ | |
|                │                                  │ | |
|                └────────────────┬─────────────────┘ | |
|                                 │ | |
|                                 ▼ | |
|                 ┌────────────────────────────────┐ | |
|                 │     Query Processing Layer     │ | |
|                 │                                │ | |
|                 │  1. Text Input Validation      │ | |
|                 │  2. Embedding Generation       │ | |
|                 │  3. Model Inference            │ | |
|                 └───────────────┬────────────────┘ | |
|                                 │ | |
|                                 ▼ | |
|                 ┌────────────────────────────────┐ | |
|                 │       CLASSIFIER MODULE        │ | |
|                 │         (classifier/)          │ | |
|                 │                                │ | |
|                 │  ┌──────────────────────────┐  │ | |
|                 │  │   SentenceTransformer    │  │ | |
|                 │  │     Embedding Model      │  │ | |
|                 │  └────────────┬─────────────┘  │ | |
|                 │               │                │ | |
|                 │               ▼                │ | |
|                 │  ┌──────────────────────────┐  │ | |
|                 │  │   Classification Head    │  │ | |
|                 │  │     (Neural Network)     │  │ | |
|                 │  └────────────┬─────────────┘  │ | |
|                 └───────────────┼────────────────┘ | |
|                                 │ | |
|                      ┌──────────┴──────────┐ | |
|                      │                     │ | |
|             ┌────────▼────────┐   ┌────────▼────────┐ | |
|             │     MEDICAL     │   │ ADMINISTRATIVE  │ | |
|             │      QUERY      │   │      QUERY      │ | |
|             └────────┬────────┘   └────────┬────────┘ | |
|                      │                     │ | |
|                      │                     └──► End (No Retrieval) | |
|                      │ | |
|                      ▼ | |
|      ┌───────────────────────────────┐ | |
|      │       RETRIEVAL MODULE        │ | |
|      │         (retriever/)          │ | |
|      │                               │ | |
|      │  ┌─────────────────────────┐  │ | |
|      │  │       BM25 Search       │  │ | |
|      │  │   (Sparse Retrieval)    │  │ | |
|      │  └────────────┬────────────┘  │ | |
|      │               │               │ | |
|      │  ┌────────────▼────────────┐  │ | |
|      │  │      Dense Search       │  │ | |
|      │  │   (Vector Similarity)   │  │ | |
|      │  └────────────┬────────────┘  │ | |
|      │               │               │ | |
|      │  ┌────────────▼────────────┐  │ | |
|      │  │       RRF Fusion        │  │ | |
|      │  │   (Rank Combination)    │  │ | |
|      │  └────────────┬────────────┘  │ | |
|      │               │               │ | |
|      │  ┌────────────▼────────────┐  │ | |
|      │  │    Optional Reranker    │  │ | |
|      │  │     (Cross-Encoder)     │  │ | |
|      │  └────────────┬────────────┘  │ | |
|      └───────────────┼───────────────┘ | |
|                      │ | |
|                      ▼ | |
|          ┌───────────────────────┐ | |
|          │     DATA SOURCES      │ | |
|          │                       │ | |
|          │  • PubMed Articles    │ | |
|          │  • Miriad Q&A         │ | |
|          │  • UniDoc Q&A         │ | |
|          │                       │ | |
|          │    (data/corpora/)    │ | |
|          └───────────┬───────────┘ | |
|                      │ | |
|                      ▼ | |
|          ┌───────────────────────┐ | |
|          │        RESULTS        │ | |
|          │                       │ | |
|          │  • Document Title     │ | |
|          │  • Text Content       │ | |
|          │  • Relevance Scores   │ | |
|          │  • Metadata           │ | |
|          └───────────┬───────────┘ | |
|                      │ | |
|                      ▼ | |
|          ┌───────────────────────┐ | |
|          │      UI DISPLAY       │ | |
|          │                       │ | |
|          │  • Formatted Cards    │ | |
|          │  • JSON View          │ | |
|          │  • Score Badges       │ | |
|          └───────────────────────┘ | |
| ``` | |
| ## Data Flow | |
| ### 1. User Input | |
| ``` | |
| User Types Query → Web Interface Captures Input → Sends to Backend | |
| ``` | |
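The overview lists text input validation as the first processing step. A minimal sketch of what such a check could look like is below. The function name `validate_query` and the 512-character limit are assumptions made for illustration; they are not taken from `app.py`.

```python
MAX_QUERY_CHARS = 512  # hypothetical limit, chosen only for this sketch


def validate_query(raw: str) -> str:
    """Return a cleaned query, or raise if the input is unusable."""
    if not isinstance(raw, str):
        raise TypeError("query must be a string")
    # Collapse runs of whitespace (including newlines) into single spaces.
    cleaned = " ".join(raw.split())
    if not cleaned:
        raise ValueError("query is empty")
    if len(cleaned) > MAX_QUERY_CHARS:
        raise ValueError(f"query exceeds {MAX_QUERY_CHARS} characters")
    return cleaned


cleaned_query = validate_query("  chest   pain\nafter exercise ")
```

Rejecting bad input before embedding keeps the more expensive model calls from running on empty or oversized text.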
| ### 2. Classification Phase | |
| ``` | |
| Query Text | |
| ↓ | |
| Sentence Transformer (Embedding) | |
| ↓ | |
| Classification Head (Neural Network) | |
| ↓ | |
| Output: [Medical | Administrative | Other] + Confidence Scores | |
| ``` | |
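To make the last step of this flow concrete, here is a toy sketch of a classification head: one linear layer followed by a softmax, which yields a confidence score per category. The 4-dimensional embedding and the hand-picked weights are hypothetical stand-ins; the real head in `classifier/head.py` is a trained network whose shape is not described here.

```python
import math
from typing import Dict, List, Sequence

LABELS = ["medical", "administrative", "other"]  # categories from the flow above


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(embedding: Sequence[float],
             weights: Sequence[Sequence[float]],
             bias: Sequence[float]) -> Dict[str, float]:
    """Apply one linear layer (one weight row per label), then softmax."""
    logits = [
        sum(w * x for w, x in zip(row, embedding)) + b
        for row, b in zip(weights, bias)
    ]
    return dict(zip(LABELS, softmax(logits)))


# Hypothetical embedding and weights, for illustration only.
toy_embedding = [0.9, 0.1, 0.0, 0.2]
toy_weights = [
    [2.0, 0.0, 0.0, 0.0],  # medical
    [0.0, 2.0, 0.0, 0.0],  # administrative
    [0.0, 0.0, 2.0, 0.0],  # other
]
toy_bias = [0.0, 0.0, 0.0]

scores = classify(toy_embedding, toy_weights, toy_bias)
predicted = max(scores, key=scores.get)
```

The predicted label is the category with the highest probability, and the full `scores` mapping corresponds to the per-category confidence scores.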
| ### 3. Retrieval Phase (Medical Queries Only) | |
| ``` | |
| Medical Query | |
| ↓ | |
| ┌────────────────────────┐ | |
| │   Parallel Retrieval   │ | |
| │  ┌──────────────────┐  │ | |
| │  │  BM25 (Sparse)   │  │ → Top 100 docs | |
| │  └──────────────────┘  │ | |
| │  ┌──────────────────┐  │ | |
| │  │  Dense (Vector)  │  │ → Top 100 docs | |
| │  └──────────────────┘  │ | |
| └────────────────────────┘ | |
| ↓ | |
| RRF Fusion Algorithm | |
| ↓ | |
| Top K Candidates | |
| ↓ | |
| Optional: Cross-Encoder Reranking | |
| ↓ | |
| Final Top N Results | |
| ``` | |
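The fusion step can be sketched with the standard Reciprocal Rank Fusion formula, where each document scores the sum of `1 / (k + rank)` over the rankings it appears in. This is a generic sketch, not the code in `retriever/rrf.py`: the function name `rrf_fuse`, the constant `k = 60` (a commonly used default), and the tie-breaking rule are assumptions.

```python
from collections import defaultdict
from typing import Dict, List


def rrf_fuse(rankings: List[List[str]], k: int = 60, top_k: int = 10) -> List[str]:
    """Fuse several ranked lists of document ids with Reciprocal Rank Fusion."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the best hit in each list contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, highest first; break ties by id for determinism.
    ordered = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    return ordered[:top_k]


# Hypothetical document ids standing in for BM25 and dense results.
bm25_hits = ["d1", "d2", "d3"]
dense_hits = ["d3", "d1", "d4"]
fused = rrf_fuse([bm25_hits, dense_hits], top_k=3)
```

Because RRF uses only ranks, the BM25 and dense scores never need to be normalized against each other, which is the usual reason for choosing it over a weighted score sum.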
| ## Technology Stack | |
| ### Frontend | |
| - **Gradio** - Primary UI framework | |
| - **Streamlit** - Alternative UI framework | |
| - **HTML/CSS** - Custom styling | |
| - **JavaScript** - Auto-generated by frameworks | |
| ### Backend | |
| - **Python 3.8+** - Core language | |
| - **PyTorch** - Deep learning framework | |
| - **Sentence-Transformers** - Embedding models | |
| - **scikit-learn** - ML utilities | |
| ### Search & Retrieval | |
| - **Rank-BM25** - Sparse retrieval | |
| - **FAISS** - Dense vector search | |
| - **Custom RRF** - Rank fusion | |
| - **Cross-Encoder** - Optional reranking | |
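As an illustration of what dense vector search computes, the sketch below ranks documents by cosine similarity with a brute-force scan. It is a pure-Python stand-in under stated assumptions: the project uses FAISS for this step, and the vectors, ids, and helper names here are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def dense_top_k(query_vec: Sequence[float],
                doc_vecs: Dict[str, Sequence[float]],
                k: int) -> List[Tuple[str, float]]:
    """Score every document against the query and keep the k most similar."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 2-dimensional vectors; real embeddings have hundreds of dimensions.
toy_docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
top = dense_top_k([1.0, 0.0], toy_docs, k=2)
```

A vector index such as FAISS returns the same kind of nearest-neighbour result without scanning every document, which matters once the corpus grows.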
| ### Data | |
| - **PubMed** - Medical research articles | |
| - **Miriad** - Medical Q&A database | |
| - **UniDoc** - Unified document corpus | |
| - **JSONL** - Data storage format | |
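JSONL stores one JSON object per line, so a corpus can be streamed record by record instead of being loaded whole. A minimal reader might look like the sketch below; the field names (`id`, `title`, `text`) are assumptions for illustration, since the actual corpus schema is not described here.

```python
import io
import json
from typing import Dict, Iterator, TextIO


def iter_jsonl(handle: TextIO) -> Iterator[Dict[str, object]]:
    """Yield one parsed record per non-empty line of a JSONL stream."""
    for line_number, line in enumerate(handle, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_number}: invalid JSON") from exc


# An in-memory sample with hypothetical fields, so the sketch needs no files.
sample = io.StringIO(
    '{"id": "doc-1", "title": "Example A", "text": "First body"}\n'
    "\n"
    '{"id": "doc-2", "title": "Example B", "text": "Second body"}\n'
)
records = list(iter_jsonl(sample))
```

For a real corpus, the same function would take an open file, for example `open(path, encoding="utf-8")`.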
| ## Component Interactions | |
| ### 1. Initialization | |
| ```python | |
| # Load models once at startup | |
| embedding_model, classifier = classifier_init() | |
| ``` | |
| ### 2. Classification | |
| ```python | |
| classification = predict_query( | |
| text=[query], | |
| embedding_model=embedding_model, | |
| classifier_head=classifier | |
| ) | |
| ``` | |
| ### 3. Retrieval | |
| ```python | |
| hits = get_candidates( | |
| query=query, | |
| k_retrieve=10, | |
| use_reranker=False | |
| ) | |
| ``` | |
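The classification and retrieval calls are connected by a routing decision: only medical queries go on to retrieval. The sketch below shows that branch in isolation. `route_query`, the stand-in callables, and the returned dictionary shape are hypothetical; the real return types of `predict_query` and `get_candidates` are not shown in this document.

```python
from typing import Callable, Dict, List


def route_query(
    query: str,
    classify: Callable[[str], str],
    retrieve: Callable[[str], List[dict]],
) -> Dict[str, object]:
    """Classify the query and run retrieval only for the medical label."""
    label = classify(query)
    if label != "medical":
        # Administrative (or other) queries end here, with no retrieval.
        return {"label": label, "hits": []}
    return {"label": label, "hits": retrieve(query)}


# Stand-in callables so the sketch runs without any models loaded.
def fake_classify(query: str) -> str:
    return "medical" if "symptom" in query.lower() else "administrative"


def fake_retrieve(query: str) -> List[dict]:
    return [{"title": "Example document", "score": 1.0}]


medical = route_query("What are the symptoms of flu?", fake_classify, fake_retrieve)
admin = route_query("How do I reschedule my appointment?", fake_classify, fake_retrieve)
```

Passing the classifier and retriever in as callables keeps the routing logic testable without loading the embedding model or the indexes.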
| ### 4. Display | |
| ```python | |
| # Gradio displays results in tabs | |
| # - Formatted HTML view | |
| # - Raw JSON view | |
| ``` | |
| ## Performance Characteristics | |
| ### Speed | |
| - **Classification**: ~100-500ms | |
| - **BM25 Search**: ~50-200ms | |
| - **Dense Search**: ~100-300ms | |
| - **Reranking**: ~500-2000ms (if enabled) | |
| ### Accuracy | |
| - **Classification**: ~95% accuracy | |
| - **Retrieval**: Depends on corpus and query | |
| - **Reranking**: +5-10% improvement | |
| ### Resource Usage | |
| - **Memory**: ~2-4 GB (with models loaded) | |
| - **CPU**: Moderate during inference | |
| - **GPU**: Optional (speeds up inference) | |
| ## Scalability Considerations | |
| ### Current Setup (Single User) | |
| - ✅ Well suited to demos and development | |
| - ✅ Low latency | |
| - ✅ Easy to debug | |
| ### Future Scaling Options | |
| - Add caching for common queries | |
| - Deploy on cloud with autoscaling | |
| - Use model quantization for faster inference | |
| - Implement request queuing | |
| - Add load balancing | |
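Of these options, caching is the simplest to prototype. The sketch below memoizes results per normalized query with `functools.lru_cache`. The `slow_search` function is a hypothetical stand-in for the classification and retrieval pipeline, and the cache size of 1024 is an arbitrary choice.

```python
from functools import lru_cache
from typing import Tuple

CALLS = {"count": 0}  # counts how often the expensive path actually runs


def slow_search(query: str) -> Tuple[str, ...]:
    """Hypothetical stand-in for the expensive classify-and-retrieve path."""
    CALLS["count"] += 1
    return (f"result for {query}",)


def normalize(query: str) -> str:
    """Lower-case and collapse whitespace so trivial variants share a cache key."""
    return " ".join(query.lower().split())


@lru_cache(maxsize=1024)
def _cached_search(normalized_query: str) -> Tuple[str, ...]:
    return slow_search(normalized_query)


def cached_search(query: str) -> Tuple[str, ...]:
    return _cached_search(normalize(query))


first = cached_search("What causes  migraines?")
second = cached_search("what causes migraines?")  # served from the cache
```

Results are returned as tuples because `lru_cache` hands every caller the same object; a mutable list could be modified by one caller and corrupt the cached value for the next.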
| ## Security & Privacy | |
| ### Current Implementation | |
| - Local hosting only | |
| - No data persistence | |
| - No user tracking | |
| - No authentication by default (it can be added optionally) | |
| ### Production Considerations | |
| - Add user authentication | |
| - Implement rate limiting | |
| - Sanitize inputs | |
| - Log access for auditing | |
| - HTTPS for encrypted communication | |
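Rate limiting, one of the items above, can be sketched as a sliding-window counter per client. This is an in-memory illustration under stated assumptions: the class name, the limits, and the idea of keying by a client id are all hypothetical, and a multi-process deployment would need shared storage instead of a local dictionary.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        now = self._clock()
        timestamps = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while timestamps and now - timestamps[0] >= self._window:
            timestamps.popleft()
        if len(timestamps) >= self._max:
            return False
        timestamps.append(now)
        return True


# Drive the limiter with a fake clock instead of real time.
fake_now = [0.0]
limiter = SlidingWindowLimiter(2, 60.0, clock=lambda: fake_now[0])
decisions = [limiter.allow("client-a") for _ in range(3)]
fake_now[0] = 61.0
after_window = limiter.allow("client-a")
```

The third request inside the window is refused, and the client is allowed again once the earlier requests have aged out.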
| ## Monitoring & Debugging | |
| ### Available Information | |
| - Query classification results | |
| - Confidence scores per category | |
| - Retrieval scores (BM25, Dense, RRF) | |
| - Document metadata | |
| - Error messages | |
| ### Debug Mode | |
| ```python | |
| # In app.py, set: | |
| demo.launch(show_error=True) # Shows detailed errors | |
| ``` | |
| ## Deployment Options | |
| ### 1. Local (Current) | |
| ``` | |
| Pros: Easy, fast, secure | |
| Cons: Single user, not accessible remotely | |
| ``` | |
| ### 2. Hugging Face Spaces | |
| ``` | |
| Pros: Free, easy deploy, public URL | |
| Cons: Limited resources, public access | |
| ``` | |
| ### 3. Cloud (AWS/GCP/Azure) | |
| ``` | |
| Pros: Scalable, private, customizable | |
| Cons: Costs money, requires setup | |
| ``` | |
| ### 4. Docker Container | |
| ``` | |
| Pros: Portable, consistent environment | |
| Cons: Requires Docker knowledge | |
| ``` | |
| ## File Structure | |
| ``` | |
| health-query-classifier/ | |
| ├── UI Layer | |
| │   ├── app.py                    # Main Gradio UI | |
| │   ├── app_streamlit.py          # Alternative Streamlit UI | |
| │   ├── launch_ui.bat             # Windows launcher | |
| │   └── launch_ui.ps1             # PowerShell launcher | |
| │ | |
| ├── Classifier Layer | |
| │   └── classifier/ | |
| │       ├── infer.py              # Inference logic | |
| │       ├── head.py               # Classification head | |
| │       ├── train.py              # Training script | |
| │       └── utils.py              # Utilities | |
| │ | |
| ├── Retrieval Layer | |
| │   └── retriever/ | |
| │       ├── search.py             # Search interface | |
| │       ├── index_bm25.py         # BM25 indexing | |
| │       ├── index_dense.py        # Dense indexing | |
| │       └── rrf.py                # Rank fusion | |
| │ | |
| ├── Team Layer | |
| │   └── team/ | |
| │       ├── candidates.py         # Candidate retrieval | |
| │       └── interfaces.py         # Data interfaces | |
| │ | |
| ├── Data Layer | |
| │   └── data/ | |
| │       └── corpora/              # Corpus files | |
| │           ├── medical_qa.jsonl | |
| │           ├── miriad_text.jsonl | |
| │           └── unidoc_qa.jsonl | |
| │ | |
| └── Documentation | |
|     ├── README.md                 # Main documentation | |
|     ├── QUICKSTART.md             # Quick start guide | |
|     ├── UI_README.md              # UI documentation | |
|     ├── UI_IMPLEMENTATION.md      # Implementation details | |
|     └── ARCHITECTURE.md           # This file | |
| ``` | |
| --- | |
| This architecture aims for: | |
| - ✅ Clean separation of concerns | |
| - ✅ Modular design | |
| - ✅ Easy testing and debugging | |
| - ✅ Maintainability, with room to scale | |
| - ✅ Thorough documentation | |