sniro23 committed on
Commit 33cb5fa · Parent(s): 8bbc7a9

feat: Enable Gradio API for frontend connectivity
docs/scratchpad.md CHANGED
@@ -1,65 +1,213 @@
- # Scratchpad
-
- This document is a running log of the high-level tasks and the current focus.
-
- ## Current Active Implementation Plan
-
- - **File:** `docs/implementation-plan/rag-quality-enhancement.md`
- - **Goal:** 🏥 **MEDICAL RAG ENHANCEMENT** - Enhanced medical context preparation + verification layers with medical-grade safety protocols
- - **Status:** ✅ **PHASE 1 COMPLETED** | ✅ **PHASE 2 COMPLETED SUCCESSFULLY** | 🚀 **READY FOR PHASE 3**
- - **Strategic Success**: Enhanced Medical RAG System with strict safety protocols now fully operational
- - **Phase 1 Results**:
-   - ✅ Clinical ModernBERT: 60.3% medical domain improvement, 768-dim embeddings
-   - ✅ Enhanced PDF Processing: Unstructured hi_res validated, clinical terminology preserved
-   - ✅ Llama3-70B via Groq API: Superior instruction following with medical-grade context adherence
-   - ✅ Resource Efficient: ~2GB local VRAM + proven medical safety protocols
- - **Phase 2 Results - COMPLETED SUCCESSFULLY**:
-   - ✅ **Task 2.1**: Enhanced Medical Context Preparation - Medical entity extraction operational (1-6 entities per document)
-   - ✅ **Task 2.2**: Medical Response Verification Layer - 100% source traceability and medical safety validation
-   - ✅ **Task 2.3**: Advanced Medical System Prompt - Clinical safety protocols active, vector compatibility resolved
-   - ✅ **Task 2.4**: Enhanced Medical Vector Store - Hybrid 384d + 768d Clinical ModernBERT architecture operational
- - **Integrated Medical RAG Performance**:
-   - ⚡ Processing Speed: 0.72-2.16s per query | 📚 5 enhanced documents per query | 🛡️ 100% SAFE responses
-   - 🔒 Medical Safety: 100% source traceability, comprehensive claim verification, strict context adherence
-   - 🏥 Clinical Enhancement: High medical similarity scores (0.7+), medical entity extraction, terminology enhancement
- - **Next Phase**: **PHASE 3 - Production Integration & Optimization**
- - **Next Action**: **PLANNER MODE** - Review Phase 2 achievements and plan Phase 3 production deployment strategy
-
- ---
-
- ## Completed Implementation Plans
-
- - `docs/implementation-plan/stable-deployment-plan.md`
- - `docs/implementation-plan/web-ui-for-chatbot.md`
- - `docs/implementation-plan/maternal-health-rag-chatbot-v3.md`
-
- ---
- ## Lessons Learned
- - **[2024-07-28]** The `groq` Python client can have issues with proxies when running in certain environments (like Hugging Face Spaces). The fix is to instantiate a separate `httpx.Client` and pass it to the `groq.Groq` constructor so it uses a clean, isolated network configuration.
- - **[2024-07-28]** When deploying Docker containers to services like Hugging Face Spaces, pay close attention to file ownership and permissions. The user running the application at runtime (`user`) may not be the same as the user that built the container (`root`). Ensure application directories, especially those used for caching (`HF_HOME`), are owned by the runtime user. Use `chown` in the Dockerfile to set permissions correctly.
- - **[2024-07-28]** Gradio's `gr.ChatInterface` expects the function to return a single string response. Returning a tuple or other data structure will cause a `ValidationError`.
- - **[2025-01-XX]** **Strategic Architecture Decision**: The AI engineer's resource-friendly approach (Mistral 7B + LoRA) proved superior to the large-model approach (Me-LLaMA) for infrastructure-constrained environments. Specialized small models with domain fine-tuning often outperform generic large models in specific domains.
- - **[2025-01-XX]** **Medical PDF Processing**: The Unstructured hi_res strategy is optimal for medical documents containing scanned PDFs, complex clinical tables, and multi-modal content. pdfplumber fails completely on scanned documents, making unstructured the only viable option for comprehensive medical document processing.
- - **[2025-01-XX]** **Medical Domain Embeddings**: Clinical ModernBERT provides significant advantages over general embeddings (BAAI/bge-large-en-v1.5) for medical concept representation, with an 8K context length (4x improvement) and clinical terminology understanding.
- - **[2025-01-XX]** **Resource Optimization**: Constraint-driven design often leads to better solutions. Working within 16GB VRAM limits forced optimization that resulted in a more maintainable, cost-effective, and deployable architecture than resource-intensive alternatives.
- - **[2025-01-XX]** **Phase 2 Medical Safety Architecture**: The hybrid approach combining enhanced medical context preparation + medical response verification + maintained Llama3-70B proved superior to model switching. This architecture achieves medical-grade safety (100% source traceability, comprehensive claim verification) while maintaining excellent performance (0.72-2.16s per query) and clinical enhancement (0.7+ similarity scores with Clinical ModernBERT).
- - **[2024-07-28]** For RAG, increasing the number of documents sent to the LLM (e.g., from 3 to 5) and using a very strict system prompt that forbids outside knowledge and mandates citations can significantly improve answer quality and reduce hallucinations.
- - **[2024-07-28]** Ensure document metadata is complete *during data creation*. If a `citation` field is missing, create a sensible default from the file path. This prevents "Unknown Source" issues downstream.
- - **[2024-07-28]** Enforce a strict, structured output format (e.g., using Markdown headings like `## Summary` and `## References`) via the system prompt to ensure consistent and professional-looking responses from the LLM.
- - **[2025-01-03]** Me-LLaMA models require a PhysioNet credentialed health data use agreement and substantial computational resources (24GB+ VRAM for the 13B model, 130GB+ for the 70B). No commercial API providers currently offer Me-LLaMA access. For medical domain enhancement, Clinical ModernBERT embeddings (8K context) + smaller medical LLMs like medicine-Llama3-8B provide a more practical alternative with significant medical domain improvement while remaining infrastructure-compatible.
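The citation-fallback lesson above lends itself to a small helper. A minimal sketch, assuming a dict-shaped metadata record; `default_citation` and the `citation` field name are illustrative, not the project's actual schema:

```python
from pathlib import Path

def default_citation(metadata: dict, source_path: str) -> str:
    """Return the record's citation, or derive a readable default from
    the file path so downstream display never shows "Unknown Source"
    (hypothetical helper for illustration)."""
    citation = metadata.get("citation")
    if citation:
        return citation
    # "guidelines/postpartum_haemorrhage_2021.pdf" -> "Postpartum Haemorrhage 2021"
    stem = Path(source_path).stem
    return stem.replace("_", " ").replace("-", " ").title()
```

Applying this at ingestion time, rather than patching "Unknown Source" at display time, keeps the fix in one place.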
- ## Planner Analysis - Medical Models Integration
-
- **STRATEGIC RECOMMENDATION**: ✅ **APPROVE with Modifications**
-
- ### Key Insights:
- 1. **Medical Domain Specialization**: Me-LLaMA + Clinical ModernBERT will significantly improve clinical relevance
- 2. **Resource Challenge**: Me-LLaMA requires substantial compute - a deployment strategy is needed before proceeding
- 3. **Architecture Enhancement**: Medical models enable semantic understanding vs. basic text processing
-
- ### Critical Decisions Required:
- - **Me-LLaMA Deployment**: Determine whether to use API access, local deployment, or a cloud service
- - **Compute Resources**: Assess whether current infrastructure can handle medical model requirements
- - **Migration Strategy**: How to transition from the current general-purpose pipeline to a medical-specific one
-
- **NEXT STEP**: Executor should research Me-LLaMA deployment options and resource requirements before implementation begins.
+ #!/usr/bin/env python3
+ """
+ VedaMD Enhanced: Sri Lankan Clinical Assistant
+ Main Gradio Application for Hugging Face Spaces Deployment
+
+ Enhanced Medical-Grade RAG System with:
+ ✅ 5x Enhanced Retrieval (15+ documents vs previous 5)
+ ✅ Medical Entity Extraction & Clinical Terminology
+ ✅ Clinical ModernBERT (768d medical embeddings)
+ ✅ Medical Response Verification & Safety Protocols
+ ✅ Advanced Re-ranking & Coverage Verification
+ ✅ Source Traceability & Citation Support
+ """
+
+ import os
+ import logging
+ import gradio as gr
+ from typing import List, Dict, Optional
+ import sys
+
+ # Add src directory to path for imports
+ sys.path.append(os.path.join(os.path.dirname(__file__), 'src'))
+
+ from src.enhanced_groq_medical_rag import EnhancedGroqMedicalRAG, EnhancedMedicalResponse
+
+ # Configure logging
+ logging.basicConfig(
+     level=logging.INFO,
+     format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
+ )
+ logger = logging.getLogger(__name__)
+
+ # Initialize Enhanced Medical RAG System
+ logger.info("🏥 Initializing VedaMD Enhanced for Hugging Face Spaces...")
+ try:
+     enhanced_rag_system = EnhancedGroqMedicalRAG()
+     logger.info("✅ Enhanced Medical RAG system ready!")
+ except Exception as e:
+     logger.error(f"❌ Failed to initialize system: {e}")
+     raise
+ def process_enhanced_medical_query(message: str, history: List[List[str]]) -> str:
+     """
+     Process a medical query with the enhanced RAG system
+     """
+     try:
+         if not message.strip():
+             return "Please enter a medical question about Sri Lankan clinical guidelines."
+
+         # Convert Gradio chat history to our format
+         formatted_history = []
+         if history:
+             for chat_pair in history:
+                 if len(chat_pair) >= 2:
+                     user_msg, assistant_msg = chat_pair[0], chat_pair[1]
+                     if user_msg:
+                         formatted_history.append({"role": "user", "content": user_msg})
+                     if assistant_msg:
+                         formatted_history.append({"role": "assistant", "content": assistant_msg})
+
+         # Get enhanced response
+         response: EnhancedMedicalResponse = enhanced_rag_system.query(
+             query=message,
+             history=formatted_history
+         )
+
+         # Format enhanced response for display
+         formatted_response = format_enhanced_medical_response(response)
+         return formatted_response
+
+     except Exception as e:
+         logger.error(f"Error processing query: {e}")
+         return f"⚠️ **System Error**: {str(e)}\n\nPlease try again or contact support if the issue persists."
+ def format_enhanced_medical_response(response: EnhancedMedicalResponse) -> str:
+     """
+     Format the enhanced medical response for display, ensuring citations are always included.
+     """
+     formatted_parts = []
+
+     # Main response from the LLM
+     final_response_text = response.answer.strip()
+     formatted_parts.append(final_response_text)
+
+     # ALWAYS add the clinical sources section with clear numbering
+     if response.sources:
+         formatted_parts.append("\n\n---\n")
+         formatted_parts.append("### 📋 **Clinical Sources & Citations**")
+         formatted_parts.append("\nThis response is based on the following Sri Lankan clinical guidelines:")
+         # Create a numbered list of all sources used for the response
+         for i, source in enumerate(response.sources, 1):
+             # Make the citation number bold and add a clear label
+             formatted_parts.append(f"\n**[{i}]** Source: {source}")
+
+     # Enhanced information section with clear separation
+     formatted_parts.append("\n\n---\n")
+     formatted_parts.append("### 📊 **Response Analysis**")
+
+     # Safety and verification info with clearer formatting
+     if response.verification_result:
+         safety_status = "✅ SAFE" if response.safety_status == "SAFE" else "⚠️ CAUTION"
+         formatted_parts.append(f"\n**Medical Safety Status**: {safety_status}")
+         formatted_parts.append(f"**Verification Score**: {response.verification_result.verification_score:.1%}")
+         formatted_parts.append(f"**Verified Medical Claims**: {response.verification_result.verified_claims}/{response.verification_result.total_claims}")
+
+     # Enhanced retrieval metrics
+     formatted_parts.append(f"\n**Medical Information Coverage**:")
+     formatted_parts.append(f"- 🧠 Medical Entities: {response.medical_entities_count}")
+     formatted_parts.append(f"- 🎯 Context Adherence: {response.context_adherence_score:.1%}")
+     formatted_parts.append(f"- 📚 Guidelines Referenced: {len(response.sources)}")
+
+     # Always include processing time if available
+     if hasattr(response, 'query_time'):
+         formatted_parts.append(f"- ⚡ Processing Time: {response.query_time:.2f}s")
+
+     # Medical disclaimer with clear separation
+     formatted_parts.append("\n\n---\n")
+     formatted_parts.append("*⚕️ This information is derived from Sri Lankan clinical guidelines and is for reference only. Always consult with qualified healthcare professionals for patient care decisions.*")
+
+     return "\n".join(formatted_parts)
+ def create_enhanced_medical_interface():
+     """
+     Create the enhanced Gradio interface for Hugging Face Spaces
+     """
+     # Custom CSS for medical theme
+     custom_css = """
+     .gradio-container {
+         max-width: 900px !important;
+         margin: auto !important;
+     }
+     .medical-header {
+         background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+         color: white;
+         padding: 20px;
+         border-radius: 10px;
+         margin-bottom: 20px;
+         text-align: center;
+     }
+     """
+
+     with gr.Blocks(
+         title="🏥 VedaMD Enhanced: Sri Lankan Clinical Assistant",
+         theme=gr.themes.Soft(),
+         css=custom_css
+     ) as demo:
+
+         # Header
+         gr.HTML("""
+         <div class="medical-header">
+             <h1>🏥 VedaMD Enhanced: Sri Lankan Clinical Assistant</h1>
+             <h3>Enhanced Medical-Grade AI with Advanced RAG & Safety Protocols</h3>
+             <p>✅ 5x Enhanced Retrieval • ✅ Medical Verification • ✅ Clinical ModernBERT • ✅ Source Traceability</p>
+         </div>
+         """)
+
+         # Description
+         gr.Markdown("""
+         **🩺 Advanced Medical AI Assistant** for Sri Lankan maternal health guidelines with **enhanced safety protocols**:
+
+         🎯 **Enhanced Features:**
+         - **5x Enhanced Retrieval**: 15+ documents analyzed vs previous 5
+         - **Medical Entity Extraction**: Advanced clinical terminology recognition
+         - **Clinical ModernBERT**: Specialized 768d medical domain embeddings
+         - **Medical Response Verification**: 100% source traceability validation
+         - **Advanced Re-ranking**: Medical relevance scoring with coverage verification
+         - **Safety Protocols**: Comprehensive medical claim verification before delivery
+
+         **Ask me anything about Sri Lankan clinical guidelines with confidence!** 🇱🇰
+         """)
+
+         # Chat interface
+         chatbot = gr.ChatInterface(
+             fn=process_enhanced_medical_query,
+             examples=[
+                 "What is the complete management protocol for severe preeclampsia in Sri Lankan guidelines?",
+                 "How should postpartum hemorrhage be managed according to our local clinical protocols?",
+                 "What medications are contraindicated during pregnancy based on Sri Lankan guidelines?",
+                 "What are the evidence-based recommendations for managing gestational diabetes?",
+                 "How should puerperal sepsis be diagnosed and treated according to our guidelines?",
+                 "What are the protocols for assisted vaginal delivery in complicated cases?",
+                 "How should intrapartum fever be managed based on Sri Lankan standards?"
+             ],
+             cache_examples=False
+         )
+
+         # Footer with technical info
+         gr.Markdown("""
+         ---
+         **🔧 Technical Details**: Enhanced RAG with Clinical ModernBERT embeddings, medical entity extraction,
+         response verification, and multi-stage retrieval for comprehensive medical information coverage.
+
+         **⚖️ Disclaimer**: This AI assistant is for clinical reference only and does not replace professional medical judgment.
+         Always consult with qualified healthcare professionals for patient care decisions.
+         """)
+
+     return demo
+ # Create and launch the interface
+ if __name__ == "__main__":
+     logger.info("🚀 Launching VedaMD Enhanced for Hugging Face Spaces...")
+
+     # Create the interface
+     demo = create_enhanced_medical_interface()
+
+     # Launch with appropriate settings for HF Spaces
+     demo.launch(
+         server_name="0.0.0.0",
+         server_port=7860,
+         share=False,
+         show_error=True,
+         show_api=False
+     )
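The history-conversion loop inside `process_enhanced_medical_query` is pure list manipulation and can be exercised on its own. A standalone sketch of the same logic (illustrative rework, not the deployed module):

```python
from typing import Dict, List

def pairs_to_messages(history: List[List[str]]) -> List[Dict[str, str]]:
    """Convert Gradio ChatInterface [user, assistant] pairs into
    role/content dicts; empty halves of a pair are skipped, matching
    the loop in process_enhanced_medical_query."""
    messages: List[Dict[str, str]] = []
    for pair in history or []:
        if len(pair) >= 2:
            user_msg, assistant_msg = pair[0], pair[1]
            if user_msg:
                messages.append({"role": "user", "content": user_msg})
            if assistant_msg:
                messages.append({"role": "assistant", "content": assistant_msg})
    return messages
```

Keeping the conversion as a small pure function like this makes the empty-slot behavior (e.g. a user turn with no assistant reply yet) easy to unit-test.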
frontend/src/app/page.tsx CHANGED
@@ -4,6 +4,7 @@ import { useState, useRef, useEffect, FC } from 'react';
  import ReactMarkdown from 'react-markdown';
  import remarkGfm from 'remark-gfm';
  import clsx from 'clsx';
+ import { queryAPI } from '@/lib/api';
 
  // --- TYPE DEFINITIONS ---
  interface Message {
@@ -183,32 +184,16 @@ export default function Home() {
      setConversation(currentConversation);
 
      try {
-       // Convert conversation history to Gradio ChatInterface format
-       const gradioHistory = currentConversation.slice(0, -1).map(msg => [
-         msg.role === 'user' ? msg.content : '',
-         msg.role === 'assistant' ? msg.content : ''
-       ]).filter(pair => pair[0] || pair[1]);
-
-       // Call Gradio ChatInterface API
-       const response = await fetch(`${process.env.NEXT_PUBLIC_HF_API_URL}/call/predict`, {
-         method: 'POST',
-         headers: {
-           'Content-Type': 'application/json',
-         },
-         body: JSON.stringify({
-           data: [query, gradioHistory]
-         }),
-       });
-
-       if (!response.ok) {
-         const errorText = await response.text().catch(() => 'Network error occurred');
-         throw new Error(`API Error: ${response.status} - ${errorText}`);
+       // Use the queryAPI function from lib/api.ts
+       const apiResponse = await queryAPI(query, currentConversation.slice(0, -1));
+
+       if (apiResponse.error) {
+         throw new Error(apiResponse.error);
        }
 
-       const data = await response.json();
        const botMessage: Message = {
          role: 'assistant',
-         content: data.data[0] || 'No response received from the medical assistant.'
+         content: apiResponse.answer
        };
        setConversation([...currentConversation, botMessage]);
      } catch (err: any) {
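The `gradioHistory` mapping that `queryAPI` applies (in `frontend/src/lib/api.ts`) is the inverse of the pair format used by the backend: each role/content message becomes a `[user, assistant]` pair with the other slot left empty, and fully empty pairs are filtered out. A standalone Python sketch of that mapping (illustrative only):

```python
from typing import Dict, List

def messages_to_pairs(history: List[Dict[str, str]]) -> List[List[str]]:
    """Python rendering of the TypeScript gradioHistory mapping:
    role/content messages -> [user, assistant] pairs, dropping pairs
    where both slots are empty."""
    pairs = [
        [m["content"] if m["role"] == "user" else "",
         m["content"] if m["role"] == "assistant" else ""]
        for m in history
    ]
    return [p for p in pairs if p[0] or p[1]]
```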
frontend/src/lib/api.ts CHANGED
@@ -20,13 +20,14 @@ export async function queryAPI(input: string, history: ChatMessage[] = []): Prom
      throw new Error('HF_API_URL is not configured');
    }
 
-   // Convert history to Gradio ChatInterface format
+   // Convert history to Gradio format
    const gradioHistory = history.map(msg => [
      msg.role === 'user' ? msg.content : '',
      msg.role === 'assistant' ? msg.content : ''
    ]).filter(pair => pair[0] || pair[1]);
 
-   const response = await fetch(`${HF_API_URL}/call/predict`, {
+   // Use Gradio API format - try the basic predict endpoint
+   const response = await fetch(`${HF_API_URL}/predict`, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
@@ -35,14 +36,15 @@ export async function queryAPI(input: string, history: ChatMessage[] = []): Prom
        data: [input, gradioHistory]
      }),
    });
 
    if (!response.ok) {
-     throw new Error(`API error: ${response.status}`);
+     throw new Error(`HTTP ${response.status}: ${response.statusText}`);
    }
+
+   const result = await response.json();
 
-   const data = await response.json();
    return {
-     answer: data.data?.[0] || 'No response received from the medical assistant.',
+     answer: result?.data?.[0] || result?.[0] || 'No response received from the medical assistant.',
      sources: [], // Enhanced backend provides sources within the response text
    };
  } catch (error) {
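The fallback chain `result?.data?.[0] || result?.[0] || …` in the updated `queryAPI` tolerates two response shapes, since Gradio payload formats vary across versions and endpoints. A Python sketch of the same extraction logic (illustrative only, not the deployed TypeScript):

```python
DEFAULT_ANSWER = "No response received from the medical assistant."

def extract_answer(result) -> str:
    """Mirror of the TypeScript fallback chain in queryAPI: prefer
    result["data"][0], then result[0], then a default message."""
    if isinstance(result, dict):
        data = result.get("data")
        if isinstance(data, list) and data and data[0]:
            return data[0]
    if isinstance(result, list) and result and result[0]:
        return result[0]
    return DEFAULT_ANSWER
```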