MediVox

metadata

title: MediVox - AI Doctor with Vision and Voice
emoji: 👨‍⚕️
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.19.0
app_file: app.py
pinned: false

AI Doctor with Vision and Voice

This is an AI-powered medical assistant that can:

Accept voice input from patients
Analyze medical images
Provide medical insights using RAG (Retrieval-Augmented Generation)
Respond with natural voice output
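The four capabilities above can be read as one pipeline: transcribe, retrieve, generate, speak. The following is a minimal sketch of that flow, not the code in app.py. The function name `run_consultation` and its four callable parameters are hypothetical placeholders for whichever speech, retrieval, generation, and voice services are actually wired in.

```python
from typing import Callable, List, Optional, Tuple


def run_consultation(
    audio_path: Optional[str],
    image_path: Optional[str],
    transcribe: Callable[[str], str],
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, Optional[str]], str],
    speak: Callable[[str], str],
) -> Tuple[str, str]:
    """Chain the four stages; every callable is a placeholder for a real service."""
    # Stage 1: speech-to-text, skipped when no recording was supplied.
    question = transcribe(audio_path) if audio_path else ""
    # Stage 2: fetch supporting passages only when there is a question to search with.
    context = retrieve(question) if question else []
    # Stage 3: combine retrieved context and question, then generate an answer.
    prompt = "\n".join(context + [question]).strip()
    answer = generate(prompt, image_path)
    # Stage 4: text-to-speech; assumed here to return a path to the audio file.
    return answer, speak(answer)
```

Passing the services in as arguments keeps the sketch independent of any particular provider and makes each stage easy to stub out.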

Features

Speech-to-Text using Google Gemini 2.0 Flash
Image Analysis using Google Gemini 2.0 Flash
RAG using FAISS and a medical knowledge base
Text-to-Speech using ElevenLabs
Context-aware responses using medical domain knowledge
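To illustrate the retrieval step behind the RAG feature, here is a toy sketch that uses only the Python standard library. It deliberately swaps in a bag-of-words vector for the sentence-transformers embeddings and a brute-force scan for the FAISS index, so it shows the idea rather than this Space's implementation. The helper names (`embed`, `cosine`, `retrieve`, `build_prompt`) are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real sentence-embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, passages, k=2):
    """Return the k passages most similar to the query (brute force, no index)."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


def build_prompt(query, passages, k=2):
    """Prepend the retrieved passages to the question as grounding context."""
    context = "\n".join("- " + p for p in retrieve(query, passages, k))
    return "Context:\n" + context + "\n\nQuestion: " + query
```

In a real deployment the passages would come from the medical knowledge base, and the nearest-neighbour search would be delegated to a vector index instead of a sorted scan.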

Environment Variables Required

GOOGLE_AI_STUDIO_API_KEY=your_google_ai_studio_api_key
ELEVENLABS_API_KEY=your_elevenlabs_api_key
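A small startup check can catch a missing key before the first request fails. The sketch below uses only the standard library. The variable names come from the list above, while the `load_api_keys` helper is hypothetical and not part of this Space.

```python
import os

# Variable names are taken from the list above; the helper itself is hypothetical.
REQUIRED_ENV_VARS = ("GOOGLE_AI_STUDIO_API_KEY", "ELEVENLABS_API_KEY")


def load_api_keys(env=None):
    """Return the required keys as a dict, or raise if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_ENV_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_ENV_VARS}
```

On Hugging Face Spaces these values are typically supplied as repository secrets, which are exposed to the app as ordinary environment variables.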

Usage

Click the microphone button to record your question (optional)
Upload or take a picture of the medical condition (optional)
The two inputs can be used independently or together
Wait for the AI doctor to analyze and respond
Listen to the voice response or read the text output
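Because both inputs are optional, the handler has to cope with three valid combinations. The sketch below shows one way to classify a submission. The function name `describe_request` is hypothetical, and rejecting a submission with neither input is an assumption, since the steps above do not say what happens in that case.

```python
from typing import Optional


def describe_request(audio_path: Optional[str], image_path: Optional[str]) -> str:
    """Classify a submission by which of the two optional inputs were supplied."""
    has_audio = bool(audio_path)
    has_image = bool(image_path)
    if has_audio and has_image:
        return "voice+image"
    if has_audio:
        return "voice-only"
    if has_image:
        return "image-only"
    # Assumption: with no input at all there is nothing to analyze.
    raise ValueError("Provide a voice recording, an image, or both.")
```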

Model Details

Vision Model: Google Gemini 2.0 Flash
Speech-to-Text: Google Gemini 2.0 Flash
Text Generation: Google Gemini 2.0 Flash
Voice Generation: ElevenLabs
Embeddings: sentence-transformers/all-mpnet-base-v2

Citation

If you use this space, please cite:

@misc{medivoicebot2024,
  author = {Gaurav Gulati},
  title = {AI Doctor with Vision and Voice},
  year = {2024},
  publisher = {Hugging Face Spaces},
}