---
title: Med Gemma
emoji: 🏥
colorFrom: red
colorTo: red
sdk: docker
app_port: 8501
tags:
  - streamlit
  - medical
  - chatbot
pinned: false
short_description: Med-Gemma Medical Assistant Chat Interface
---

πŸ₯ Med-Gemma Medical Assistant

A Streamlit chat interface for interacting with the Med-Gemma medical language model deployed on HuggingFace Inference Endpoints.

## 🚀 Deployment to HuggingFace Spaces

### Step 1: Configure Secrets

1. Go to your Space settings on HuggingFace
2. Navigate to **Settings → Variables and secrets**
3. Add these two secrets:
   - Name: `HF_TOKEN` | Value: your HuggingFace API token
   - Name: `INFERENCE_ENDPOINT` | Value: your inference endpoint URL (e.g., `https://xxx.endpoints.huggingface.cloud`)

### Step 2: Get Your Credentials

**HuggingFace API Token:**

1. Go to HuggingFace **Settings → Access Tokens**
2. Click "New token", give it a name, and select "read" permissions
3. Copy the token

**Inference Endpoint URL:**

1. Go to HuggingFace Inference Endpoints
2. Find your Med-Gemma endpoint (it must be in the "Running" state)
3. Copy the endpoint URL
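Before wiring the credentials into the app, it can help to sanity-check the URL and headers you will send. A minimal Python sketch (the `/v1/chat/completions` path assumes a vLLM endpoint exposing the OpenAI-compatible API, as this app uses; `build_chat_url` and `build_auth_headers` are hypothetical helper names, not part of the app):

```python
def build_chat_url(endpoint: str) -> str:
    """Join the endpoint base URL with the OpenAI-compatible chat path."""
    return endpoint.rstrip("/") + "/v1/chat/completions"

def build_auth_headers(token: str) -> dict:
    """HuggingFace Inference Endpoints expect a Bearer token."""
    return {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}

# Example with placeholder values:
url = build_chat_url("https://xxx.endpoints.huggingface.cloud/")
headers = build_auth_headers("hf_your_token_here")
print(url)  # https://xxx.endpoints.huggingface.cloud/v1/chat/completions
```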

### Step 3: Deploy

1. Push your code to the HuggingFace Space repository
2. The Space will build and deploy automatically
3. Once ready, users can start chatting immediately; no configuration is needed!

## 🛠️ Local Development

### Setup

1. Create and activate a virtual environment:

   ```bash
   python -m venv venv
   # Windows
   venv\Scripts\activate
   # macOS/Linux
   source venv/bin/activate
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Configure credentials by creating a `.env` file in the project root:

   ```
   HF_TOKEN=your_token_here
   INFERENCE_ENDPOINT=your_endpoint_url_here
   ```

4. Run the app:

   ```bash
   streamlit run src/streamlit_app.py
   ```

   The app will open at http://localhost:8501.
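The credential lookup described in step 3 can be sketched like this (a non-authoritative sketch: it assumes the `python-dotenv` package is installed locally; on HuggingFace Spaces, secrets arrive as plain environment variables, so the `.env` load is skipped):

```python
import os

try:
    # python-dotenv reads KEY=value pairs from a local .env file into
    # os.environ; this is only needed for local development.
    from dotenv import load_dotenv
    load_dotenv()
except ImportError:
    pass  # on HuggingFace Spaces, secrets are already injected as env vars

HF_TOKEN = os.environ.get("HF_TOKEN", "")
INFERENCE_ENDPOINT = os.environ.get("INFERENCE_ENDPOINT", "")

# Warn early if either credential is missing rather than failing mid-chat.
missing = [name for name in ("HF_TOKEN", "INFERENCE_ENDPOINT")
           if not os.environ.get(name)]
if missing:
    print("Missing credentials:", ", ".join(missing))
```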

πŸ“ Features

  • πŸ’¬ Real-time chat interface with Med-Gemma
  • βš™οΈ Adjustable model parameters (temperature, top_p, max tokens)
  • πŸ“ Chat history (persists during session)
  • πŸ—‘οΈ Clear chat history button
  • πŸ”’ Secure credential management via environment variables
  • βœ… OpenAI-compatible API format for vLLM endpoints
  • 🎨 Clean, professional UI
  • πŸš€ Docker-ready for HuggingFace Spaces

## ⚙️ Model Parameters

- **Max Tokens**: maximum length of the generated response (50-2048)
- **Temperature**: controls randomness (0.0 = deterministic, 2.0 = very random)
- **Top P**: controls diversity via nucleus sampling (0.0-1.0)
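These parameters map directly onto an OpenAI-style request body. A hedged sketch of how such a payload might be assembled (`build_payload` is a hypothetical helper, and the default values shown are assumptions, not the app's actual defaults):

```python
def build_payload(prompt: str, max_tokens: int = 512,
                  temperature: float = 0.7, top_p: float = 0.9) -> dict:
    """Build an OpenAI-compatible chat request, clamping each parameter
    to the ranges exposed in the sidebar."""
    max_tokens = max(50, min(2048, max_tokens))
    temperature = max(0.0, min(2.0, temperature))
    top_p = max(0.0, min(1.0, top_p))
    return {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p,
    }

payload = build_payload("What are common symptoms of dehydration?", temperature=3.5)
print(payload["temperature"])  # clamped to 2.0
```

Clamping out-of-range values keeps the request valid even if the UI state ever drifts outside the documented bounds.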

## 🔒 Security

- ✅ The `.env` file is listed in `.gitignore` and is never committed
- ✅ Use HuggingFace Spaces secrets for production
- ✅ The local `.env` file is for development only
- ⚠️ Never share your HuggingFace API tokens

## 📚 Additional Resources