MedAI_Processing / docs / LOCAL_MODE.md

Local Mode Documentation

Overview

The MedAI Processing system now supports two modes of operation:

  • Cloud Mode (default): Uses NVIDIA and Gemini APIs for processing
  • Local Mode: Uses MedAlpaca-13b model running locally for processing
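The mode is chosen at startup from the `IS_LOCAL` environment variable. A minimal sketch of that dispatch (the function name here is illustrative, not the actual codebase):

```python
import os

def resolve_mode() -> str:
    """Return "local" or "cloud" based on the IS_LOCAL environment variable.

    Cloud mode is the default, so anything other than a truthy value
    ("true", "1", "yes") falls through to the cloud path.
    """
    flag = os.environ.get("IS_LOCAL", "false").strip().lower()
    return "local" if flag in ("true", "1", "yes") else "cloud"
```

With `IS_LOCAL=true` set, `resolve_mode()` returns `"local"`; unset or `false`, it returns `"cloud"`.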

Local Mode Features

Local Mode Benefits

  • No API costs: Process data without external API calls
  • Privacy: All processing happens locally
  • Offline capability: Works without internet connection (after model download)
  • Medical specialization: Uses MedAlpaca-13b, a model specifically fine-tuned for medical tasks

Technical Details

  • Model: MedAlpaca-13b
  • Quantization: 4-bit quantization for memory efficiency
  • CUDA Support: Automatic GPU acceleration when available
  • Memory Management: Automatic model unloading to free memory

Building and Running

Build Script

Use the provided build script for easy building:

# Build for local mode
./build.sh local

# Build for cloud mode  
./build.sh cloud

Manual Docker Build

Local Mode

docker build --build-arg IS_LOCAL=true -t medai-processing:local .

Cloud Mode

docker build --build-arg IS_LOCAL=false -t medai-processing:cloud .
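To run the images built above, pass the mode's environment variables at container start. A sketch, with token and key values as placeholders (the cache-mount path assumes the default HF_HOME):

```shell
# Local mode: mount a cache dir so the ~7GB model download survives restarts
docker run -e IS_LOCAL=true -e HF_TOKEN=<your-hf-token> \
  -v "$(pwd)/hf-cache:/root/.cache/huggingface" \
  medai-processing:local

# Cloud mode: supply the external API keys instead
docker run -e IS_LOCAL=false -e NVIDIA_API_1=<key> -e GEMINI_API_1=<key> \
  medai-processing:cloud
```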

Environment Variables

Local Mode Required

  • IS_LOCAL=true: Enables local mode
  • HF_TOKEN: Hugging Face access token used for the model download (a default token is provided, but supplying your own is recommended)

Local Mode Optional

  • HF_HOME: Hugging Face cache directory (default: ~/.cache/huggingface)

Cloud Mode Required

  • IS_LOCAL=false: Enables cloud mode (default)
  • NVIDIA_API_1: NVIDIA API key
  • GEMINI_API_1: Gemini API key
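Put together, the two configurations can be captured as env files (all values below are placeholders, not real credentials):

```
# .env.local
IS_LOCAL=true
HF_TOKEN=hf_xxxxxxxxxxxx
HF_HOME=~/.cache/huggingface   # optional

# .env.cloud
IS_LOCAL=false
NVIDIA_API_1=<nvidia-api-key>
GEMINI_API_1=<gemini-api-key>
```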

Output Differences

Local Mode

  • Output Location: data/ folder (local filesystem)
  • No Google Drive: Files are saved locally only
  • No OAuth: Google Drive authentication is disabled

Cloud Mode

  • Output Location: cache/outputs/ folder
  • Google Drive: Files are uploaded to Google Drive
  • OAuth: Google Drive authentication is available
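The output routing above amounts to a simple switch on the mode. An illustrative sketch (the helper name is hypothetical):

```python
from pathlib import Path

def output_dir(is_local: bool) -> Path:
    """Local mode writes to data/ on the local filesystem; cloud mode
    stages files in cache/outputs/ ahead of the Google Drive upload step."""
    return Path("data") if is_local else Path("cache/outputs")
```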

Model Information

MedAlpaca-13b

  • Size: 13 billion parameters
  • Specialization: Medical domain tasks
  • Training Data:
    • ChatDoctor (200k Q&A pairs)
    • WikiDoc (67k items)
    • StackExchange (academia, biology, fitness, health)
    • Anki flashcards (33k items)

Performance Considerations

  • Memory: Requires ~8GB RAM (with 4-bit quantization)
  • GPU: CUDA acceleration recommended for faster inference
  • Storage: Model download requires ~7GB disk space
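Those figures can be checked before the first run with a small pre-flight script. A sketch using only the standard library, assuming the ~8GB RAM and ~7GB disk figures above (the RAM probe is POSIX-only and degrades to None elsewhere):

```python
import os
import shutil

MIN_RAM_GB = 8    # 4-bit quantized MedAlpaca-13b working set (approx.)
MIN_DISK_GB = 7   # one-time model download (approx.)

def preflight(cache_dir: str = ".") -> dict:
    """Report whether the host looks big enough for local mode."""
    gb = 1024 ** 3
    disk_free = shutil.disk_usage(cache_dir).free / gb
    try:  # SC_PHYS_PAGES is POSIX; fall back gracefully elsewhere
        ram = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / gb
    except (ValueError, OSError, AttributeError):
        ram = None
    return {
        "disk_ok": disk_free >= MIN_DISK_GB,
        "ram_ok": None if ram is None else ram >= MIN_RAM_GB,
    }
```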

Usage Examples

Processing with Local Mode

  1. Set IS_LOCAL=true in environment
  2. Provide HF_TOKEN for model access
  3. Run processing jobs; they will use MedAlpaca locally
  4. Output files will be saved to data/ folder

Processing with Cloud Mode

  1. Set IS_LOCAL=false (or omit)
  2. Provide NVIDIA and Gemini API keys
  3. Run processing jobs; they will use the external APIs
  4. Output files will be uploaded to Google Drive

Troubleshooting

Local Mode Issues

  • Model download fails: Check HF_TOKEN and internet connection
  • Out of memory: Ensure sufficient RAM (8GB+ recommended)
  • Slow inference: Enable CUDA if available

Cloud Mode Issues

  • API errors: Check API keys and quotas
  • Upload failures: Verify Google Drive authentication

Migration Guide

From Cloud to Local

  1. Update environment: IS_LOCAL=true
  2. Add HF_TOKEN
  3. Rebuild container with local mode
  4. Output will switch from Google Drive to local data/ folder

From Local to Cloud

  1. Update environment: IS_LOCAL=false
  2. Add NVIDIA and Gemini API keys
  3. Rebuild container with cloud mode
  4. Output will switch from local to Google Drive