# Local Mode Documentation

## Overview

The MedAI Processing system now supports two modes of operation:

- **Cloud Mode** (default): Uses NVIDIA and Gemini APIs for processing
- **Local Mode**: Uses the MedAlpaca-13b model running locally for processing
## Local Mode Features

### Benefits

- **No API costs**: Process data without external API calls
- **Privacy**: All processing happens locally
- **Offline capability**: Works without an internet connection (after the model download)
- **Medical specialization**: Uses MedAlpaca-13b, a model specifically fine-tuned for medical tasks

### Technical Details

- **Model**: MedAlpaca-13b
- **Quantization**: 4-bit quantization for memory efficiency
- **CUDA Support**: Automatic GPU acceleration when available
- **Memory Management**: Automatic model unloading to free memory
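CUDA detection happens automatically at model load time. As a rough preflight sketch (an assumption for illustration, not part of the shipped image), you can check whether a GPU is visible to the runtime before starting a job:

```bash
# Sketch: report whether GPU acceleration is likely available.
# nvidia-smi presence is only a proxy; the model loader makes the real decision.
detect_device() {
  if command -v nvidia-smi >/dev/null 2>&1; then
    echo "gpu"
  else
    echo "cpu"
  fi
}

detect_device
```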
## Building and Running

### Build Script

Use the provided build script for easy building:

```bash
# Build for local mode
./build.sh local

# Build for cloud mode
./build.sh cloud
```

### Manual Docker Build

#### Local Mode

```bash
docker build --build-arg IS_LOCAL=true -t medai-processing:local .
```

#### Cloud Mode

```bash
docker build --build-arg IS_LOCAL=false -t medai-processing:cloud .
```
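A hypothetical run invocation for each image built above (the `--rm` flag, mount path `/app/data`, and host directory are assumptions — adjust to your deployment):

```bash
# Local mode: mount a host directory to receive the data/ output folder.
docker run --rm \
  -e IS_LOCAL=true \
  -e HF_TOKEN="$HF_TOKEN" \
  -v "$(pwd)/data:/app/data" \
  medai-processing:local

# Cloud mode: pass the API keys instead.
docker run --rm \
  -e IS_LOCAL=false \
  -e NVIDIA_API_1="$NVIDIA_API_1" \
  -e GEMINI_API_1="$GEMINI_API_1" \
  medai-processing:cloud
```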
## Environment Variables

### Local Mode Required

- `IS_LOCAL=true`: Enables local mode
- `HF_TOKEN`: Hugging Face token for model download (default: provided token)

### Local Mode Optional

- `HF_HOME`: Hugging Face cache directory (default: `~/.cache/huggingface`)

### Cloud Mode Required

- `IS_LOCAL=false`: Enables cloud mode (default)
- `NVIDIA_API_1`: NVIDIA API key
- `GEMINI_API_1`: Gemini API key
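The requirements above can be verified before launch. This preflight function is a sketch (not shipped with the system) using the variable names listed:

```bash
# Sketch: check that the variables the selected mode needs are set.
check_env() {
  if [ "${IS_LOCAL:-false}" = "true" ]; then
    [ -n "${HF_TOKEN:-}" ] || { echo "missing HF_TOKEN"; return 1; }
    echo "local mode OK"
  else
    [ -n "${NVIDIA_API_1:-}" ] || { echo "missing NVIDIA_API_1"; return 1; }
    [ -n "${GEMINI_API_1:-}" ] || { echo "missing GEMINI_API_1"; return 1; }
    echo "cloud mode OK"
  fi
}
```

Run it in the same shell that will launch the container, so it sees the environment the job will inherit.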
## Output Differences

### Local Mode

- **Output Location**: `data/` folder (local filesystem)
- **No Google Drive**: Files are saved locally only
- **No OAuth**: Google Drive authentication is disabled

### Cloud Mode

- **Output Location**: `cache/outputs/` folder
- **Google Drive**: Files are uploaded to Google Drive
- **OAuth**: Google Drive authentication is available
## Model Information

### MedAlpaca-13b

- **Size**: 13 billion parameters
- **Specialization**: Medical domain tasks
- **Training Data**:
  - ChatDoctor (200k Q&A pairs)
  - WikiDoc (67k items)
  - StackExchange (academia, biology, fitness, health)
  - Anki flashcards (33k items)
## Performance Considerations

- **Memory**: Requires ~8GB RAM (with 4-bit quantization)
- **GPU**: CUDA acceleration recommended for faster inference
- **Storage**: Model download requires ~7GB disk space
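Given the ~7GB download estimate above, a quick disk check before the first run can save a failed pull. This is a sketch, with the threshold taken from that estimate:

```bash
# Sketch: warn when the cache volume has less free space than the ~7GB
# the MedAlpaca-13b download needs.
check_disk() {
  dir="${HF_HOME:-$HOME/.cache/huggingface}"
  [ -d "$dir" ] || dir="$HOME"   # fall back if the cache dir doesn't exist yet
  avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
  if [ "$avail_kb" -lt $((7 * 1024 * 1024)) ]; then
    echo "low disk: only $((avail_kb / 1024 / 1024))GB free in $dir"
  else
    echo "disk OK"
  fi
}
```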
## Usage Examples

### Processing with Local Mode

1. Set `IS_LOCAL=true` in the environment
2. Provide `HF_TOKEN` for model access
3. Run processing jobs - they will use MedAlpaca locally
4. Output files will be saved to the `data/` folder
### Processing with Cloud Mode

1. Set `IS_LOCAL=false` (or omit it)
2. Provide NVIDIA and Gemini API keys
3. Run processing jobs - they will use the external APIs
4. Output files will be uploaded to Google Drive
## Troubleshooting

### Local Mode Issues

- **Model download fails**: Check `HF_TOKEN` and the internet connection
- **Out of memory**: Ensure sufficient RAM (8GB+ recommended)
- **Slow inference**: Enable CUDA if available

### Cloud Mode Issues

- **API errors**: Check API keys and quotas
- **Upload failures**: Verify Google Drive authentication
## Migration Guide

### From Cloud to Local

1. Update the environment: set `IS_LOCAL=true`
2. Add `HF_TOKEN`
3. Rebuild the container with local mode
4. Output will switch from Google Drive to the local `data/` folder
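In shell terms, the cloud-to-local switch is just an environment change before the rebuild (the token value below is a placeholder, not a real credential):

```bash
# Cloud -> local: flip the mode flag and add the Hugging Face token.
export IS_LOCAL=true
export HF_TOKEN="hf_your_token_here"   # placeholder; substitute your own token
# NVIDIA_API_1 / GEMINI_API_1 are no longer required in local mode
```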
### From Local to Cloud

1. Update the environment: set `IS_LOCAL=false`
2. Add NVIDIA and Gemini API keys
3. Rebuild the container with cloud mode
4. Output will switch from local files to Google Drive