Running Model Optimization with Docker
This guide shows you how to run the model optimization scripts using Docker.
Prerequisites
- Docker installed and running
- Docker Compose (usually comes with Docker Desktop)
- At least 8GB RAM available for Docker
- Data file: content/cardio_train_extended.csv
Quick Start
Option 1: Using Docker Compose (Recommended)
# Build and run optimization
docker-compose -f docker-compose.optimization.yml up --build
# Run in detached mode (background)
docker-compose -f docker-compose.optimization.yml up -d --build
# View logs
docker-compose -f docker-compose.optimization.yml logs -f
# Stop when done
docker-compose -f docker-compose.optimization.yml down
Option 2: Using Docker Directly
# Build the image
docker build -f Dockerfile.optimization -t heart-optimization .
# Run optimization
docker run --rm \
-v "$(pwd)/content:/app/content" \
-v "$(pwd)/model_assets:/app/model_assets:ro" \
--name heart-optimization \
heart-optimization
# Run with resource limits
docker run --rm \
-v "$(pwd)/content:/app/content" \
-v "$(pwd)/model_assets:/app/model_assets:ro" \
--cpus="4" \
--memory="8g" \
--name heart-optimization \
heart-optimization
Running Specific Scripts
Run Model Optimization Only
docker-compose -f docker-compose.optimization.yml run --rm optimization python improve_models.py
Run Feature Analysis Only
docker-compose -f docker-compose.optimization.yml run --rm optimization python feature_importance_analysis.py
Run Comparison
docker-compose -f docker-compose.optimization.yml run --rm optimization python compare_models.py
Customization
Adjust Resource Limits
Edit docker-compose.optimization.yml:
deploy:
resources:
limits:
cpus: '8' # Use more CPUs if available
memory: 16G # More RAM for faster processing
Reduce Optimization Time
Edit improve_models.py before building:
n_trials = 50 # Reduce from 100 to 50 for faster results
Or override at runtime:
docker run --rm \
-v "$(pwd)/content:/app/content" \
-v "$(pwd)/improve_models.py:/app/improve_models.py" \
heart-optimization python -c "
import sys
sys.path.insert(0, '/app')
# Modify n_trials here or use environment variable
exec(open('/app/improve_models.py').read().replace('n_trials = 100', 'n_trials = 50'))
"
Use Environment Variables
Create a .env file:
N_TRIALS=50
STUDY_TIMEOUT=1800
Then use it:
docker-compose -f docker-compose.optimization.yml --env-file .env up
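For these variables to have any effect, the compose file has to forward them to the container (on its own, `--env-file` only supplies values for variable substitution inside the compose file), and improve_models.py has to read them. A minimal sketch of the reading side, assuming the names N_TRIALS and STUDY_TIMEOUT from the .env example above. The helper `read_int_env` and the 3600-second default are hypothetical and not taken from the actual script:

```python
import os


def read_int_env(name, default):
    """Return the integer value of environment variable `name`, or `default`."""
    raw = os.environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        return int(raw)
    except ValueError:
        raise ValueError(f"{name} must be an integer, got {raw!r}") from None


# Assumed defaults: 100 trials matches the value mentioned in this guide;
# the 3600-second timeout is a placeholder, not a value from the script.
n_trials = read_int_env("N_TRIALS", 100)
study_timeout = read_int_env("STUDY_TIMEOUT", 3600)

print(f"n_trials={n_trials}, study_timeout={study_timeout}s")
```

With something like this in place, the string-replacement override shown earlier would no longer be needed: passing `-e N_TRIALS=50` to `docker run` would be enough.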
Monitoring Progress
View Real-time Logs
# Using docker-compose
docker-compose -f docker-compose.optimization.yml logs -f
# Using docker
docker logs -f heart-optimization
Check Container Status
docker ps
docker stats heart-optimization
Results Location
All results are saved to your host machine in:
- content/models/ - Optimized models and metrics
- content/reports/ - Feature importance visualizations
These persist after the container stops.
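To confirm that a run actually produced output, the two directories can be listed from the host. The sketch below assumes it is run from the project root (the directory that contains content/). The function name `list_results` is illustrative only:

```python
from pathlib import Path


def list_results(base="content"):
    """Return {subdirectory: sorted file names} for the two result directories."""
    results = {}
    for sub in ("models", "reports"):
        folder = Path(base) / sub
        if folder.is_dir():
            results[sub] = sorted(p.name for p in folder.iterdir() if p.is_file())
        else:
            # A missing directory usually means no run has written results yet.
            results[sub] = []
    return results


for sub, files in list_results().items():
    print(f"content/{sub}/: {len(files)} file(s)")
    for name in files:
        print(f"  {name}")
```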
Troubleshooting
Out of Memory
Error: Killed or memory errors
Solution:
- Reduce n_trials in improve_models.py
- Raise the memory limit in docker-compose.optimization.yml, or give Docker more memory
- Close other applications
Build Fails
Error: Package installation fails
Solution:
# Clean build
docker-compose -f docker-compose.optimization.yml build --no-cache
Data Not Found
Error: Data file not found
Solution:
# Verify data file exists
ls -lh content/cardio_train_extended.csv
# Check volume mount
docker-compose -f docker-compose.optimization.yml config
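The same check can be scripted, which is handy inside the container where the file should appear under /app/content. A small sketch, assuming the default path from the Prerequisites section. The function name `check_data_file` is hypothetical:

```python
from pathlib import Path


def check_data_file(path="content/cardio_train_extended.csv"):
    """Return (ok, message) describing whether the data file looks usable."""
    p = Path(path)
    if not p.is_file():
        return False, f"missing: {p}"
    size = p.stat().st_size
    if size == 0:
        return False, f"empty: {p}"
    return True, f"found: {p} ({size} bytes)"


ok, message = check_data_file()
print(message)
```

Run it from the project root on the host, or pass `/app/content/cardio_train_extended.csv` when running inside the container.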
Slow Performance
Solutions:
- Increase CPU allocation in docker-compose.optimization.yml
- Use fewer trials: n_trials = 30
- Run on a machine with more resources
Advanced Usage
Interactive Shell
# Get a shell in the container
docker-compose -f docker-compose.optimization.yml run --rm optimization bash
# Then run scripts manually
python improve_models.py
Run Multiple Optimizations
# Run optimization with different trial counts
for trials in 30 50 100; do
docker run --rm \
-v "$(pwd)/content:/app/content" \
-e N_TRIALS=$trials \
heart-optimization \
python -c "import sys; sys.path.insert(0, '/app'); exec(open('/app/improve_models.py').read().replace('n_trials = 100', 'n_trials = $trials'))"
done
Save Container State
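# Commit container to image (the container must still exist: run this while it
# is running, or start it without --rm; files in the mounted content/ volume
# are not included in the snapshot)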
docker commit heart-optimization heart-optimization:snapshot
# Use later
docker run --rm -v "$(pwd)/content:/app/content" heart-optimization:snapshot
Performance Tips
- Use SSD storage - Faster I/O for data loading
- Allocate more CPUs - Parallel processing in Optuna
- Increase memory - Better for large datasets
- Run overnight - Let it run while you sleep
- Use GPU (if available) - Requires NVIDIA Docker runtime
GPU Support (Optional)
If you have an NVIDIA GPU:
# Add to docker-compose.optimization.yml
runtime: nvidia
environment:
- NVIDIA_VISIBLE_DEVICES=all
Then build with:
docker build -f Dockerfile.optimization -t heart-optimization .
Example Workflow
# 1. Build image
docker-compose -f docker-compose.optimization.yml build
# 2. Run optimization (takes 1-2 hours)
docker-compose -f docker-compose.optimization.yml up
# 3. In another terminal, check progress
docker-compose -f docker-compose.optimization.yml logs -f
# 4. When done, run feature analysis
docker-compose -f docker-compose.optimization.yml run --rm optimization \
python feature_importance_analysis.py
# 5. Compare results
docker-compose -f docker-compose.optimization.yml run --rm optimization \
python compare_models.py
# 6. Clean up
docker-compose -f docker-compose.optimization.yml down
Benefits of Using Docker
✅ Isolation - No conflicts with your system Python
✅ Reproducibility - Same environment every time
✅ Resource Control - Limit CPU/memory usage
✅ Easy Cleanup - Remove container when done
✅ Portability - Run on any machine with Docker
Next Steps
After optimization completes:
- Check results in content/models/model_metrics_optimized.csv
- Review feature importance in content/reports/
- Compare with baseline using compare_models.py
- Deploy optimized models to your Streamlit app
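For a quick look at the metrics file without opening a spreadsheet, a sketch like the following prints each row as it was written. It makes no assumption about the column names, which depend on what the optimization script saves. `load_metrics` is an illustrative name:

```python
import csv
from pathlib import Path


def load_metrics(path="content/models/model_metrics_optimized.csv"):
    """Read the metrics CSV into a list of dicts, one per row."""
    with Path(path).open(newline="") as handle:
        return list(csv.DictReader(handle))


metrics_path = Path("content/models/model_metrics_optimized.csv")
if metrics_path.is_file():
    for row in load_metrics(metrics_path):
        # Column names are whatever the optimization script wrote.
        print(", ".join(f"{key}={value}" for key, value in row.items()))
else:
    print(f"{metrics_path} not found; has the optimization finished?")
```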
Note: The optimization process can take 1-2 hours. Make sure your laptop is plugged in and won't go to sleep!