Heart-Attack-Risk-Rate / COLAB_COMPARISON.md

Kasilanka Bhoopesh Siva Srikar

Complete Heart Attack Risk Prediction App - Ready for Deployment

08123aa 3 months ago

Google Colab Time Estimate & Setup Guide

⏱️ Time Comparison

Current Local Setup (Docker)

CPUs: 2 cores
Memory: 4 GB
Total Time: ~24.4 hours
- XGBoost: ~2.9 hours
- CatBoost: ~12.5 hours
- LightGBM: ~9.0 hours

🆓 Google Colab Free Tier (CPU Only)

Specifications

CPUs: 1-2 cores (variable, shared resources)
Memory: ~12.7 GB RAM
GPU: None
Session Timeout: up to 12 hours (idle sessions may disconnect sooner)

Estimated Time

Total: ~30.5 hours (about 25% longer than local, i.e. roughly 0.8x local speed)
- XGBoost: ~3.7 hours
- CatBoost: ~15.6 hours
- LightGBM: ~11.3 hours

⚠️ Limitations

May time out before completion (12-hour limit)
Slower due to shared resources
May need to restart and resume from checkpoints
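
The 12-hour limit can be turned into a rough session count. The sketch below uses the Colab Free (CPU) estimates above; the helper name `sessions_needed` is hypothetical, and it assumes every session runs for the full 12 hours with no time lost to restarts, which is optimistic.

```python
import math

SESSION_LIMIT_HOURS = 12.0

# Estimated Colab Free (CPU) training times from this guide, in hours.
estimates = {"XGBoost": 3.7, "CatBoost": 15.6, "LightGBM": 11.3}


def sessions_needed(hours, limit=SESSION_LIMIT_HOURS):
    """Minimum number of sessions, assuming each one runs to the full limit."""
    return math.ceil(hours / limit)


for model, hours in estimates.items():
    print(f"{model}: {hours} h -> {sessions_needed(hours)} session(s)")

total = sum(estimates.values())
print(f"Total: {total:.1f} h -> at least {sessions_needed(total)} sessions")
```

Under these estimates CatBoost alone does not fit in a single session, so its tuning run has to be resumable partway through, not only between models.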

🎮 Google Colab Free Tier + GPU (T4)

Specifications

CPUs: 1-2 cores
Memory: ~12.7 GB RAM
GPU: NVIDIA T4 (16 GB)
Session Timeout: 12 hours

Estimated Time

Total: ~18.0 hours (26% faster than local)
- XGBoost: ~1.9 hours (~1.5x local speed with GPU)
- CatBoost: ~9.6 hours (~1.3x local speed with GPU)
- LightGBM: ~6.4 hours (~1.4x local speed with GPU)

⚠️ Limitations

May time out before completion (12-hour limit)
GPU availability not guaranteed (may need to wait)
Requires code modifications for GPU support
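
Because a GPU is not guaranteed on the free tier, it can help to check for one before starting a long run. This is a minimal sketch, assuming the NVIDIA driver tools (`nvidia-smi`) are on the PATH whenever a GPU runtime is attached; the function name `gpu_available` is hypothetical.

```python
import shutil
import subprocess


def gpu_available():
    """Return True if nvidia-smi exists and lists at least one GPU."""
    if shutil.which("nvidia-smi") is None:
        return False
    try:
        # `nvidia-smi -L` prints one line per detected GPU.
        result = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU available:", gpu_available())
```

If this prints `False` in a runtime that was supposed to have a GPU, the training script can fall back to its CPU parameters instead of failing hours into the run.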

💎 Google Colab Pro ($10/month)

Specifications

CPUs: 2-4 cores (better allocation)
Memory: ~32 GB RAM
GPU: Better GPU access (T4/V100)
Session Timeout: 24 hours
Background Execution: Yes

Estimated Time (CPU)

Total: ~20.4 hours (17% faster than local)
- XGBoost: ~2.4 hours
- CatBoost: ~10.4 hours
- LightGBM: ~7.5 hours

Estimated Time (with GPU)

Total: ~15.0 hours (39% faster than local)
- XGBoost: ~1.6 hours
- CatBoost: ~8.0 hours
- LightGBM: ~5.4 hours

✅ Advantages

Longer session time (24 hours)
Background execution (can close browser)
Better resource allocation
More reliable GPU access

📊 Summary Table

Platform	CPUs	GPU	Total Time	Cost	Session Limit
Local (Docker)	2	No	~24.4 hrs	Free	None
Colab Free (CPU)	1-2	No	~30.5 hrs	Free	12 hrs ⚠️
Colab Free (GPU)	1-2	T4	~18.0 hrs	Free	12 hrs ⚠️
Colab Pro (CPU)	2-4	No	~20.4 hrs	$10/mo	24 hrs
Colab Pro (GPU)	2-4	T4/V100	~15.0 hrs	$10/mo	24 hrs
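
The totals in the table can be compared against the local baseline with a few lines of arithmetic. The sketch below only re-derives relative wall-clock time from the figures above; the function name `time_change_pct` is hypothetical.

```python
LOCAL_HOURS = 24.4  # Local (Docker) total from the table

platform_hours = {
    "Colab Free (CPU)": 30.5,
    "Colab Free (GPU)": 18.0,
    "Colab Pro (CPU)": 20.4,
    "Colab Pro (GPU)": 15.0,
}


def time_change_pct(hours, baseline=LOCAL_HOURS):
    """Percent change in wall-clock time vs. the baseline (negative = less time)."""
    return (hours - baseline) / baseline * 100


for name, hours in platform_hours.items():
    print(f"{name}: {time_change_pct(hours):+.0f}% wall-clock time vs. local")
```

Measured this way, the Colab Free (CPU) row takes about 25% more wall-clock time than local, while the other rows take less. Small differences from the percentages quoted earlier come from rounding and from whether "faster" is read as a change in time or in speed.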

🚀 Setting Up for Google Colab

1. Enable GPU (if using)

# In Colab, go to: Runtime → Change runtime type → Hardware accelerator → GPU

2. Install Dependencies

!pip install xgboost catboost lightgbm optuna pandas numpy scikit-learn joblib

3. Upload Data

from google.colab import files
# Upload cardio_train_extended.csv
uploaded = files.upload()

4. Modify Code for GPU Support

You'll need to modify improve_models.py to enable GPU:

For XGBoost:

# Use the histogram tree method on the GPU (XGBoost 2.0+)
xgb_params = {
    'tree_method': 'hist',  # 'gpu_hist' is deprecated since XGBoost 2.0
    'device': 'cuda',  # Run training on the GPU
    # ... other parameters
}

For CatBoost:

cat_params = {
    'task_type': 'GPU',  # Enable GPU
    'devices': '0',  # Use first GPU
    # ... other parameters
}

For LightGBM:

lgb_params = {
    'device': 'gpu',  # Enable GPU
    'gpu_platform_id': 0,
    'gpu_device_id': 0,
    # ... other parameters
}
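
One way to keep a single script usable on both CPU and GPU runtimes is to gather the device settings in one place. This is a sketch, not the layout of improve_models.py: the function name `gpu_param_overrides` is hypothetical, and it assumes XGBoost 2.0 or later (which accepts the `device` parameter) and a LightGBM build with GPU support.

```python
def gpu_param_overrides(use_gpu):
    """Device-related parameter overrides for each library.

    Assumes XGBoost >= 2.0 and a LightGBM build with GPU support.
    """
    if use_gpu:
        return {
            "xgboost": {"tree_method": "hist", "device": "cuda"},
            "catboost": {"task_type": "GPU", "devices": "0"},
            "lightgbm": {"device": "gpu", "gpu_platform_id": 0, "gpu_device_id": 0},
        }
    return {
        "xgboost": {"tree_method": "hist", "device": "cpu"},
        "catboost": {"task_type": "CPU"},
        "lightgbm": {"device": "cpu"},
    }


overrides = gpu_param_overrides(use_gpu=False)
# max_depth is a placeholder value, not a setting taken from this project.
xgb_params = {"max_depth": 6, **overrides["xgboost"]}
print(xgb_params)
```

Merging the overrides into each model's parameter dictionary leaves the tuned hyperparameters untouched and changes only where training runs.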

5. Handle Session Timeouts

For long-running training, save checkpoints:

import pickle

import optuna

# Callback: save the study every 50 trials
# (register it with study.optimize(..., callbacks=[save_checkpoint]))
def save_checkpoint(study, trial):
    if trial.number % 50 == 0:
        with open('study_checkpoint.pkl', 'wb') as f:
            pickle.dump(study, f)

# Load checkpoint if resuming
try:
    with open('study_checkpoint.pkl', 'rb') as f:
        study = pickle.load(f)
except FileNotFoundError:
    study = optuna.create_study(...)
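
A checkpoint written directly to its final path can be left truncated if the session disconnects mid-write. The sketch below shows one common mitigation, writing to a temporary file and then renaming it into place; the helper names `save_checkpoint_atomic` and `load_checkpoint` are hypothetical, and the demo pickles a plain dictionary as a stand-in for the study object.

```python
import os
import pickle
import tempfile


def save_checkpoint_atomic(obj, path):
    """Pickle obj to a temp file in the target directory, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(obj, f)
            f.flush()
            os.fsync(f.fileno())
        # os.replace overwrites any existing checkpoint in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_checkpoint(path, default=None):
    """Return the unpickled checkpoint, or default if no checkpoint exists yet."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        return default


# Demo with a stand-in object in a throwaway directory.
with tempfile.TemporaryDirectory() as demo_dir:
    demo_path = os.path.join(demo_dir, "study_checkpoint.pkl")
    save_checkpoint_atomic({"completed_trials": 50}, demo_path)
    print(load_checkpoint(demo_path))
```

Files on the Colab runtime's local disk do not survive once the runtime is recycled, so pointing `path` at a mounted Google Drive folder is one option for keeping checkpoints across sessions.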

💡 Recommendations

Best Option: Colab Pro + GPU

✅ Fastest completion (~15 hours)
✅ 24-hour session limit (enough time)
✅ Background execution
✅ Most reliable

Budget Option: Colab Free + GPU

✅ Free
✅ Faster than local (~18 hours)
⚠️ May time out (12-hour limit)
⚠️ Need to implement checkpointing

Local Option: Keep Current Setup

✅ No cost
✅ No timeouts
✅ Full control
⚠️ Slower (~24 hours)

📝 Important Notes

GPU Acceleration: Requires code modifications to enable GPU support in XGBoost, CatBoost, and LightGBM
Session Limits: Free tier has 12-hour limits - may need to restart
Resource Availability: Colab resources vary - actual times may differ
Checkpointing: Essential for long runs on free tier
Data Upload: Need to upload dataset to Colab (or use Google Drive)

🔧 Quick Colab Setup Script

# Run this in a Colab cell
!pip install xgboost catboost lightgbm optuna pandas numpy scikit-learn joblib

# Make only the first GPU visible to CUDA. This does not enable a GPU;
# select one first under Runtime → Change runtime type.
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

# Upload your data file
from google.colab import files
uploaded = files.upload()

# Then run your improve_models.py script
# (with GPU modifications)

Last Updated: November 9, 2025