Linear Regression Model for EU Hospital Wait Times
Model Description
This is a Linear Regression model, implemented from scratch in NumPy, that predicts European Union hospital wait times. It was trained with Mini-Batch Gradient Descent.
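To make the training procedure concrete, the following is a minimal sketch of linear regression fitted with mini-batch gradient descent in NumPy. It is not the code used to train this model: the function name `fit_linear_regression`, the hyperparameters (`lr`, `epochs`, `batch_size`), and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np

def fit_linear_regression(X, y, lr=0.05, epochs=200, batch_size=32, seed=0):
    """Fit y ~ X @ w + b by mini-batch gradient descent on the MSE loss.

    Illustrative sketch only; hyperparameters are assumed, not the model's.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        order = rng.permutation(n_samples)  # reshuffle the rows every epoch
        for start in range(0, n_samples, batch_size):
            idx = order[start:start + batch_size]
            X_batch, y_batch = X[idx], y[idx]
            error = X_batch @ w + b - y_batch
            # Gradients of the mean squared error over this batch
            grad_w = 2.0 * X_batch.T @ error / len(idx)
            grad_b = 2.0 * error.mean()
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b

# Synthetic, already-standardized data (not the hospital dataset)
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.7 + rng.normal(scale=0.01, size=500)

w, b = fit_linear_regression(X, y)
```

Gradient descent of this kind is sensitive to feature scale, so raw count features would typically be standardized before fitting; whether the original training did so is not stated here.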
Purpose
The primary purpose of this model is to predict the total number of patients waiting ('Total_sum') from the counts in individual wait-time bands (e.g., 'FourAndUnder_sum', 'FiveToTwelve_sum', 'OverTwelve_sum'). The target variable, 'Total_sum', was log-transformed (np.log1p) during training to reduce potential skewness and improve model performance.
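As a small illustration of the target transform, the sketch below applies np.log1p to a handful of made-up totals (the values in `total_sum` are hypothetical, not taken from the dataset). It shows how the transform compresses a wide range of counts and maps zero to zero.

```python
import numpy as np

# Hypothetical totals spanning several orders of magnitude
total_sum = np.array([0.0, 120.0, 4500.0, 98000.0])

# log1p(x) = log(1 + x): defined at zero and monotonically increasing
y_log = np.log1p(total_sum)

# Spread of the raw values versus the transformed values
raw_range = total_sum.max() - total_sum.min()
log_range = y_log.max() - y_log.min()
```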
Training Data
The model was trained on the EUHospitalWaitTime.csv dataset. The features used for training include:
- Year
- MthAndYrCode
- FourAndUnder_sum (Sum of patients waiting 4 hours and under)
- FiveToTwelve_sum (Sum of patients waiting 5 to 12 hours)
- OverTwelve_sum (Sum of patients waiting over 12 hours)
The target variable is Total_sum (Total sum of patients waiting).
Performance Metrics
After training, the model's performance was evaluated on a validation set.
- Mean Squared Error (MSE): 0.3396
- R-squared (R²): 0.9513
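For reference, the two metrics can be computed as in the sketch below. The helper names `mse` and `r_squared` and the sample arrays are illustrative assumptions; this card does not state whether the reported figures were computed on the log-transformed scale or on the original scale, so the example makes no claim about that either.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean of the squared residuals."""
    return float(np.mean((y_true - y_pred) ** 2))

def r_squared(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares)."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Made-up values, for illustration only
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.8])

example_mse = mse(y_true, y_pred)
example_r2 = r_squared(y_true, y_pred)
```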
Limitations
- This is a simple linear model, which may not capture complex non-linear relationships in the data.
- The log-transformation of the target variable means predictions need to be inverse-transformed (np.expm1) to get the actual scale of wait times.
- The dataset used might have specific characteristics or biases that could affect generalization to other datasets.
- The model is trained on aggregate sum data, not individual patient data.
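The inverse-transform step mentioned in the limitations can be sketched as follows. The function `predict_total`, the weights `w`, the bias `b`, and the input rows are hypothetical placeholders, not the trained parameters of this model.

```python
import numpy as np

def predict_total(X, w, b):
    """Predict in log space, then map back to the original scale."""
    log_pred = X @ w + b       # the model outputs log1p(Total_sum)
    return np.expm1(log_pred)  # expm1 is the exact inverse of log1p

# Hypothetical parameters and inputs, for illustration only
w = np.array([0.5, 0.25])
b = 1.0
X = np.array([[2.0, 4.0],
              [0.0, 0.0]])

totals = predict_total(X, w, b)
```

Skipping the np.expm1 step would leave predictions on the log scale, where they are not directly comparable to Total_sum.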