Upload README.md with huggingface_hub
README.md
ADDED
@@ -0,0 +1,35 @@
---
library_name: 'huggingface_hub'
tags:
- regression
- linear-regression
- custom-model
---
# Linear Regression Model for EU Hospital Wait Times

## Model Description
This is a linear regression model that predicts European Union hospital wait times. It was implemented from scratch in NumPy and trained with mini-batch gradient descent.
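The training approach described above can be sketched as follows. This is a minimal illustration of mini-batch gradient descent on mean squared error, not the code used to train this model: the function name `fit_linear_minibatch`, the hyperparameters (`lr`, `epochs`, `batch_size`), and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def fit_linear_minibatch(X, y, lr=0.05, epochs=200, batch_size=16, seed=0):
    """Fit y ~ X @ w + b by mini-batch gradient descent on mean squared error."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        order = rng.permutation(n_samples)      # reshuffle the rows every epoch
        for start in range(0, n_samples, batch_size):
            idx = order[start:start + batch_size]
            X_b, y_b = X[idx], y[idx]
            error = X_b @ w + b - y_b           # residuals on this batch
            # Gradients of the batch MSE with respect to w and b
            grad_w = 2.0 * X_b.T @ error / len(idx)
            grad_b = 2.0 * error.mean()
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b

# Synthetic, already-standardized data with a known linear relationship
# (hypothetical values, used only to show that the loop recovers the weights)
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(200, 3))
true_w, true_b = np.array([1.5, -2.0, 0.5]), 0.7
y_demo = X_demo @ true_w + true_b

w_fit, b_fit = fit_linear_minibatch(X_demo, y_demo)
```

On real data the features would typically need scaling before this loop is stable at a fixed learning rate; whether the original training did so is not stated here.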

## Purpose
The model predicts the total hospital wait-time figure (`Total_sum`) from the per-category wait-time sums (e.g., `FourAndUnder_sum`, `FiveToTwelve_sum`, `OverTwelve_sum`). The target variable was log-transformed with `np.log1p` during training to handle potential skewness and improve model performance.
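As a small illustration of the target transform, the sketch below applies `np.log1p` and then undoes it with `np.expm1`. The `total_sum` values and the `pred_log` prediction are hypothetical numbers chosen for the example, not taken from the dataset.

```python
import numpy as np

# Hypothetical Total_sum values on the original scale (illustrative only)
total_sum = np.array([0.0, 120.0, 3_500.0, 48_000.0])

y_log = np.log1p(total_sum)     # training target: log(1 + y), defined at y = 0
recovered = np.expm1(y_log)     # inverse transform: exp(y_log) - 1

# A prediction made on the log scale is mapped back the same way
pred_log = 8.0                  # hypothetical model output on the log scale
pred_original = np.expm1(pred_log)
```

Using `log1p` rather than a plain logarithm keeps zero counts valid, since `log1p(0)` is `0`.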

## Training Data
The model was trained on the `EUHospitalWaitTime.csv` dataset. The features used for training include:
- `Year`
- `MthAndYrCode`
- `FourAndUnder_sum` (Sum of patients waiting 4 hours and under)
- `FiveToTwelve_sum` (Sum of patients waiting 5 to 12 hours)
- `OverTwelve_sum` (Sum of patients waiting over 12 hours)

The target variable is `Total_sum` (Total sum of patients waiting).
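The feature and target columns listed above could be assembled into NumPy arrays roughly as follows. This is a sketch under several assumptions: the rows in `sample_csv` are invented stand-ins for `EUHospitalWaitTime.csv` (including the numeric encoding of `MthAndYrCode`), and the standardization step is an illustrative choice rather than a documented part of the original pipeline.

```python
import csv
import io
import numpy as np

FEATURES = ["Year", "MthAndYrCode", "FourAndUnder_sum",
            "FiveToTwelve_sum", "OverTwelve_sum"]
TARGET = "Total_sum"

# Invented rows standing in for EUHospitalWaitTime.csv (values are not real data)
sample_csv = """Year,MthAndYrCode,FourAndUnder_sum,FiveToTwelve_sum,OverTwelve_sum,Total_sum
2015,1,900,80,20,1000
2015,2,950,120,30,1100
2016,13,1000,200,100,1300
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
X = np.array([[float(r[name]) for name in FEATURES] for r in rows])
y = np.log1p(np.array([float(r[TARGET]) for r in rows]))  # log1p-transformed target

# Standardize each feature column so gradient descent sees comparable scales
# (assumption: feature scaling is not described for the original training run)
mean, std = X.mean(axis=0), X.std(axis=0)
X_scaled = (X - mean) / np.where(std == 0, 1.0, std)
```

To read the real file instead, `io.StringIO(sample_csv)` would be replaced by an open file handle on the CSV.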

## Performance Metrics
After training, the model's performance was evaluated on a validation set.
- **Mean Squared Error (MSE)**: 0.3396
- **R-squared (R²)**: 0.9513
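For reference, both metrics can be computed in plain NumPy as sketched below. The helper names `mse` and `r_squared` and the toy arrays are assumptions for illustration; the reported figures above came from the model's own validation set, and the scale they were computed on (log-transformed or original) is not restated here.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error: average of squared residuals."""
    return float(np.mean((y_true - y_pred) ** 2))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Toy validation targets and predictions (hypothetical values)
y_val = np.array([1.0, 2.0, 3.0, 4.0])
y_hat = np.array([1.5, 2.0, 2.5, 4.0])

val_mse = mse(y_val, y_hat)        # 0.125
val_r2 = r_squared(y_val, y_hat)   # 0.9
```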

## Limitations
- This is a simple linear model, which may not capture complex non-linear relationships in the data.
- Because the target variable was log-transformed, predictions must be inverse-transformed with `np.expm1` to recover the original scale.
- The dataset used might have specific characteristics or biases that could affect generalization to other datasets.
- The model is trained on aggregate sum data, not individual patient data.