uleeberber committed on
Commit
cfaf9dd
·
verified ·
1 Parent(s): f2f2854

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -139,7 +139,7 @@ In conclusion Duration and Activity Type are the dominant predictors, validating
139
 
140
  # **Part 4: Feature Engineering**
141
 
142
- While the baseline Linear Regression model performed well ($R^2 \approx 0.967$), the residual analysis revealed curved patterns, suggesting that the relationship between predictors and calorie burn is non-linear. To address this and capture complex workout behaviors better, I engineered 6 new features before training advanced models.
143
 
144
  **1. Heart_Range**
145
 
@@ -392,6 +392,6 @@ The trained Random Forest pipeline, including the scaler, clustering model, thre
392
 
393
  # **Conclusion**
394
 
395
- The analysis identified the Random Forest algorithm as the superior model, achieving near-perfect performance for both regression (R^2 = 0.9999) and classification (99.88% Accuracy).While these metrics demonstrate exceptional predictive power, the remarkably high accuracy, combined with the unexpectedly low feature importance of Heart Rate and Weight, suggests the underlying dataset is likely synthetic. In real-world physiology, heart rate and body mass are critical drivers of energy expenditure; their lower correlation here indicates the data was likely generated using a deterministic formula heavily weighted toward Duration and Activity Type.
396
 
397
 
 
139
 
140
  # **Part 4: Feature Engineering**
141
 
142
+ While the baseline Linear Regression model performed well (R^2 = 0.967), the residual analysis revealed curved patterns, suggesting that the relationship between predictors and calorie burn is non-linear. To address this and capture complex workout behaviors better, I engineered 6 new features before training advanced models.
143
 
144
  **1. Heart_Range**
145
 
 
392
 
393
  # **Conclusion**
394
 
395
+ The analysis identified the Random Forest algorithm as the superior model, achieving near-perfect performance for both regression (R^2 = 0.9999) and classification (99.88% Accuracy). While these metrics demonstrate exceptional predictive power, the remarkably high accuracy, combined with the unexpectedly low feature importance of Heart Rate, Weight, BMI and Height, suggests that this dataset might not be correct. In real-world physiology, heart rate and body mass are critical drivers of energy expenditure; their lower correlation here indicates the data was likely generated and heavily weighted toward Duration and Activity Type.
396
 
397