MayaKitzis committed on
Commit 0795eff · verified · 1 Parent(s): dec7bc9

Update README.md

Files changed (1)
  1. README.md +4 -5
README.md CHANGED
@@ -22,8 +22,7 @@ EDA was performed in the Colab notebook to understand distributions and relation
  - `Smoking_Prevalence` is roughly centered around ~27–28 with moderate spread.
  - Many features such as `Peer_Influence`, `Media_Influence`, `Mental_Health` are on a 1–10 scale.
  - No strong class imbalance issues were found once the target was binarized (see classification section).
-
- > (Place your distribution plots here, e.g.)
+
  > ![1 - Smoking Prevalence Distribution](1_Smoking_Prevalence_Distribution.png)

  ### Relationships:
@@ -61,11 +60,11 @@ Key exploratory plots included:

  The goal of the regression model is to predict Smoking_Prevalence based on demographic, behavioral, and social factors

- Actual vs Predicted plot
+ > ![Actual_vs_Predicted](Actual_vs_Predicted.png)

  Conclusion of the feature importance:

- Feature importance plot
+ > ![Linear_Regression_Coefficients](Linear_Regression_Coefficients.png)

  The baseline linear regression model performed poorly, indicating that smoking behavior is not well captured through a simple linear relationship with the selected features. This highlights the need for more advanced modeling techniques and potential feature engineering.

@@ -89,7 +88,7 @@ To improve model performance, several engineered features were created:

  These engineered features were re-used both for regression and classification tasks.

- Clusters plot
+ > ![clusters](clusters.png)

  ---
