Update README.md
README.md
CHANGED
@@ -22,8 +22,7 @@ EDA was performed in the Colab notebook to understand distributions and relation
 - `Smoking_Prevalence` is roughly centered around ~27–28 with moderate spread.
 - Many features such as `Peer_Influence`, `Media_Influence`, `Mental_Health` are on a 1–10 scale.
 - No strong class imbalance issues were found once the target was binarized (see classification section).
-
-> (Place your distribution plots here, e.g.)
+
 > 
 
 ### Relationships:
@@ -61,11 +60,11 @@ Key exploratory plots included:
 
 The goal of the regression model is to predict Smoking_Prevalence based on demographic, behavioral, and social factors
 
-
+> 
 
 Conclusion of the feature importance:
 
-
+> 
 
 The baseline linear regression model performed poorly, indicating that smoking behavior is not well captured through a simple linear relationship with the selected features. This highlights the need for more advanced modeling techniques and potential feature engineering.
 
@@ -89,7 +88,7 @@ To improve model performance, several engineered features were created:
 
 These engineered features were re-used both for regression and classification tasks.
 
-
+> 
 
 ---
 