---
license: mit
library_name: scikit-learn
tags:
- regression
- linear-regression
- obesity
datasets:
- ObesityDataSet_raw_and_data_sinthetic.csv
model-index:
- name: Obesity Weight Prediction Model
  results:
  - task:
      type: regression
      name: Weight prediction (kg)
    dataset:
      name: ObesityDataSet_raw_and_data_sinthetic.csv
      type: tabular
    metrics:
    - type: mean_squared_error
      value: 511.55
    - type: r2
      value: 0.2777
---

# Obesity Weight Prediction Model: Linear Regression

## Overview

This model predicts a person's **weight (kg)** from their **height (m)** and **age (years)** using scikit-learn's `LinearRegression`.

---

## Training

| Detail | Value |
|--------|-------|
| Algorithm | `LinearRegression()` |
| Features | Height (m), Age (years) |
| Target | Weight (kg) |
| Train/Test Split | 75% / 25% |
| Random State | 42 |
| Dataset | ObesityDataSet_raw_and_data_sinthetic.csv |
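
The settings in the table can be sketched as a short training routine. This is a minimal illustration, not necessarily the exact script used for this model: the function name `train_weight_model` is hypothetical, and the column names `Height`, `Age`, and `Weight` are assumed to match the CSV header.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split


def train_weight_model(df: pd.DataFrame):
    """Fit Weight ~ Height + Age using the split settings from the table."""
    # Assumption: the DataFrame has "Height", "Age", and "Weight" columns.
    X = df[["Height", "Age"]]
    y = df["Weight"]

    # 75% / 25% train/test split with a fixed seed for reproducibility.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=42
    )

    model = LinearRegression()
    model.fit(X_train, y_train)

    # Return the held-out data as well so the caller can evaluate the model.
    return model, X_test, y_test


# Example usage (assumes the CSV is in the working directory):
# df = pd.read_csv("ObesityDataSet_raw_and_data_sinthetic.csv")
# model, X_test, y_test = train_weight_model(df)
# print(model.intercept_, model.coef_)
```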

---

## Performance

| Metric | Score |
|--------|-------|
| MSE (Mean Squared Error) | **511.55** |
| R² Score | **0.2777** |

An MSE of 511.55 corresponds to an RMSE of about 22.6 kg, and an R² of 0.2777 means the model explains roughly 28% of the variance in weight. Height and age alone therefore **do not fully explain** weight; important factors such as diet, genetics, and exercise are missing.
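
As a rough sketch of how these two metrics can be computed with scikit-learn, and how the reported MSE converts to kilograms, consider the following. The helper name `evaluate` is hypothetical and not part of the original training code.

```python
import math

from sklearn.metrics import mean_squared_error, r2_score


def evaluate(y_true, y_pred):
    """Return MSE, RMSE, and R² for a set of predictions."""
    mse = mean_squared_error(y_true, y_pred)
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),  # same units as the target (kg)
        "r2": r2_score(y_true, y_pred),
    }


# Converting the MSE reported in the table into the target's units.
reported_rmse = math.sqrt(511.55)
print(f"RMSE ≈ {reported_rmse:.1f} kg")  # prints "RMSE ≈ 22.6 kg"

# Example usage with a fitted model and held-out data (names assumed):
# scores = evaluate(y_test, model.predict(X_test))
```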

---

## Visualization

Below is a scatter plot of predicted vs. true weights:



The wide spread of points around the regression line reflects high prediction uncertainty for heavier individuals.
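
A plot of this kind could be produced along the following lines. How the original `reg_plot.png` was generated is not stated, so this is only an assumed approach using matplotlib: the function name `plot_predicted_vs_true` is hypothetical, and the sketch draws a dashed `y = x` reference line (perfect prediction) rather than a fitted regression line.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the script runs headless
import matplotlib.pyplot as plt


def plot_predicted_vs_true(y_true, y_pred, path="reg_plot.png"):
    """Scatter true vs. predicted weights and save the figure to `path`."""
    fig, ax = plt.subplots(figsize=(6, 6))
    ax.scatter(y_true, y_pred, alpha=0.5)

    # Reference line y = x: points on it are predicted exactly.
    lo = min(min(y_true), min(y_pred))
    hi = max(max(y_true), max(y_pred))
    ax.plot([lo, hi], [lo, hi], color="red", linestyle="--",
            label="perfect prediction")

    ax.set_xlabel("True weight (kg)")
    ax.set_ylabel("Predicted weight (kg)")
    ax.set_title("Predicted vs. true weight")
    ax.legend()
    fig.savefig(path, dpi=150)
    return fig, ax


# Example usage with a fitted model and held-out data (names assumed):
# plot_predicted_vs_true(list(y_test), list(model.predict(X_test)))
```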

---

## Limitations

- Only two features are used, which limits explanatory power
- The dataset is partly synthetic, so it may not reflect real population variation
- Performance is not suitable for real-world medical decisions

This model is intended for **educational use only**.

---

## Strengths

- Easy to interpret
- Fast and simple
- Good educational model

## Weaknesses

- Low accuracy
- Missing key health variables
- Not production-ready

## Citation

- "Estimation of Obesity Levels Based On Eating Habits and Physical Condition." UCI Machine Learning Repository, 2019, https://doi.org/10.24432/C5H31Z.