---
title: GreenPrint AI
emoji: 🚀
colorFrom: red
colorTo: red
sdk: docker
app_port: 8501
tags:
- streamlit
pinned: false
short_description: A futuristic take on carbon footprint + AI.
---

# 🌿 GreenPrint AI

> A futuristic take on carbon footprint + AI.

## Overview

**GreenPrint AI** is a user-friendly web app that predicts your **carbon footprint** based on daily activities and suggests actionable steps to reduce it. From energy consumption to travel and food habits, the app personalizes insights using machine learning.

---

## Aim

To develop a carbon footprint estimation system that:
- Takes user activities as input (e.g., electricity use, transport habits, meat consumption).
- Predicts the carbon footprint in kilograms of CO₂ equivalent.
- Recommends practical, low-carbon alternatives.

---

## Step-by-Step Project Workflow

### Step 1: Dataset Creation

- A synthetic dataset named `synthetic_carbon_footprint.csv` was generated using realistic formulas for CO₂ emissions from:
  - **Electricity consumption**
  - **Travel (personal & public)**
  - **Dietary choices (meat)**
  - **Plastic usage**
- Each contribution was computed by multiplying the activity amount by an emission coefficient (e.g., kg CO₂ per kWh, per km, per gram of meat).
- The notebook `Creating_Dataset.ipynb` contains the exact formula-based generation logic.
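
As a rough illustration of the coefficient-based approach described above, the sketch below generates rows by sampling activity amounts and summing `coefficient × amount`. The column names, coefficient values, sampling ranges, and the purely linear formula are all assumptions made for this example; the values and formulas actually used are in `Creating_Dataset.ipynb`.

```python
import csv
import random

# Hypothetical emission coefficients (kg CO2e per unit), for illustration only.
# They are NOT the values used in Creating_Dataset.ipynb.
COEFFICIENTS = {
    "electricity_kwh": 0.4,      # per kWh of electricity
    "personal_travel_km": 0.2,   # per km in a private vehicle
    "public_travel_km": 0.05,    # per km on public transport
    "meat_g": 0.02,              # per gram of meat
    "plastic_kg": 3.0,           # per kg of plastic
}

# Hypothetical sampling ranges (low, high) for each activity.
RANGES = {
    "electricity_kwh": (0.0, 30.0),
    "personal_travel_km": (0.0, 100.0),
    "public_travel_km": (0.0, 100.0),
    "meat_g": (0.0, 500.0),
    "plastic_kg": (0.0, 2.0),
}


def footprint(row):
    """Sum coefficient * amount over all activities in the row."""
    return sum(COEFFICIENTS[name] * row[name] for name in COEFFICIENTS)


def generate_rows(n_rows, seed=42):
    """Return a list of dicts, one per synthetic user, with a computed target."""
    rng = random.Random(seed)  # seeded so the dataset is reproducible
    rows = []
    for _ in range(n_rows):
        row = {name: round(rng.uniform(lo, hi), 2) for name, (lo, hi) in RANGES.items()}
        row["carbon_footprint_kg"] = round(footprint(row), 2)
        rows.append(row)
    return rows


def write_csv(rows, path):
    """Write the generated rows to a CSV file with a header line."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
```

Calling `write_csv(generate_rows(1000), "synthetic_carbon_footprint.csv")` would produce a file in the same spirit as the project's dataset, though not with the same numbers.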

---

### Step 2: Model Training

- Model used: `RandomForestRegressor` from scikit-learn
- Training is done in `Running_Model.ipynb`
- Preprocessing involved:
  - Feature scaling (`StandardScaler`)
  - Ensuring input columns align with `model_columns.pkl`
  - Splitting dataset into training/testing sets
- Evaluation metrics:
  - **R² Score** for goodness-of-fit
  - **Root Mean Squared Error (RMSE)** for prediction accuracy
- The trained model was saved as `carbon_model.pkl`
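
For reference, the two evaluation metrics listed above can be written out in a few lines of plain Python. This is only a sketch of the definitions; the function names are hypothetical, and in practice scikit-learn's `r2_score` and `mean_squared_error` compute the same quantities.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the mean squared residual."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """R² = 1 - SS_res / SS_tot, where SS_tot is measured around the mean of y_true."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

RMSE is expressed in the same unit as the target (kg CO₂e here), so it reads as a typical prediction error. An R² of 1 means perfect predictions, and 0 means the model does no better than always predicting the mean.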

### Why Random Forest Regressor?

- It handles non-linear relationships and interactions between features better than linear models.
- It averages the predictions of many decision trees, which reduces variance and makes it less prone to overfitting than a single tree.
- It outperformed linear models in testing, with a higher R² and a lower RMSE.
- Offers feature importance insights, useful for explainability in a sustainability context.
  
---

### Step 3: Feature Metadata Handling

- A `model_columns.pkl` file was created to store the expected feature column order.
- This prevents column mismatch during inference in the Streamlit frontend.
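
A minimal sketch of this idea, using only the standard library: the column order is pickled once at training time, then reloaded at inference time to arrange user input in the same order. The helper names, the example column names, and the choice to fill missing features with `0.0` are assumptions for illustration, not the app's actual code.

```python
import pickle


def save_columns(columns, path):
    """Persist the feature column order used at training time."""
    with open(path, "wb") as f:
        pickle.dump(list(columns), f)


def load_columns(path):
    """Load the stored feature column order."""
    with open(path, "rb") as f:
        return pickle.load(f)


def align_features(user_input, columns, fill_value=0.0):
    """Return the input values as a list ordered like `columns`.

    Keys not in `columns` are ignored; missing keys get `fill_value`
    (an assumption of this sketch, raising an error is another option).
    """
    return [float(user_input.get(name, fill_value)) for name in columns]
```

For example, with stored columns `["electricity_kwh", "travel_km", "meat_g"]`, the input `{"meat_g": 150, "electricity_kwh": 12}` is aligned to `[12.0, 0.0, 150.0]` regardless of the order in which the form fields were collected.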

---