---
title: E-Commerce ELT
emoji: πŸƒ
colorFrom: indigo
colorTo: purple
sdk: docker
pinned: true
license: mit
short_description: Extract, Load, Transform pipeline for e-commerce data
---

# 📦 E-Commerce ELT Pipeline

## Table of Contents

1. [Project Description](#1-project-description)
2. [Methodology & Key Features](#2-methodology--key-features)
3. [Technology Stack](#3-technology-stack)
4. [Dataset](#4-dataset)

## 1. Project Description

This project showcases an Extract, Load, and Transform (ELT) pipeline applied to a real-world e-commerce dataset. The goal is to derive business insights from transactional data and present them in an interactive dashboard. The pipeline uses the **Brazilian E-Commerce Public Dataset by Olist**, which contains over **100,000 orders** placed between 2016 and 2018, and combines it with data from the Public Holiday API to analyze sales performance during national holidays.

The dashboard provides a detailed view of the e-commerce experience, including:

- Order status, prices, and payment types
- Freight and delivery performance
- Customer locations and product categories
- Customer reviews and satisfaction

> [!IMPORTANT]
>
> - Check out the deployed app here: 👉 [E-Commerce ELT](https://huggingface.co/spaces/iBrokeTheCode/E-Commerce_ELT) 👈
> - Check out the Jupyter Notebook for a detailed walkthrough of the project here: 👉 [Jupyter Notebook](https://huggingface.co/spaces/iBrokeTheCode/E-Commerce_ELT/blob/main/tutorial_app.ipynb) 👈

![Dashboard](./public/dashboard-demo.png)

## 2. Methodology & Key Features

The ELT pipeline extracts raw data, loads it into a structured format, and then transforms it to generate key metrics and visualizations. The results are presented in an interactive dashboard built with Marimo, a reactive notebook for Python.

### Key Features:

- **Data Integration**: Combines e-commerce order data with public holiday information to analyze temporal sales patterns.
- **Data Transformation**: Cleans and prepares raw data for analysis, enabling the calculation of key performance indicators (KPIs).
- **Interactive Dashboard**: Provides a dynamic and user-friendly interface for exploring business insights.
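
The extract, load, and transform stages described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the sample CSV, the `raw_orders` table, and its column names are hypothetical, and it uses the standard-library `sqlite3` module in place of the SQLAlchemy and Pandas stack listed below so that it runs without extra dependencies.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for a raw orders file. The column names are
# assumptions made for this illustration, not the project's real schema.
RAW_ORDERS_CSV = """order_id,purchase_date,status,price
o1,2017-09-07,delivered,120.50
o2,2017-09-07,delivered,79.50
o3,2017-09-08,canceled,35.00
o4,2017-09-08,delivered,50.00
"""


def extract(source: str) -> list[dict]:
    """Extract: read the raw rows without cleaning or reshaping them."""
    return list(csv.DictReader(io.StringIO(source)))


def load(conn: sqlite3.Connection, rows: list[dict]) -> None:
    """Load: store the raw rows unchanged in a staging table."""
    conn.execute(
        "CREATE TABLE raw_orders "
        "(order_id TEXT, purchase_date TEXT, status TEXT, price TEXT)"
    )
    conn.executemany(
        "INSERT INTO raw_orders VALUES (:order_id, :purchase_date, :status, :price)",
        rows,
    )


def transform(conn: sqlite3.Connection) -> list[tuple[str, float]]:
    """Transform: compute daily revenue from delivered orders inside the database."""
    query = """
        SELECT purchase_date, ROUND(SUM(CAST(price AS REAL)), 2) AS revenue
        FROM raw_orders
        WHERE status = 'delivered'
        GROUP BY purchase_date
        ORDER BY purchase_date
    """
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
load(conn, extract(RAW_ORDERS_CSV))
daily_revenue = transform(conn)
print(daily_revenue)
```

The point of the ELT ordering is that the raw data lands in the database first and the cleaning and aggregation happen afterwards, in SQL, so the same staging table can feed several different metrics.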

## 3. Technology Stack

This project was built using the following technologies and libraries:

**Dashboard & Hosting:**

- [Marimo](https://github.com/marimo-team/marimo): A reactive Python notebook that can be served as an interactive app or dashboard.
- [Hugging Face Spaces](https://huggingface.co/docs/hub/spaces-config-reference): Used for hosting and sharing the interactive dashboard.

**Data Analysis & Visualization:**

- [Pandas](https://pandas.pydata.org/): For data manipulation and analysis.
- [Plotly](https://plotly.com/python/): For creating interactive data visualizations.
- [Matplotlib](https://matplotlib.org/): For creating static visualizations.
- [Seaborn](https://seaborn.pydata.org/): For creating statistical graphics.

**Data Handling & Utilities:**

- [SQLAlchemy](https://www.sqlalchemy.org/): For interacting with databases.
- [Requests](https://requests.readthedocs.io/en/latest/): For making HTTP requests to external APIs.

**Development Tools:**

- [Ruff](https://github.com/charliermarsh/ruff): A fast Python linter and code formatter.
- [uv](https://github.com/astral-sh/uv): A fast Python package installer and resolver.

## 4. Dataset

This project uses the **Brazilian E-Commerce Public Dataset by Olist**, a public Kaggle dataset with details on over 100,000 orders placed between 2016 and 2018. It covers order status, prices, payments, freight, customer locations, product categories, and customer reviews.

- **Source**: [Kaggle Dataset](https://www.kaggle.com/datasets/olistbr/brazilian-ecommerce)
- **Additional Data**: The project also integrates data from the [Public Holiday API](https://date.nager.at/Api).
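
As a rough sketch of how the holiday data might be combined with order dates, the example below assumes the service exposes an endpoint of the form `/api/v3/PublicHolidays/{year}/{countryCode}` that returns a JSON list whose items carry `date` and `name` fields. The helper names and the sample payload are hypothetical, and the sketch uses the standard-library `urllib` rather than Requests to stay dependency-free.

```python
import json
import urllib.request
from datetime import date

# Assumed endpoint shape for the Public Holiday API linked above.
HOLIDAY_URL = "https://date.nager.at/api/v3/PublicHolidays/{year}/{country}"


def fetch_holidays(year: int, country: str = "BR") -> list[dict]:
    """Download the holiday list for one year (performs a network request)."""
    url = HOLIDAY_URL.format(year=year, country=country)
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def holiday_dates(payload: list[dict]) -> dict[date, str]:
    """Map each holiday date to its name for fast lookups."""
    return {date.fromisoformat(item["date"]): item["name"] for item in payload}


def tag_orders(order_dates: list[str], holidays: dict[date, str]) -> list[tuple[str, bool]]:
    """Pair each order date with a flag telling whether it falls on a holiday."""
    return [(day, date.fromisoformat(day) in holidays) for day in order_dates]


# Hand-written sample shaped like the assumed response, so the example runs
# offline. It is illustrative data, not output captured from the API.
SAMPLE_PAYLOAD = [
    {"date": "2017-09-07", "name": "Independence Day"},
    {"date": "2017-12-25", "name": "Christmas Day"},
]

holidays = holiday_dates(SAMPLE_PAYLOAD)
tagged = tag_orders(["2017-09-07", "2017-09-08"], holidays)
print(tagged)
```

In a real run, `fetch_holidays(2017)` would replace `SAMPLE_PAYLOAD`, with one call per year covered by the orders.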

Here is the entity-relationship diagram (ERD) for the database schema:

![ERD](./public/erd-schema.png)