title: E-Commerce ELT
emoji: π
colorFrom: indigo
colorTo: purple
sdk: docker
pinned: true
license: mit
short_description: Extract, Load, Transform Pipeline applied to an E-Commerce
E-Commerce ELT Pipeline
1. Project Description
This project showcases an Extract, Load, and Transform (ELT) pipeline applied to a real-world e-commerce dataset. The primary goal is to extract valuable business insights from transactional data and present them through an interactive dashboard. The pipeline integrates data from the Brazilian E-Commerce Public Dataset by Olist, which contains over 100,000 orders from 2016 to 2018, and also incorporates data from the Public Holiday API to analyze sales performance during national holidays.
The dashboard provides a detailed view of the e-commerce experience, including:
- Order status, prices, and payment types
- Freight and delivery performance
- Customer locations and product categories
- Customer reviews and satisfaction
- Check out the deployed app here: E-Commerce ELT
- Check out the Jupyter Notebook for a detailed walkthrough of the project here: Jupyter Notebook
2. Methodology & Key Features
The ELT pipeline extracts the raw data, loads it into a database, and then transforms it there to generate key metrics and visualizations. The results are presented in an interactive dashboard built with Marimo, a Python library.
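The three stages can be illustrated with a minimal sketch. It is not the project's actual code: it uses only the Python standard library (`csv`, `sqlite3`) instead of Pandas and SQLAlchemy, and the `raw_orders` table, the sample rows, and the function names are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw extract standing in for one of the dataset's CSV files.
RAW_ORDERS_CSV = """order_id,order_status,price
o1,delivered,120.50
o2,delivered,79.90
o3,canceled,35.00
"""


def extract(raw_csv: str) -> list[dict]:
    """Extract: read the raw rows without cleaning them."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def load(conn: sqlite3.Connection, rows: list[dict]) -> None:
    """Load: store the rows unchanged in a staging table (all columns as text)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_orders "
        "(order_id TEXT, order_status TEXT, price TEXT)"
    )
    conn.executemany(
        "INSERT INTO raw_orders VALUES (:order_id, :order_status, :price)", rows
    )
    conn.commit()


def transform(conn: sqlite3.Connection) -> list[tuple]:
    """Transform: cast and aggregate inside the database, after loading."""
    query = """
        SELECT order_status, ROUND(SUM(CAST(price AS REAL)), 2) AS revenue
        FROM raw_orders
        GROUP BY order_status
        ORDER BY order_status
    """
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")  # in-memory database for illustration only
load(conn, extract(RAW_ORDERS_CSV))
revenue_by_status = transform(conn)
print(revenue_by_status)
```

The point of the ordering is that cleaning and aggregation happen after the data is already in the database, which is what distinguishes ELT from ETL.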
Key Features:
- Data Integration: Combines e-commerce order data with public holiday information to analyze temporal sales patterns.
- Data Transformation: Cleans and prepares raw data for analysis, enabling the calculation of key performance indicators (KPIs).
- Interactive Dashboard: Provides a dynamic and user-friendly interface for exploring business insights.
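As a rough illustration of the data integration step, the sketch below flags orders placed on public holidays with a Pandas left join. The sample rows and the `holidays` frame layout (`date`, `name`) are assumptions made for this example, not the project's actual schema.

```python
import pandas as pd

# Hypothetical sample data; values are made up for illustration.
orders = pd.DataFrame({
    "order_id": ["o1", "o2", "o3", "o4"],
    "order_purchase_timestamp": pd.to_datetime([
        "2017-09-07 10:15:00",
        "2017-09-07 18:40:00",
        "2017-09-08 09:05:00",
        "2017-12-25 21:30:00",
    ]),
    "price": [100.0, 50.0, 80.0, 20.0],
})
holidays = pd.DataFrame({
    "date": pd.to_datetime(["2017-09-07", "2017-12-25"]),
    "name": ["Independence Day", "Christmas Day"],
})

# Drop the time of day so purchase timestamps can be matched to holiday dates.
orders["purchase_date"] = orders["order_purchase_timestamp"].dt.normalize()

# A left join keeps every order; orders on regular days get NaN as holiday name.
merged = orders.merge(holidays, how="left", left_on="purchase_date", right_on="date")
merged["is_holiday"] = merged["name"].notna()

# Compare total revenue on holidays with revenue on regular days.
holiday_revenue = merged.groupby("is_holiday")["price"].sum()
print(holiday_revenue)
```

A left join is used here so that orders on regular days are kept in the result; an inner join would silently drop them and make the comparison impossible.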
3. Technology Stack
This project was built using the following technologies and libraries:
Dashboard & Hosting:
- Marimo: A Python library for building interactive dashboards.
- Hugging Face Spaces: Used for hosting and sharing the interactive dashboard.
Data Analysis & Visualization:
- Pandas: For data manipulation and analysis.
- Plotly: For creating interactive data visualizations.
- Matplotlib: For creating static visualizations.
- Seaborn: For creating statistical graphics.
Data Handling & Utilities:
- SQLAlchemy: For interacting with databases.
- Requests: For making HTTP requests to external APIs.
Development Tools:
4. Dataset
This project uses the Brazilian E-Commerce Public Dataset by Olist, published on Kaggle. It contains details on over 100,000 orders placed between 2016 and 2018 and covers a wide range of transactional information.
- Source: Kaggle Dataset
- Additional Data: The project also integrates data from the Public Holiday API.
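Fetching the holiday data with Requests might look like the sketch below. The endpoint is an assumption: it follows the URL scheme of the public Nager.Date service, and the `date` and `name` response fields are likewise assumed, so adjust both to the API the project actually calls. The helper names are hypothetical.

```python
from datetime import date

# Assumption: Nager.Date-style endpoint; replace with the API actually used.
BASE_URL = "https://date.nager.at/api/v3/PublicHolidays"


def parse_holidays(payload: list[dict]) -> dict[date, str]:
    """Map each holiday date to its name, assuming 'date' and 'name' fields."""
    return {date.fromisoformat(item["date"]): item["name"] for item in payload}


def fetch_holidays(
    year: int, country_code: str = "BR", timeout: float = 10.0
) -> dict[date, str]:
    """Download one year of public holidays and parse the JSON response."""
    import requests  # imported lazily so parse_holidays works without it

    response = requests.get(f"{BASE_URL}/{year}/{country_code}", timeout=timeout)
    response.raise_for_status()  # raise on HTTP error status codes
    return parse_holidays(response.json())


# Offline example with a hypothetical payload shaped like the assumed response.
sample_payload = [
    {"date": "2017-09-07", "name": "Independence Day"},
    {"date": "2017-12-25", "name": "Christmas Day"},
]
holidays_2017 = parse_holidays(sample_payload)
print(holidays_2017[date(2017, 9, 7)])
```

Keeping the parsing separate from the HTTP call lets the mapping logic be checked without network access; a real run would call `fetch_holidays(2017)` for each year covered by the dataset.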
Here is the entity-relationship diagram (ERD) for the database schema:

