metadata
title: Credit Card Fraud Detection with DuckDB
emoji: 💳
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.4.0
python_version: '3.10'
app_file: app.py
pinned: false
Credit Card Fraud Detection with DuckDB and Medallion Architecture
This project demonstrates an end-to-end pipeline for credit card fraud detection. It uses DuckDB to process data in a Medallion Architecture (Bronze, Silver, Gold) and trains a Random Forest model to identify fraudulent transactions.
Project Structure
- data/: Contains the raw CSV datasets (fraudTrain.csv, fraudTest.csv).
- src/: Contains the Python scripts for the data pipeline and model training.
  - bronze.py: Ingests raw data into the bronze layer.
  - silver.py: Cleans and transforms data for the silver layer.
  - gold.py: Creates aggregated features for the gold (analytics) layer.
  - train.py: Trains a RandomForestClassifier on the gold data and saves the model.
- models/: Directory where the trained model is saved.
- requirements.txt: Lists the required Python packages.
How to Run
- Install dependencies:
pip install -r requirements.txt