---
title: Credit Card Fraud Detection with DuckDB
emoji: 💳
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.4.0
python_version: '3.10'
app_file: app.py
pinned: false
---
# Credit Card Fraud Detection with DuckDB and Medallion Architecture
This project demonstrates an end-to-end pipeline for credit card fraud detection. It uses DuckDB to process data in a Medallion Architecture (Bronze, Silver, Gold) and trains a Random Forest model to identify fraudulent transactions.
## Project Structure
- `data/`: Contains the raw CSV datasets (`fraudTrain.csv`, `fraudTest.csv`).
- `src/`: Contains the Python scripts for the data pipeline and model training.
  - `bronze.py`: Ingests raw data into the bronze layer.
  - `silver.py`: Cleans and transforms data for the silver layer.
  - `gold.py`: Creates aggregated features for the gold (analytics) layer.
  - `train.py`: Trains a `RandomForestClassifier` on the gold data and saves the model.
- `models/`: Directory where the trained model is saved.
- `requirements.txt`: Lists the required Python packages.
## How to Run
1. **Install dependencies:**
```bash
pip install -r requirements.txt