JAYASREESS committed · Commit 794fb5d · verified · 1 Parent(s): b531b77

Upload 2 files

Files changed (2)
  1. README.md +32 -0
  2. app.py +6 -0
README.md ADDED
@@ -0,0 +1,32 @@
+ ---
+ title: Credit Card Fraud Detection with DuckDB
+ emoji: 💳
+ colorFrom: blue
+ colorTo: purple
+ sdk: gradio
+ sdk_version: "4.44.0"
+ python_version: "3.10"
+ app_file: app.py
+ pinned: false
+ ---
+
+ # Credit Card Fraud Detection with DuckDB and Medallion Architecture
+
+ This project demonstrates an end-to-end pipeline for credit card fraud detection. It uses DuckDB to process data in a Medallion Architecture (Bronze, Silver, Gold) and trains a Random Forest model to identify fraudulent transactions.
+
+ ## Project Structure
+
+ - `data/`: Contains the raw CSV datasets (`fraudTrain.csv`, `fraudTest.csv`).
+ - `src/`: Contains the Python scripts for the data pipeline and model training.
+   - `bronze.py`: Ingests raw data into the bronze layer.
+   - `silver.py`: Cleans and transforms data for the silver layer.
+   - `gold.py`: Creates aggregated features for the gold (analytics) layer.
+   - `train.py`: Trains a `RandomForestClassifier` on the gold data and saves the model.
+ - `models/`: Directory where the trained model is saved.
+ - `requirements.txt`: Lists the required Python packages.
+
+ ## How to Run
+
+ 1. **Install dependencies:**
+ ```bash
+ pip install -r requirements.txt
app.py ADDED
@@ -0,0 +1,6 @@
+ import gradio as gr
+
+ def status():
+     return "Credit Card Fraud Detection Pipeline is ready. Run training using src/train.py"
+
+ gr.Interface(fn=status, inputs=[], outputs="text").launch()