Delete README.md
README.md
DELETED
@@ -1,33 +0,0 @@
# Credit Card Fraud Detection with DuckDB and Medallion Architecture
This project demonstrates an end-to-end pipeline for credit card fraud detection. It uses DuckDB to process data in a Medallion Architecture (Bronze, Silver, Gold) and trains a Random Forest model to identify fraudulent transactions.
## Project Structure
- `data/`: Contains the raw CSV datasets (`fraudTrain.csv`, `fraudTest.csv`).
- `src/`: Contains the Python scripts for the data pipeline and model training.
  - `bronze.py`: Ingests raw data into the bronze layer.
  - `silver.py`: Cleans and transforms data for the silver layer.
  - `gold.py`: Creates aggregated features for the gold (analytics) layer.
  - `train.py`: Trains a `RandomForestClassifier` on the gold data and saves the model.
- `models/`: Directory where the trained model is saved.
- `requirements.txt`: Lists the required Python packages.
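
As a rough illustration of the `train.py` step in the structure above, the sketch below fits a `RandomForestClassifier` and saves it to a `models/` directory. The README does not show the real script, so everything specific here is an assumption: the synthetic feature matrix stands in for the gold table, and the feature order, hyperparameters, use of `joblib`, and the file name `fraud_rf.joblib` are hypothetical.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for gold-layer features: [amt, age, avg_merch_spend].
# The real project would load these from its gold table instead.
rng = np.random.default_rng(0)
X_legit = rng.normal(loc=[50.0, 40.0, 60.0], scale=5.0, size=(40, 3))
X_fraud = rng.normal(loc=[900.0, 40.0, 60.0], scale=5.0, size=(10, 3))
X = np.vstack([X_legit, X_fraud])
y = np.array([0] * 40 + [1] * 10)  # 1 marks a fraudulent transaction

# class_weight="balanced" is one common way to handle the rarity of fraud;
# it is an assumption here, not something the README states.
model = RandomForestClassifier(
    n_estimators=50, class_weight="balanced", random_state=0
)
model.fit(X, y)

# Save into a models/ directory (created under a temp dir for this sketch).
model_dir = Path(tempfile.mkdtemp()) / "models"
model_dir.mkdir(parents=True, exist_ok=True)
model_path = model_dir / "fraud_rf.joblib"  # hypothetical file name
joblib.dump(model, model_path)

# Reload the saved model and score one unseen transaction.
reloaded = joblib.load(model_path)
prediction = reloaded.predict([[950.0, 40.0, 60.0]])
print(model_path.name, prediction)
```

Saving with `joblib` keeps the fitted estimator reusable without retraining; the actual project may use a different serialization format.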
## How to Run
1. **Install dependencies:**
   ```bash
   pip install -r requirements.txt
   ```
2. **Run the training pipeline:**
   This command executes the entire data pipeline (Bronze, Silver, Gold) and trains the model.
   ```bash
   python src/train.py
   ```
## Medallion Architecture
- **Bronze Layer**: Raw, unfiltered data ingested directly from the source CSVs.
- **Silver Layer**: Cleaned and transformed data. Timestamps are corrected, and new features like cardholder `age` are derived.
- **Gold Layer**: Analytics-ready data with aggregated features (e.g., `avg_merch_spend`) suitable for machine learning.