
PiggyBank Transaction Category Classifier

This model classifies financial transactions into spending categories for use in the PiggyBank budgeting app. It was trained on the 7.4M transaction dataset from Kaggle:

🔗 https://www.kaggle.com/datasets/ismetsemedov/transactions/

The dataset includes transactions from 12 countries:

Country
Nigeria
Germany
Brazil
USA
Canada
Singapore
France
Australia
UK
Mexico
Japan
Russia

🧠 Model Architecture

This is a scikit-learn pipeline consisting of:

TfidfVectorizer: transforms the transaction text (merchant, merchant type, merchant category, city, country) into TF-IDF features

LogisticRegression: multi-class category classifier
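
As a rough illustration of the two-stage pipeline described above, the sketch below wires a `TfidfVectorizer` into a `LogisticRegression` and fits it on a handful of made-up rows. The step names, hyperparameters (`ngram_range`, `max_iter`), and toy data are assumptions for illustration; the settings actually used to train this model are not documented here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Toy rows invented for illustration; they are NOT taken from the Kaggle dataset.
texts = [
    "tim hortons fast_food restaurant calgary canada",
    "air canada airlines travel toronto canada",
    "walmart supermarket groceries houston usa",
    "netflix streaming entertainment london uk",
]
labels = ["Restaurant", "Travel", "Groceries", "Entertainment"]

# Assumed hyperparameters; the real training configuration may differ.
pipeline = Pipeline(
    [
        ("tfidf", TfidfVectorizer(lowercase=True, ngram_range=(1, 2))),
        ("clf", LogisticRegression(max_iter=1000)),
    ]
)
pipeline.fit(texts, labels)

# predict() returns one label per input string.
print(pipeline.predict(["coffee shop restaurant vancouver canada"]))
```

Bundling the vectorizer and classifier in one `Pipeline` means a single object can be serialized and later called on raw strings, with no separate preprocessing step at inference time.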

The model predicts categories such as:

Travel

Restaurant

Entertainment

Shopping

Groceries

And more…

🎯 Intended Use

Given a transaction description string (e.g., merchant name, location, or contextual data), the model outputs a predicted spending category and its confidence score.

Trained Text Features

Only the following text fields from the Kaggle dataset were used:

merchant

merchant_type

merchant_category

city

country

Note: The Kaggle dataset is not included in this repository. Only the trained model is hosted here.
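
One plausible way to turn the five fields listed above into a single input string is to join them with spaces. This is only a sketch: the helper name `build_text`, the field order, and the space-joined format are assumptions, since the exact preprocessing used during training is not described here.

```python
# Column names as listed above.
TEXT_FIELDS = ("merchant", "merchant_type", "merchant_category", "city", "country")


def build_text(row: dict) -> str:
    """Join the text fields of one transaction row, skipping missing or empty values."""
    parts = (str(row.get(field) or "").strip() for field in TEXT_FIELDS)
    return " ".join(part for part in parts if part)


# Hypothetical row for illustration; non-text columns such as "amount" are ignored.
row = {
    "merchant": "Tim Hortons",
    "merchant_type": "fast_food",
    "merchant_category": "Restaurant",
    "city": "Calgary",
    "country": "Canada",
    "amount": 4.50,
}
print(build_text(row))  # Tim Hortons fast_food Restaurant Calgary Canada
```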

🔍 Example Inference

Input:

TIM HORTONS #1234 CALGARY AB

Output:
  {
    "category": "Restaurant",
    "confidence": 0.78
  }
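
A minimal sketch of how such a result could be produced from the hosted pipeline, assuming `model.joblib` has been downloaded into the working directory and that the pipeline exposes the standard scikit-learn `predict_proba` method and `classes_` attribute. The helper name `classify_transaction` and the rounding to two decimals are illustrative choices, not part of this repository.

```python
from pathlib import Path

import joblib


def classify_transaction(pipeline, description: str) -> dict:
    """Return the most probable category and its probability for one description."""
    probabilities = pipeline.predict_proba([description])[0]
    # Pair each class label with its probability and keep the most likely one.
    category, confidence = max(
        zip(pipeline.classes_, probabilities), key=lambda pair: pair[1]
    )
    return {"category": str(category), "confidence": round(float(confidence), 2)}


# Assumed local path; the prediction step is skipped if the file is not present.
model_path = Path("model.joblib")
if model_path.exists():
    pipeline = joblib.load(model_path)
    print(classify_transaction(pipeline, "TIM HORTONS #1234 CALGARY AB"))
```

Note that `joblib.load` unpickles arbitrary Python objects, so it should only be used on model files from a trusted source.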

📦 Repository Files

model.joblib: scikit-learn pipeline (TF-IDF + LogisticRegression)

config.json: model metadata

requirements.txt: Python dependencies

โš ๏ธ Limitations

The model is trained on synthetic or anonymized Kaggle data and may not perfectly reflect real banking transactions.

Accuracy may vary across countries and merchant formats.

The model should not be used for regulatory, auditing, or high-stakes financial decision-making without additional evaluation.
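
Since accuracy may differ by country, one simple form of additional evaluation is to break accuracy down per group on a labeled hold-out set. The sketch below shows the idea in plain Python; the helper name `accuracy_by_group` and the sample labels are hypothetical and are not results from this model.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: fraction of correct predictions} for three parallel sequences."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, prediction, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == prediction)
    return {group: correct[group] / total[group] for group in total}


# Hypothetical labels and predictions, for illustration only.
y_true = ["Restaurant", "Travel", "Groceries", "Shopping"]
y_pred = ["Restaurant", "Travel", "Groceries", "Travel"]
countries = ["Canada", "Canada", "Japan", "Japan"]

print(accuracy_by_group(y_true, y_pred, countries))  # {'Canada': 1.0, 'Japan': 0.5}
```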

📄 License & Data Usage

Model License: Choose one (e.g., MIT, Apache 2.0)

Training Data: Kaggle dataset ismetsemedov/transactions

Please refer to the dataset's Kaggle page for terms of use.

This model is intended for educational and experimental purposes.