PiggyBank Transaction Category Classifier

This model classifies financial transactions into spending categories for use in the PiggyBank budgeting app. It was trained on the 7.4M transaction dataset from Kaggle:

🔗 https://www.kaggle.com/datasets/ismetsemedov/transactions/

The dataset includes transactions from 12 countries:

Country
Nigeria
Germany
Brazil
USA
Canada
Singapore
France
Australia
UK
Mexico
Japan
Russia

🧠 Model Architecture

This is a scikit-learn pipeline consisting of:

TfidfVectorizer β€” transforms transaction text (merchant, type, city, country)

LogisticRegression β€” multi-class category classifier
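The two-step pipeline described above can be sketched as follows. This is a minimal illustration, not the configuration used to train the hosted model: the hyperparameters (`lowercase=True`, `max_iter=1000`), the step names, and the toy training rows are all assumptions made up for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


def build_pipeline() -> Pipeline:
    """TF-IDF features feeding a multi-class logistic regression."""
    return Pipeline([
        # Vectorizer settings are assumed; the card does not state the real ones.
        ("tfidf", TfidfVectorizer(lowercase=True)),
        # max_iter is raised only so the solver converges on arbitrary data.
        ("clf", LogisticRegression(max_iter=1000)),
    ])


# Toy rows invented for illustration (not taken from the Kaggle dataset).
texts = [
    "coffee shop restaurant calgary canada",
    "pizza place restaurant toronto canada",
    "airline tickets travel berlin germany",
    "hotel booking travel munich germany",
]
labels = ["Restaurant", "Restaurant", "Travel", "Travel"]

pipeline = build_pipeline()
pipeline.fit(texts, labels)

# Only the token "restaurant" is known to the vocabulary here.
predicted = pipeline.predict(["sushi restaurant tokyo japan"])[0]
print(predicted)
```

Keeping the vectorizer and classifier in one `Pipeline` object means a single `fit` or `predict` call handles both steps, which is also what makes it possible to ship the whole model as one file.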

The model predicts categories such as:

Travel

Restaurant

Entertainment

Shopping

Groceries

And more…

🎯 Intended Use

Given a transaction description string (e.g., merchant name, location, or contextual data), the model outputs a predicted spending category and its confidence score.

Trained Text Features

Only the following text fields from the Kaggle dataset were used:

merchant

merchant_type

merchant_category

city

country
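The card does not say how these fields were combined into the single string the vectorizer sees. One plausible approach, shown here as a sketch under that assumption, is to join the non-empty values with spaces in the order listed. The helper name `build_text`, the separator, and the sample row are hypothetical.

```python
# Field names as listed above; the join order and separator are assumptions.
TEXT_FIELDS = ("merchant", "merchant_type", "merchant_category", "city", "country")


def build_text(row: dict) -> str:
    """Join the text fields of one transaction row into a single string.

    Missing, None, or blank fields are skipped; other columns are ignored.
    """
    parts = []
    for field in TEXT_FIELDS:
        value = row.get(field)
        if value is None:
            continue
        value = str(value).strip()
        if value:
            parts.append(value)
    return " ".join(parts)


# Hypothetical row for illustration only.
example_row = {
    "merchant": "Tim Hortons",
    "merchant_type": "fast_food",
    "merchant_category": "Restaurant",
    "city": " Calgary ",
    "country": "Canada",
    "amount": 4.50,  # non-text column, not used
}
print(build_text(example_row))
```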

Note: The Kaggle dataset is not included in this repository. Only the trained model is hosted here.

🔍 Example Inference

Input:

TIM HORTONS #1234 CALGARY AB

Output:
  {
    "category": "Restaurant",
    "confidence": 0.78
  }
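A result in this shape could be produced with a small wrapper like the one below. It is a sketch that assumes the object stored in `model.joblib` is a fitted scikit-learn pipeline exposing `predict_proba` and `classes_`, and that the confidence is the highest class probability; the card states neither explicitly. The function name `classify` is hypothetical.

```python
def classify(model, text: str) -> dict:
    """Return the most probable category and its probability for one string.

    `model` is any fitted classifier with `predict_proba` and `classes_`,
    such as a scikit-learn Pipeline ending in LogisticRegression.
    """
    probabilities = model.predict_proba([text])[0]
    # Index of the highest probability; classes_ is aligned with these columns.
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return {
        "category": str(model.classes_[best]),
        "confidence": round(float(probabilities[best]), 2),
    }


# Usage, assuming model.joblib has been downloaded from this repository:
#
#   import joblib
#   model = joblib.load("model.joblib")  # joblib uses pickle: load trusted files only
#   print(classify(model, "TIM HORTONS #1234 CALGARY AB"))
```

Loading a joblib file can execute arbitrary code, so it is worth pinning the scikit-learn version from `requirements.txt` and loading only files from a source you trust.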

📦 Repository Files

model.joblib: scikit-learn pipeline (TF-IDF + LogisticRegression)

config.json: model metadata

requirements.txt: Python dependencies

⚠️ Limitations

The model is trained on synthetic or anonymized Kaggle data and may not perfectly reflect real banking transactions.

Accuracy may vary across countries and merchant formats.

The model should not be used for regulatory, auditing, or high-stakes financial decision-making without additional evaluation.

📄 License & Data Usage

Model License: Choose one (e.g., MIT, Apache 2.0)

Training Data: Kaggle dataset ismetsemedov/transactions

Please refer to the dataset's Kaggle page for terms of use.

This model is intended for educational and experimental purposes.
