# PiggyBank Transaction Category Classifier

This model classifies financial transactions into spending categories for use in the PiggyBank budgeting app.

It was trained on a Kaggle dataset of 7.4M transactions:

🔗 https://www.kaggle.com/datasets/ismetsemedov/transactions/

The dataset includes transactions from 12 countries:

| Country |
|-----------|
| Nigeria |
| Germany |
| Brazil |
| USA |
| Canada |
| Singapore |
| France |
| Australia |
| UK |
| Mexico |
| Japan |
| Russia |

## 🧠 Model Architecture

This is a scikit-learn pipeline consisting of:

- `TfidfVectorizer`: transforms transaction text (merchant, type, city, country) into TF-IDF features
- `LogisticRegression`: multi-class category classifier

The model predicts categories such as:

- Travel
- Restaurant
- Entertainment
- Shopping
- Groceries
- And more…

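The two-step pipeline described in this section could be assembled roughly as follows. This is a minimal sketch, not the training script behind the published model: the step names (`tfidf`, `clf`), the hyperparameters, and the toy rows are all assumptions made for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


def build_pipeline() -> Pipeline:
    """Two-step pipeline: TF-IDF text features, then a multi-class classifier."""
    return Pipeline(
        steps=[
            # Hyperparameters are illustrative guesses, not the values used
            # to train the published model.
            ("tfidf", TfidfVectorizer(lowercase=True, ngram_range=(1, 2))),
            ("clf", LogisticRegression(max_iter=1000)),
        ]
    )


# Toy rows invented for illustration; they are not taken from the Kaggle dataset.
toy_texts = [
    "air canada airline toronto canada",
    "delta airlines airline atlanta usa",
    "tim hortons fast food calgary canada",
    "burger palace fast food austin usa",
]
toy_labels = ["Travel", "Travel", "Restaurant", "Restaurant"]

pipeline = build_pipeline()
pipeline.fit(toy_texts, toy_labels)

# Predict the category of a single transaction string.
prediction = pipeline.predict(["tim hortons fast food calgary canada"])[0]
```

Keeping the vectorizer and the classifier in one `Pipeline` object means a single file can be serialized and later called directly on raw strings.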
## 🎯 Intended Use

Given a transaction description string (e.g., merchant name, location, or contextual data), the model outputs a predicted spending category and its confidence score.

### Trained Text Features

Only the following text fields from the Kaggle dataset were used:

- `merchant`
- `merchant_type`
- `merchant_category`
- `city`
- `country`

Note: The Kaggle dataset is not included in this repository. Only the trained model is hosted here.

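One plausible way to turn the five fields listed above into a single input string is sketched below. The field names come from the list; the join order, the space separator, and the helper name `build_text` are assumptions, since the actual preprocessing is not described here.

```python
# Field names come from the README; the join order and the space separator
# are assumptions made for illustration.
TEXT_FIELDS = ("merchant", "merchant_type", "merchant_category", "city", "country")


def build_text(row: dict) -> str:
    """Join the available text fields of one transaction into a single string."""
    parts = []
    for field in TEXT_FIELDS:
        value = row.get(field)
        # Skip missing or blank values so they do not add stray whitespace.
        if value is None or not str(value).strip():
            continue
        parts.append(str(value).strip())
    return " ".join(parts)


# Hypothetical row, not taken from the dataset.
example_row = {
    "merchant": "Tim Hortons",
    "merchant_type": "fast_food",
    "merchant_category": "Restaurant",
    "city": "Calgary",
    "country": "Canada",
}
example_text = build_text(example_row)
```

Skipping missing fields lets the same helper handle partially populated rows.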
## 🔍 Example Inference

##### Input:

`TIM HORTONS #1234 CALGARY AB`

##### Output:

```json
{
  "category": "Restaurant",
  "confidence": 0.78
}
```

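An output of this shape could be produced from the hosted `model.joblib` along the following lines. This is a sketch under stated assumptions: the helper names `classify` and `load_model` are hypothetical, the loaded object is assumed to expose `predict_proba` and `classes_` (as a fitted scikit-learn classification pipeline does), and rounding the confidence to two decimals is a guess based on the example above.

```python
def classify(model, text: str) -> dict:
    """Return the top category and its probability for one transaction string."""
    # predict_proba returns one row of class probabilities per input text.
    probabilities = list(model.predict_proba([text])[0])
    # Index of the highest-probability class.
    best_index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return {
        "category": str(model.classes_[best_index]),
        # Two-decimal rounding is an assumption matching the example output.
        "confidence": round(float(probabilities[best_index]), 2),
    }


def load_model(path: str = "model.joblib"):
    """Load the serialized pipeline from disk; joblib is imported lazily."""
    import joblib  # third-party package, typically installed with scikit-learn

    return joblib.load(path)


# Typical use, once model.joblib has been downloaded locally:
#   model = load_model()
#   print(classify(model, "TIM HORTONS #1234 CALGARY AB"))
```

The confidence here is simply the largest class probability, so it reflects how the classifier ranks its own categories and is not a calibrated measure of correctness.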
## 📦 Repository Files

- `model.joblib`: scikit-learn pipeline (TF-IDF + LogisticRegression)
- `config.json`: model metadata
- `requirements.txt`: Python dependencies

## ⚠️ Limitations

- The model is trained on synthetic or anonymized Kaggle data and may not perfectly reflect real banking transactions.
- Accuracy may vary across countries and merchant formats.
- The model should not be used for regulatory, auditing, or high-stakes financial decision-making without additional evaluation.

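Since accuracy may differ by country, one simple form of additional evaluation is to break accuracy down per country on a labeled sample. The sketch below uses only the standard library; the function name `accuracy_by_group` and the sample values are made up for illustration and are not results from this model.

```python
from collections import defaultdict


def accuracy_by_group(groups, y_true, y_pred) -> dict:
    """Fraction of correct predictions within each group (e.g. each country)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, guess in zip(groups, y_true, y_pred):
        total[group] += 1
        if truth == guess:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Made-up labels and predictions, purely to show the shape of the output.
countries = ["Canada", "Canada", "Japan", "Japan"]
actual = ["Restaurant", "Travel", "Groceries", "Shopping"]
predicted = ["Restaurant", "Travel", "Groceries", "Travel"]
per_country = accuracy_by_group(countries, actual, predicted)
```

The same helper can group by merchant format or any other column to spot segments where the model underperforms.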
## 📄 License & Data Usage

- Model License: Choose one (e.g., MIT, Apache 2.0)
- Training Data: Kaggle dataset `ismetsemedov/transactions`

Please refer to the dataset’s Kaggle page for terms of use.

This model is intended for educational and experimental purposes.