Upload folder using huggingface_hub

Files changed (9)

README.md +176 -0
config.json +61 -0
label_mapping.json +38 -0
onnx/model_quantized.onnx +3 -0
special_tokens_map.json +7 -0
tokenizer.json +0 -0
tokenizer_config.json +56 -0
training-report.json +673 -0
vocab.txt +0 -0

README.md ADDED

	@@ -0,0 +1,176 @@

+---
+license: apache-2.0
+tags:
+  - text-classification
+  - transformers
+  - onnx
+  - safetensors
+  - transformers.js
+  - distilbert
+  - finance
+  - transactions
+  - english
+language:
+  - en
+datasets:
+  - DoDataThings/us-bank-transaction-categories-v2
+pipeline_tag: text-classification
+---
+# DistilBERT US Bank Transaction Classifier v2
+**Successor to [v1](https://huggingface.co/DoDataThings/distilbert-us-transaction-classifier).** Sign-aware classification with expanded merchant coverage, PayPal wrapper handling, and a refined 16-category taxonomy.
+## What Changed from v1
+| | v1 | v2 |
+|---|---|---|
+| **Input format** | Description only | `[debit]` / `[credit]` prefix + full description |
+| **Categories** | 16 (includes Housing) | 16 (Housing split into Rent and Mortgage; Mortgage then removed from the model) |
+| **Training data** | 16,000 samples | 24,000 samples |
+| **Merchant coverage** | ~300 merchants | 500+ merchants |
+| **PayPal awareness** | Limited | Full — PreApproved, Express Checkout, PP*, PAYPAL * |
+| **POS prefix awareness** | SQ* only | SQ*, TST* (Toast), CLV* (Clover) |
+| **Transfer patterns** | Basic | Brokerage sweeps, fintech platforms, wire, cashier's checks, ATM |
+### Why v2?
+v1 confused Income and Transfer because it only saw the description text. A "VENMO CASHOUT" deposit looks the same regardless of direction. v2 prepends `[credit]` or `[debit]` based on the transaction sign (after normalization to cardholder perspective), giving the model a strong directional signal.
+Mortgage was removed as a model category because mortgage account transactions are better classified by account type — every transaction on a mortgage account is a mortgage payment by definition.
+## What This Is (and Isn't)
+A fine-tuned DistilBERT model for classifying US bank transaction descriptions into 16 spending categories. Designed as a **fallback layer** in a multi-tier classification pipeline — not a standalone classifier.
+1. **User rules** — pattern matching catches known merchants (highest accuracy)
+2. **This model** — classifies everything else, with sign awareness
+3. **Bank-provided categories** — fallback when model confidence is low
+4. **User overrides** — manual corrections for edge cases
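The four tiers listed above can be sketched as a single dispatch function. This is a minimal illustration, not the author's pipeline: `classify_transaction`, `model_fn`, the substring-style rules, the `MIN_CONFIDENCE` cutoff of 0.50, and the choice to let a manual override win over everything else are all assumptions made for the example.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical cutoff: the README does not state the threshold it uses.
MIN_CONFIDENCE = 0.50


def classify_transaction(
    text: str,
    model_fn: Callable[[str], Tuple[str, float]],
    user_rules: Dict[str, str],
    bank_category: Optional[str] = None,
    user_override: Optional[str] = None,
) -> Tuple[str, str]:
    """Return (category, source) for one sign-prefixed description."""
    # Manual corrections win when present (assumed precedence).
    if user_override is not None:
        return user_override, "override"

    # Tier 1: case-insensitive substring rules for known merchants.
    upper = text.upper()
    for pattern, category in user_rules.items():
        if pattern.upper() in upper:
            return category, "rule"

    # Tier 2: the model, trusted only at or above the cutoff.
    label, score = model_fn(text)
    if score >= MIN_CONFIDENCE:
        return label, "model"

    # Tier 3: fall back to the bank-provided category, if there is one.
    if bank_category is not None:
        return bank_category, "bank"

    # Nothing better is available: keep the low-confidence model label.
    return label, "model-low-confidence"
```

Here `model_fn` is any callable that returns a `(label, score)` pair, so the Hugging Face pipeline shown later in the README could be wrapped to fit it, and a stub can stand in for it during testing.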
+## Training
+```
+Model:       DistilBERT-base-uncased + LoRA (r=32, alpha=64)
+Dataset:     24,000 synthetic samples, 1,500 per category
+Trainable:   1.8M / 68.7M parameters (2.6%)
+Training:    20 epochs, ~8 minutes on consumer GPU
+Best epoch:  18 (99.1% validation accuracy)
+```
+### Loss Curve
+| Epoch | Train Loss | Val Loss | Train Acc | Val Acc |
+|-------|-----------|----------|-----------|---------|
+| 1     | 2.590     | 1.936    | 20.8%     | 52.6%   |
+| 5     | 0.325     | 0.214    | 90.5%     | 93.9%   |
+| 10    | 0.077     | 0.056    | 97.9%     | 98.5%   |
+| 15    | 0.038     | 0.034    | 98.9%     | 98.9%   |
+| 17    | 0.029     | 0.030    | 99.1%     | 99.1%   |
+### Honest Assessment
+Validation accuracy (99.1%) is on synthetic data. Real-world performance on ~2,000 transactions:
+- **86% of model classifications at 0.90+ confidence**
+- **About 0.4% below 0.50 confidence** (9 of 2,038 transactions)
+- Income at 100% and Transfer at 99.5% (216 of 217) on synthetic validation at the best epoch
+- Shopping remains the weakest category (~93%) due to overlap with Subscription and Groceries
+The sign prefix resolved the Income/Transfer confusion from v1. The main remaining challenge is niche merchants the model hasn't seen — diminishing returns territory best handled by user rules.
+## Categories (16)
+| Category | What it covers |
+|----------|----------------|
+| Restaurants | Fast food, sit-down, coffee shops, food delivery, POS systems (TST*, SQ*, CLV*) |
+| Groceries | Supermarkets, warehouse clubs, farmers markets, convenience stores |
+| Shopping | Retail, online purchases, department stores, pet stores, liquor stores, e-commerce marketplaces |
+| Transportation | Gas, EV charging, rideshare, auto maintenance, parking, tolls, DMV |
+| Entertainment | Movies, events, gaming (Steam, PlayStation), gambling/sportsbooks |
+| Utilities | Electric, internet, phone, water, waste/trash, solar |
+| Subscription | Streaming, SaaS, AI tools, VPNs, social media premium, dating apps, news |
+| Healthcare | Pharmacy, doctor, dentist, telehealth, vision, hospital |
+| Insurance | Auto, home, health, life insurance |
+| Rent | Property management companies, lease payments |
+| Travel | Hotels, airlines, car rental, cruise lines, airport services |
+| Education | Online courses, tutoring, books, tuition, certification |
+| Personal Care | Salon, gym, beauty, spa, barber |
+| Transfer | CC autopay, Zelle/Venmo sends, bank transfers, brokerage sweeps, BNPL, wire transfers, ATM, cashier's checks |
+| Income | Payroll, direct deposit, interest, refunds, government benefits, gig economy payouts |
+| Fees | Bank fees, late fees, service charges, ATM fees |
+### Account-Type-Implied Categories (not model-classified)
+These categories are determined by the account type, not the model:
+| Account Type | Category |
+|---|---|
+| Mortgage | Mortgage |
+| Auto Loan | Transportation |
+| Student Loan | Education |
+| Personal Loan | Transfer |
+| HELOC | Transfer |
+| CD | Income |
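The account-type table above amounts to a lookup that runs before the model is consulted. A minimal sketch, assuming account types arrive as free-text strings; the names `ACCOUNT_TYPE_CATEGORY` and `implied_category` are hypothetical and not part of the published model.

```python
from typing import Optional

# Mapping copied from the README table; keys are lower-cased for matching.
ACCOUNT_TYPE_CATEGORY = {
    "mortgage": "Mortgage",
    "auto loan": "Transportation",
    "student loan": "Education",
    "personal loan": "Transfer",
    "heloc": "Transfer",
    "cd": "Income",
}


def implied_category(account_type: str) -> Optional[str]:
    """Return the category implied by the account type, or None if the
    transaction should go through the model instead."""
    return ACCOUNT_TYPE_CATEGORY.get(account_type.strip().lower())
```

A `None` result means the account type carries no implied category, for example a checking or credit card account.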
+## Usage
+### Python
+```python
+from transformers import pipeline
+classifier = pipeline("text-classification", model="DoDataThings/distilbert-us-transaction-classifier-v2")
+# v2 requires sign prefix
+result = classifier("[debit] STARBUCKS #1234 SAN FRANCISCO CA")
+print(result)  # [{'label': 'Restaurants', 'score': 0.98}]
+# Sign matters for ambiguous transactions
+classifier("[credit] VENMO CASHOUT PPD ID: 12345678")
+# [{'label': 'Income', 'score': 0.95}]
+classifier("[debit] VENMO PAYMENT TO JOHN SMITH")
+# [{'label': 'Transfer', 'score': 0.97}]
+```
+### JavaScript (Transformers.js)
+```javascript
+// ES module import: the top-level `await` below is only valid in an ES module
+import { pipeline } from '@xenova/transformers';
+const classifier = await pipeline(
+  'text-classification',
+  'DoDataThings/distilbert-us-transaction-classifier-v2'
+);
+const result = await classifier('[debit] STARBUCKS #1234');
+// [{ label: 'Restaurants', score: 0.98 }]
+```
+An ONNX export is included at `onnx/model_quantized.onnx`.
+### Sign Prefix Convention
+Prepend `[credit]` or `[debit]` based on the **normalized** transaction amount (cardholder perspective):
+- `[debit]` — money left the account (purchases, payments out, fees)
+- `[credit]` — money entered the account (income, refunds, payments received)
+If your data uses issuer perspective (e.g., Apple Card where purchases are positive), normalize the sign first, then apply the prefix.
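The convention above can be sketched as a small helper. It assumes that, in cardholder perspective, a negative amount means money left the account; the README implies this but does not spell it out. The function name `with_sign_prefix` and the `issuer_perspective` flag are hypothetical, and treating a zero amount as a credit is an arbitrary choice for the example.

```python
def with_sign_prefix(
    description: str,
    amount: float,
    issuer_perspective: bool = False,
) -> str:
    """Prefix a raw description with [debit] or [credit]."""
    # Issuer-perspective exports record purchases as positive, so flip them
    # to cardholder perspective first.
    normalized = -amount if issuer_perspective else amount
    # Assumed convention: negative = money out, zero or positive = money in.
    prefix = "[debit]" if normalized < 0 else "[credit]"
    return f"{prefix} {description.strip()}"
```

For example, `with_sign_prefix("STARBUCKS #1234", -4.75)` yields `[debit] STARBUCKS #1234`, and a positive purchase amount from an issuer-perspective export also maps to `[debit]` when `issuer_perspective=True`.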
+## Training Data
+The synthetic dataset is published at [`DoDataThings/us-bank-transaction-categories-v2`](https://huggingface.co/datasets/DoDataThings/us-bank-transaction-categories-v2). The generator script is open source — you can extend the merchant pools, add format templates, or increase sample counts.
+## Limitations
+- **US bank formats only** — Trained on Chase, Apple Card, PayPal, Capital One, and US Bank statement patterns
+- **Synthetic training data** — May miss patterns from banks not represented
+- **Shopping is the weakest category** (~93%) due to overlap with Subscription and Groceries
+- **Niche merchants** may classify with low confidence — use merchant rules for known edge cases
+- **Sign prefix required** — The model expects `[debit]` or `[credit]` prefix. Passing raw descriptions without the prefix will degrade accuracy.
+- **Not a standalone solution** — Best results come from combining with merchant rules and account-type-implied classifications
+## License
+Apache 2.0

config.json ADDED

	@@ -0,0 +1,61 @@

+{
+  "_name_or_path": "data/models/foliome-classifier-v2\\",
+  "activation": "gelu",
+  "architectures": [
+    "DistilBertForSequenceClassification"
+  ],
+  "attention_dropout": 0.1,
+  "dim": 768,
+  "dropout": 0.1,
+  "hidden_dim": 3072,
+  "id2label": {
+    "0": "Education",
+    "1": "Entertainment",
+    "2": "Fees",
+    "3": "Groceries",
+    "4": "Healthcare",
+    "5": "Income",
+    "6": "Insurance",
+    "7": "Personal Care",
+    "8": "Rent",
+    "9": "Restaurants",
+    "10": "Shopping",
+    "11": "Subscription",
+    "12": "Transfer",
+    "13": "Transportation",
+    "14": "Travel",
+    "15": "Utilities"
+  },
+  "initializer_range": 0.02,
+  "label2id": {
+    "Education": 0,
+    "Entertainment": 1,
+    "Fees": 2,
+    "Groceries": 3,
+    "Healthcare": 4,
+    "Income": 5,
+    "Insurance": 6,
+    "Personal Care": 7,
+    "Rent": 8,
+    "Restaurants": 9,
+    "Shopping": 10,
+    "Subscription": 11,
+    "Transfer": 12,
+    "Transportation": 13,
+    "Travel": 14,
+    "Utilities": 15
+  },
+  "max_position_embeddings": 512,
+  "model_type": "distilbert",
+  "n_heads": 12,
+  "n_layers": 6,
+  "pad_token_id": 0,
+  "problem_type": "single_label_classification",
+  "qa_dropout": 0.1,
+  "seq_classif_dropout": 0.2,
+  "sinusoidal_pos_embds": false,
+  "tie_weights_": true,
+  "torch_dtype": "float32",
+  "transformers_version": "4.49.0",
+  "vocab_size": 30522
+}
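With `problem_type` set to `single_label_classification`, raw model outputs are turned into one label by a softmax followed by an argmax over the 16 logits, then a lookup in `id2label`. The sketch below shows that step in plain Python for readers consuming the ONNX output directly; `decode_logits` is a hypothetical helper, and the string keys match how `id2label` appears in the JSON above.

```python
import math
from typing import Dict, List, Tuple


def decode_logits(logits: List[float], id2label: Dict[str, str]) -> Tuple[str, float]:
    """Map one row of logits to (label, probability)."""
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    # Index of the largest probability (same as the largest logit).
    best = max(range(len(logits)), key=lambda i: exps[i])
    # JSON object keys are strings, so the index is converted before lookup.
    return id2label[str(best)], exps[best] / total
```

The returned probability is the value usually reported as the classification `score`.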

label_mapping.json ADDED

	@@ -0,0 +1,38 @@

+{
+  "id2label": {
+    "0": "Education",
+    "1": "Entertainment",
+    "2": "Fees",
+    "3": "Groceries",
+    "4": "Healthcare",
+    "5": "Income",
+    "6": "Insurance",
+    "7": "Personal Care",
+    "8": "Rent",
+    "9": "Restaurants",
+    "10": "Shopping",
+    "11": "Subscription",
+    "12": "Transfer",
+    "13": "Transportation",
+    "14": "Travel",
+    "15": "Utilities"
+  },
+  "label2id": {
+    "Education": 0,
+    "Entertainment": 1,
+    "Fees": 2,
+    "Groceries": 3,
+    "Healthcare": 4,
+    "Income": 5,
+    "Insurance": 6,
+    "Personal Care": 7,
+    "Rent": 8,
+    "Restaurants": 9,
+    "Shopping": 10,
+    "Subscription": 11,
+    "Transfer": 12,
+    "Transportation": 13,
+    "Travel": 14,
+    "Utilities": 15
+  }
+}

onnx/model_quantized.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:af4d35409501558e9112dbc5aef014f0c8086427d00b4950d629f497d20d54fd
+size 267975237
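The three lines above are a Git LFS pointer, not the model itself: each line is a key followed by a space and a value. A minimal sketch of reading it follows; `parse_lfs_pointer` is a hypothetical helper, and the float32 comparison is only back-of-the-envelope arithmetic on the byte count, not something the commit states.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into a dict, with size as an int."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    fields["size"] = int(fields["size"])
    return fields


POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:af4d35409501558e9112dbc5aef014f0c8086427d00b4950d629f497d20d54fd
size 267975237
"""

info = parse_lfs_pointer(POINTER)
size_mib = info["size"] / 2**20          # bytes -> MiB
float32_equivalent = info["size"] / 4    # how many 4-byte floats would fit

print(f"{size_mib:.2f} MiB, room for about {float32_equivalent / 1e6:.1f}M float32 values")
```

For comparison, the training report in this commit lists 68,748,320 total parameters, so the file size is in the range one would expect for float32 storage of a model this size. That is an inference from the byte count alone, and it may be worth confirming how the export was produced.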

special_tokens_map.json ADDED

	@@ -0,0 +1,7 @@

+{
+  "cls_token": "[CLS]",
+  "mask_token": "[MASK]",
+  "pad_token": "[PAD]",
+  "sep_token": "[SEP]",
+  "unk_token": "[UNK]"
+}

tokenizer.json ADDED

The diff for this file is too large to render. See raw diff

tokenizer_config.json ADDED

	@@ -0,0 +1,56 @@

+{
+  "added_tokens_decoder": {
+    "0": {
+      "content": "[PAD]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "100": {
+      "content": "[UNK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "101": {
+      "content": "[CLS]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "102": {
+      "content": "[SEP]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "103": {
+      "content": "[MASK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "clean_up_tokenization_spaces": false,
+  "cls_token": "[CLS]",
+  "do_lower_case": true,
+  "extra_special_tokens": {},
+  "mask_token": "[MASK]",
+  "model_max_length": 512,
+  "pad_token": "[PAD]",
+  "sep_token": "[SEP]",
+  "strip_accents": null,
+  "tokenize_chinese_chars": true,
+  "tokenizer_class": "DistilBertTokenizer",
+  "unk_token": "[UNK]"
+}

training-report.json ADDED

	@@ -0,0 +1,673 @@

+{
+  "model": "distilbert-base-uncased + LoRA (r=32, alpha=64)",
+  "dataset": "24000 synthetic transactions, 16 categories, 1500/category",
+  "split": "20400 train / 3600 val (85/15)",
+  "epochs": 20,
+  "best_epoch": 18,
+  "best_val_accuracy": 0.9908,
+  "total_training_time_s": 458.7,
+  "device": "cuda",
+  "trainable_params": 1782544,
+  "total_params": 68748320,
+  "categories": [
+    "Education",
+    "Entertainment",
+    "Fees",
+    "Groceries",
+    "Healthcare",
+    "Income",
+    "Insurance",
+    "Personal Care",
+    "Rent",
+    "Restaurants",
+    "Shopping",
+    "Subscription",
+    "Transfer",
+    "Transportation",
+    "Travel",
+    "Utilities"
+  ],
+  "history": [
+    {
+      "epoch": 1,
+      "train_loss": 2.5896,
+      "val_loss": 1.9356,
+      "train_acc": 0.2075,
+      "val_acc": 0.5256,
+      "per_category": {
+        "Education": 0.362,
+        "Entertainment": 0.142,
+        "Fees": 1.0,
+        "Groceries": 0.575,
+        "Healthcare": 0.798,
+        "Income": 0.964,
+        "Insurance": 0.273,
+        "Personal Care": 0.203,
+        "Rent": 0.97,
+        "Restaurants": 0.544,
+        "Shopping": 0.36,
+        "Subscription": 0.802,
+        "Transfer": 0.166,
+        "Transportation": 0.5,
+        "Travel": 0.241,
+        "Utilities": 0.479
+      },
+      "epoch_time_s": 23.7
+    },
+    {
+      "epoch": 2,
+      "train_loss": 1.3952,
+      "val_loss": 0.8527,
+      "train_acc": 0.6386,
+      "val_acc": 0.7642,
+      "per_category": {
+        "Education": 0.819,
+        "Entertainment": 0.652,
+        "Fees": 1.0,
+        "Groceries": 0.617,
+        "Healthcare": 0.88,
+        "Income": 1.0,
+        "Insurance": 0.745,
+        "Personal Care": 0.586,
+        "Rent": 0.991,
+        "Restaurants": 0.628,
+        "Shopping": 0.671,
+        "Subscription": 0.718,
+        "Transfer": 0.751,
+        "Transportation": 0.71,
+        "Travel": 0.565,
+        "Utilities": 0.853
+      },
+      "epoch_time_s": 23.2
+    },
+    {
+      "epoch": 3,
+      "train_loss": 0.7496,
+      "val_loss": 0.4969,
+      "train_acc": 0.7876,
+      "val_acc": 0.8619,
+      "per_category": {
+        "Education": 0.95,
+        "Entertainment": 0.826,
+        "Fees": 1.0,
+        "Groceries": 0.696,
+        "Healthcare": 0.93,
+        "Income": 0.982,
+        "Insurance": 0.892,
+        "Personal Care": 0.918,
+        "Rent": 1.0,
+        "Restaurants": 0.686,
+        "Shopping": 0.778,
+        "Subscription": 0.861,
+        "Transfer": 0.889,
+        "Transportation": 0.694,
+        "Travel": 0.728,
+        "Utilities": 0.943
+      },
+      "epoch_time_s": 23.5
+    },
+    {
+      "epoch": 4,
+      "train_loss": 0.4773,
+      "val_loss": 0.2947,
+      "train_acc": 0.8612,
+      "val_acc": 0.9169,
+      "per_category": {
+        "Education": 0.977,
+        "Entertainment": 0.939,
+        "Fees": 1.0,
+        "Groceries": 0.734,
+        "Healthcare": 0.946,
+        "Income": 1.0,
+        "Insurance": 0.944,
+        "Personal Care": 0.991,
+        "Rent": 1.0,
+        "Restaurants": 0.863,
+        "Shopping": 0.796,
+        "Subscription": 0.842,
+        "Transfer": 0.926,
+        "Transportation": 0.863,
+        "Travel": 0.869,
+        "Utilities": 0.953
+      },
+      "epoch_time_s": 23.4
+    },
+    {
+      "epoch": 5,
+      "train_loss": 0.325,
+      "val_loss": 0.2142,
+      "train_acc": 0.9049,
+      "val_acc": 0.9392,
+      "per_category": {
+        "Education": 0.982,
+        "Entertainment": 0.951,
+        "Fees": 1.0,
+        "Groceries": 0.883,
+        "Healthcare": 0.967,
+        "Income": 1.0,
+        "Insurance": 0.97,
+        "Personal Care": 1.0,
+        "Rent": 1.0,
+        "Restaurants": 0.881,
+        "Shopping": 0.804,
+        "Subscription": 0.866,
+        "Transfer": 0.917,
+        "Transportation": 0.891,
+        "Travel": 0.921,
+        "Utilities": 0.976
+      },
+      "epoch_time_s": 23.4
+    },
+    {
+      "epoch": 6,
+      "train_loss": 0.2342,
+      "val_loss": 0.1447,
+      "train_acc": 0.9325,
+      "val_acc": 0.9575,
+      "per_category": {
+        "Education": 0.982,
+        "Entertainment": 0.988,
+        "Fees": 1.0,
+        "Groceries": 0.883,
+        "Healthcare": 0.967,
+        "Income": 1.0,
+        "Insurance": 0.939,
+        "Personal Care": 1.0,
+        "Rent": 1.0,
+        "Restaurants": 0.947,
+        "Shopping": 0.836,
+        "Subscription": 0.941,
+        "Transfer": 0.982,
+        "Transportation": 0.907,
+        "Travel": 0.948,
+        "Utilities": 0.995
+      },
+      "epoch_time_s": 23.4
+    },
+    {
+      "epoch": 7,
+      "train_loss": 0.1724,
+      "val_loss": 0.1079,
+      "train_acc": 0.9508,
+      "val_acc": 0.9708,
+      "per_category": {
+        "Education": 0.982,
+        "Entertainment": 0.996,
+        "Fees": 1.0,
+        "Groceries": 0.953,
+        "Healthcare": 0.967,
+        "Income": 1.0,
+        "Insurance": 0.991,
+        "Personal Care": 1.0,
+        "Rent": 1.0,
+        "Restaurants": 0.942,
+        "Shopping": 0.844,
+        "Subscription": 0.926,
+        "Transfer": 0.982,
+        "Transportation": 0.972,
+        "Travel": 0.979,
+        "Utilities": 0.991
+      },
+      "epoch_time_s": 23.5
+    },
+    {
+      "epoch": 8,
+      "train_loss": 0.1254,
+      "val_loss": 0.0775,
+      "train_acc": 0.9649,
+      "val_acc": 0.9789,
+      "per_category": {
+        "Education": 1.0,
+        "Entertainment": 0.988,
+        "Fees": 1.0,
+        "Groceries": 0.967,
+        "Healthcare": 0.967,
+        "Income": 1.0,
+        "Insurance": 0.996,
+        "Personal Care": 1.0,
+        "Rent": 1.0,
+        "Restaurants": 0.96,
+        "Shopping": 0.893,
+        "Subscription": 0.95,
+        "Transfer": 0.982,
+        "Transportation": 0.976,
+        "Travel": 0.995,
+        "Utilities": 0.986
+      },
+      "epoch_time_s": 23.5
+    },
+    {
+      "epoch": 9,
+      "train_loss": 0.0972,
+      "val_loss": 0.0627,
+      "train_acc": 0.9723,
+      "val_acc": 0.9814,
+      "per_category": {
+        "Education": 0.995,
+        "Entertainment": 0.992,
+        "Fees": 1.0,
+        "Groceries": 0.981,
+        "Healthcare": 0.967,
+        "Income": 1.0,
+        "Insurance": 0.996,
+        "Personal Care": 1.0,
+        "Rent": 1.0,
+        "Restaurants": 0.96,
+        "Shopping": 0.884,
+        "Subscription": 0.96,
+        "Transfer": 0.982,
+        "Transportation": 0.988,
+        "Travel": 0.995,
+        "Utilities": 1.0
+      },
+      "epoch_time_s": 22.4
+    },
+    {
+      "epoch": 10,
+      "train_loss": 0.0771,
+      "val_loss": 0.0557,
+      "train_acc": 0.9789,
+      "val_acc": 0.985,
+      "per_category": {
+        "Education": 1.0,
+        "Entertainment": 1.0,
+        "Fees": 1.0,
+        "Groceries": 1.0,
+        "Healthcare": 0.963,
+        "Income": 1.0,
+        "Insurance": 0.996,
+        "Personal Care": 1.0,
+        "Rent": 1.0,
+        "Restaurants": 0.978,
+        "Shopping": 0.889,
+        "Subscription": 0.955,
+        "Transfer": 0.995,
+        "Transportation": 0.988,
+        "Travel": 0.995,
+        "Utilities": 1.0
+      },
+      "epoch_time_s": 22.6
+    },
+    {
+      "epoch": 11,
+      "train_loss": 0.0665,
+      "val_loss": 0.0485,
+      "train_acc": 0.9812,
+      "val_acc": 0.9864,
+      "per_category": {
+        "Education": 1.0,
+        "Entertainment": 1.0,
+        "Fees": 1.0,
+        "Groceries": 0.991,
+        "Healthcare": 0.967,
+        "Income": 1.0,
+        "Insurance": 0.996,
+        "Personal Care": 1.0,
+        "Rent": 1.0,
+        "Restaurants": 0.987,
+        "Shopping": 0.898,
+        "Subscription": 0.975,
+        "Transfer": 0.982,
+        "Transportation": 0.988,
+        "Travel": 1.0,
+        "Utilities": 1.0
+      },
+      "epoch_time_s": 22.6
+    },
+    {
+      "epoch": 12,
+      "train_loss": 0.0534,
+      "val_loss": 0.0404,
+      "train_acc": 0.9841,
+      "val_acc": 0.9872,
+      "per_category": {
+        "Education": 1.0,
+        "Entertainment": 1.0,
+        "Fees": 1.0,
+        "Groceries": 0.986,
+        "Healthcare": 0.963,
+        "Income": 1.0,
+        "Insurance": 0.996,
+        "Personal Care": 0.991,
+        "Rent": 1.0,
+        "Restaurants": 0.991,
+        "Shopping": 0.907,
+        "Subscription": 0.98,
+        "Transfer": 0.995,
+        "Transportation": 0.988,
+        "Travel": 1.0,
+        "Utilities": 1.0
+      },
+      "epoch_time_s": 22.6
+    },
+    {
+      "epoch": 13,
+      "train_loss": 0.0463,
+      "val_loss": 0.0418,
+      "train_acc": 0.9857,
+      "val_acc": 0.9889,
+      "per_category": {
+        "Education": 1.0,
+        "Entertainment": 1.0,
+        "Fees": 1.0,
+        "Groceries": 1.0,
+        "Healthcare": 0.967,
+        "Income": 1.0,
+        "Insurance": 0.996,
+        "Personal Care": 1.0,
+        "Rent": 1.0,
+        "Restaurants": 0.991,
+        "Shopping": 0.898,
+        "Subscription": 0.975,
+        "Transfer": 0.995,
+        "Transportation": 1.0,
+        "Travel": 1.0,
+        "Utilities": 1.0
+      },
+      "epoch_time_s": 22.3
+    },
+    {
+      "epoch": 14,
+      "train_loss": 0.0421,
+      "val_loss": 0.0386,
+      "train_acc": 0.9872,
+      "val_acc": 0.9889,
+      "per_category": {
+        "Education": 1.0,
+        "Entertainment": 1.0,
+        "Fees": 1.0,
+        "Groceries": 0.995,
+        "Healthcare": 0.967,
+        "Income": 1.0,
+        "Insurance": 1.0,
+        "Personal Care": 1.0,
+        "Rent": 1.0,
+        "Restaurants": 0.991,
+        "Shopping": 0.911,
+        "Subscription": 0.98,
+        "Transfer": 0.991,
+        "Transportation": 0.988,
+        "Travel": 1.0,
+        "Utilities": 1.0
+      },
+      "epoch_time_s": 22.3
+    },
+    {
+      "epoch": 15,
+      "train_loss": 0.0378,
+      "val_loss": 0.0341,
+      "train_acc": 0.9886,
+      "val_acc": 0.9892,
+      "per_category": {
+        "Education": 1.0,
+        "Entertainment": 1.0,
+        "Fees": 1.0,
+        "Groceries": 0.995,
+        "Healthcare": 0.971,
+        "Income": 1.0,
+        "Insurance": 1.0,
+        "Personal Care": 1.0,
+        "Rent": 1.0,
+        "Restaurants": 0.991,
+        "Shopping": 0.907,
+        "Subscription": 0.98,
+        "Transfer": 0.995,
+        "Transportation": 0.988,
+        "Travel": 1.0,
+        "Utilities": 1.0
+      },
+      "epoch_time_s": 22.4
+    },
+    {
+      "epoch": 16,
+      "train_loss": 0.0319,
+      "val_loss": 0.0363,
+      "train_acc": 0.9912,
+      "val_acc": 0.9894,
+      "per_category": {
+        "Education": 1.0,
+        "Entertainment": 1.0,
+        "Fees": 1.0,
+        "Groceries": 1.0,
+        "Healthcare": 0.967,
+        "Income": 1.0,
+        "Insurance": 1.0,
+        "Personal Care": 1.0,
+        "Rent": 1.0,
+        "Restaurants": 0.991,
+        "Shopping": 0.902,
+        "Subscription": 0.985,
+        "Transfer": 0.995,
+        "Transportation": 0.992,
+        "Travel": 1.0,
+        "Utilities": 1.0
+      },
+      "epoch_time_s": 22.4
+    },
+    {
+      "epoch": 17,
+      "train_loss": 0.0288,
+      "val_loss": 0.0296,
+      "train_acc": 0.9913,
+      "val_acc": 0.9906,
+      "per_category": {
+        "Education": 1.0,
+        "Entertainment": 1.0,
+        "Fees": 1.0,
+        "Groceries": 0.986,
+        "Healthcare": 0.971,
+        "Income": 1.0,
+        "Insurance": 0.996,
+        "Personal Care": 0.991,
+        "Rent": 1.0,
+        "Restaurants": 0.991,
+        "Shopping": 0.947,
+        "Subscription": 0.985,
+        "Transfer": 0.995,
+        "Transportation": 0.988,
+        "Travel": 1.0,
+        "Utilities": 1.0
+      },
+      "epoch_time_s": 22.3
+    },
+    {
+      "epoch": 18,
+      "train_loss": 0.0255,
+      "val_loss": 0.0284,
+      "train_acc": 0.993,
+      "val_acc": 0.9908,
+      "per_category": {
+        "Education": 1.0,
+        "Entertainment": 1.0,
+        "Fees": 1.0,
+        "Groceries": 0.995,
+        "Healthcare": 0.971,
+        "Income": 1.0,
+        "Insurance": 0.996,
+        "Personal Care": 1.0,
+        "Rent": 1.0,
+        "Restaurants": 0.991,
+        "Shopping": 0.929,
+        "Subscription": 0.985,
+        "Transfer": 0.995,
+        "Transportation": 0.992,
+        "Travel": 1.0,
+        "Utilities": 1.0
+      },
+      "epoch_time_s": 22.4
+    },
+    {
+      "epoch": 19,
+      "train_loss": 0.0273,
+      "val_loss": 0.0306,
+      "train_acc": 0.9912,
+      "val_acc": 0.9897,
+      "per_category": {
+        "Education": 1.0,
+        "Entertainment": 1.0,
+        "Fees": 1.0,
+        "Groceries": 1.0,
+        "Healthcare": 0.967,
+        "Income": 1.0,
+        "Insurance": 0.996,
+        "Personal Care": 1.0,
+        "Rent": 1.0,
+        "Restaurants": 0.991,
+        "Shopping": 0.907,
+        "Subscription": 0.98,
+        "Transfer": 1.0,
+        "Transportation": 0.996,
+        "Travel": 1.0,
+        "Utilities": 1.0
+      },
+      "epoch_time_s": 22.5
+    },
+    {
+      "epoch": 20,
+      "train_loss": 0.023,
+      "val_loss": 0.03,
+      "train_acc": 0.9928,
+      "val_acc": 0.9906,
+      "per_category": {
+        "Education": 1.0,
+        "Entertainment": 1.0,
+        "Fees": 1.0,
+        "Groceries": 1.0,
+        "Healthcare": 0.967,
+        "Income": 1.0,
+        "Insurance": 0.996,
+        "Personal Care": 1.0,
+        "Rent": 1.0,
+        "Restaurants": 0.991,
+        "Shopping": 0.916,
+        "Subscription": 0.985,
+        "Transfer": 1.0,
+        "Transportation": 0.996,
+        "Travel": 1.0,
+        "Utilities": 1.0
+      },
+      "epoch_time_s": 23.2
+    }
+  ],
+  "final_per_category": {
+    "Education": {
+      "accuracy": 1.0,
+      "correct": 221,
+      "total": 221,
+      "top_confusions": {}
+    },
+    "Entertainment": {
+      "accuracy": 1.0,
+      "correct": 247,
+      "total": 247,
+      "top_confusions": {}
+    },
+    "Fees": {
+      "accuracy": 1.0,
+      "correct": 240,
+      "total": 240,
+      "top_confusions": {}
+    },
+    "Groceries": {
+      "accuracy": 0.995,
+      "correct": 213,
+      "total": 214,
+      "top_confusions": {
+        "Shopping": 1
+      }
+    },
+    "Healthcare": {
+      "accuracy": 0.971,
+      "correct": 235,
+      "total": 242,
+      "top_confusions": {
+        "Education": 2,
+        "Utilities": 2,
+        "Insurance": 1
+      }
+    },
+    "Income": {
+      "accuracy": 1.0,
+      "correct": 221,
+      "total": 221,
+      "top_confusions": {}
+    },
+    "Insurance": {
+      "accuracy": 0.996,
+      "correct": 230,
+      "total": 231,
+      "top_confusions": {
+        "Income": 1
+      }
+    },
+    "Personal Care": {
+      "accuracy": 1.0,
+      "correct": 232,
+      "total": 232,
+      "top_confusions": {}
+    },
+    "Rent": {
+      "accuracy": 1.0,
+      "correct": 232,
+      "total": 232,
+      "top_confusions": {}
+    },
+    "Restaurants": {
+      "accuracy": 0.991,
+      "correct": 224,
+      "total": 226,
+      "top_confusions": {
+        "Groceries": 2
+      }
+    },
+    "Shopping": {
+      "accuracy": 0.929,
+      "correct": 209,
+      "total": 225,
+      "top_confusions": {
+        "Personal Care": 4,
+        "Restaurants": 3,
+        "Travel": 3
+      }
+    },
+    "Subscription": {
+      "accuracy": 0.985,
+      "correct": 199,
+      "total": 202,
+      "top_confusions": {
+        "Education": 1,
+        "Personal Care": 1,
+        "Shopping": 1
+      }
+    },
+    "Transfer": {
+      "accuracy": 0.995,
+      "correct": 216,
+      "total": 217,
+      "top_confusions": {
+        "Shopping": 1
+      }
+    },
+    "Transportation": {
+      "accuracy": 0.992,
+      "correct": 246,
+      "total": 248,
+      "top_confusions": {
+        "Shopping": 1,
+        "Subscription": 1
+      }
+    },
+    "Travel": {
+      "accuracy": 1.0,
+      "correct": 191,
+      "total": 191,
+      "top_confusions": {}
+    },
+    "Utilities": {
+      "accuracy": 1.0,
+      "correct": 211,
+      "total": 211,
+      "top_confusions": {}
+    }
+  }
+}
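The headline numbers in this report can be rechecked from its own fields. The sketch below picks the best epoch by `val_acc` and recomputes overall accuracy from the `correct` and `total` counts in `final_per_category`. The helper names are hypothetical; the embedded values are copied from the report above, with only the last four epochs of `history` included for brevity.

```python
from typing import Dict, List


def best_epoch(history: List[dict]) -> dict:
    """Return the history entry with the highest validation accuracy."""
    return max(history, key=lambda entry: entry["val_acc"])


def overall_accuracy(final_per_category: Dict[str, dict]) -> float:
    """Pool correct/total counts across categories."""
    correct = sum(c["correct"] for c in final_per_category.values())
    total = sum(c["total"] for c in final_per_category.values())
    return correct / total


# Last four epochs from "history" (val_acc only).
HISTORY_TAIL = [
    {"epoch": 17, "val_acc": 0.9906},
    {"epoch": 18, "val_acc": 0.9908},
    {"epoch": 19, "val_acc": 0.9897},
    {"epoch": 20, "val_acc": 0.9906},
]

# (correct, total) pairs from "final_per_category".
_COUNTS = {
    "Education": (221, 221), "Entertainment": (247, 247), "Fees": (240, 240),
    "Groceries": (213, 214), "Healthcare": (235, 242), "Income": (221, 221),
    "Insurance": (230, 231), "Personal Care": (232, 232), "Rent": (232, 232),
    "Restaurants": (224, 226), "Shopping": (209, 225), "Subscription": (199, 202),
    "Transfer": (216, 217), "Transportation": (246, 248), "Travel": (191, 191),
    "Utilities": (211, 211),
}
FINAL = {name: {"correct": c, "total": t} for name, (c, t) in _COUNTS.items()}

print(best_epoch(HISTORY_TAIL)["epoch"])      # 18
print(round(overall_accuracy(FINAL), 4))      # 0.9908
```

The pooled counts come to 3,567 correct out of 3,600, which matches both the 3,600-sample validation split and the reported `best_val_accuracy` of 0.9908.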

vocab.txt ADDED

The diff for this file is too large to render. See raw diff