💳 Expense Tracker: DistilBERT LoRA v2 (4 Data Sources)

📦 Training Data

Source                                               Type                        Rows
engreemali/bank-transactions-sms-datasetss           Real Indian SMS (cleaned)   ~1,200
kumarperiya/pan-indian-consumer-transaction-dataset  Structured → synthetic SMS  ~600
ChatGPT synthetic_sms_5000 (fixed)                   Synthetic (augmented)       ~3,300
ChatGPT realistic_synthetic_sms (fixed)              Synthetic (realistic)       ~3,200

🏷️ Categories

ID Category
0 Education
1 Entertainment
2 Food
3 Healthcare
4 Shopping
5 Transport
6 Utilities
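
Downstream code may find it handy to have the table above as a lookup. The following is a minimal sketch, assuming the model's class indices follow the ID column. The names `ID2LABEL`, `LABEL2ID`, and `label_for` are illustrative and are not taken from the model's config.

```python
# Category IDs exactly as listed in the table above.
ID2LABEL = {
    0: "Education",
    1: "Entertainment",
    2: "Food",
    3: "Healthcare",
    4: "Shopping",
    5: "Transport",
    6: "Utilities",
}

# Reverse lookup: category name -> ID.
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}


def label_for(class_id: int) -> str:
    """Return the category name for a class ID (raises KeyError if unknown)."""
    return ID2LABEL[class_id]
```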

🚀 Usage

from transformers import pipeline
clf = pipeline('text-classification', model='udayugale/expense-tracker-distilbert-lora-v2')
print(clf('Netmeds medicine order rs 350 confirmed. Delivery in 2 hrs'))
# [{'label': 'Healthcare', 'score': 0.95}]
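
The pipeline returns a list of dicts shaped like the output above. Below is a minimal sketch of consuming that result. The helper `top_category` and the 0.6 confidence cutoff are assumptions made for illustration and are not part of this model card. It operates on plain dicts, so it can be tried without downloading the model.

```python
from typing import Dict, List, Optional


def top_category(predictions: List[Dict], min_score: float = 0.6) -> Optional[str]:
    """Return the highest-scoring label, or None if the list is empty
    or the best score falls below min_score (an arbitrary cutoff)."""
    if not predictions:
        return None
    best = max(predictions, key=lambda p: p["score"])
    return best["label"] if best["score"] >= min_score else None


# Shape copied from the example output above.
sample = [{"label": "Healthcare", "score": 0.95}]
print(top_category(sample))
```

Returning None for low-confidence predictions lets the caller route those messages to manual review instead of silently assigning a category.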

🔧 Fixes Applied to ChatGPT Data

  • Dropped Income and Others labels (not in expense categories)
  • Mapped Bills → Utilities
  • Dropped sender column from File 2 (2,376 sender-label mismatches)
  • Augmented short texts (< 7 words) with bank SMS context wrappers
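
The fixes listed above could be sketched roughly as follows. This is not the author's preprocessing script. The field names (`text`, `label`, `sender`), the `WRAPPER` template, and the sample rows are all hypothetical, since the card does not give the actual wrappers or schema. For simplicity the sketch drops the sender field from every row, whereas the card describes dropping it from File 2 only.

```python
from typing import Dict, List, Optional

DROPPED_LABELS = {"Income", "Others"}   # not expense categories
LABEL_REMAP = {"Bills": "Utilities"}    # Bills -> Utilities
MIN_WORDS = 7                           # texts shorter than this get wrapped

# Hypothetical bank-SMS wrapper; the real templates are not given in the card.
WRAPPER = "Dear customer, your a/c was debited. Info: {text}. Avl bal updated."


def clean_row(row: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Apply the listed fixes to one row; return None if the row is dropped."""
    label = row["label"]
    if label in DROPPED_LABELS:
        return None
    label = LABEL_REMAP.get(label, label)

    text = row["text"].strip()
    if len(text.split()) < MIN_WORDS:
        text = WRAPPER.format(text=text)

    # Keeping only text and label discards the sender field.
    return {"text": text, "label": label}


def clean_dataset(rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Clean every row and keep only the ones that were not dropped."""
    cleaned = (clean_row(r) for r in rows)
    return [r for r in cleaned if r is not None]


# Made-up rows for illustration only.
rows = [
    {"text": "Salary credited", "label": "Income", "sender": "HDFCBK"},
    {"text": "Electricity bill paid", "label": "Bills", "sender": "ZOMATO"},
    {"text": "Rs 350 spent at Netmeds for medicine order today",
     "label": "Healthcare", "sender": "SBIINB"},
]
cleaned_rows = clean_dataset(rows)
```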