---
language: en
tags:
- transaction-categorization
- distilbert
- lora
- peft
- finance
- text-classification
datasets:
- mitulshah/transaction-categorization
license: apache-2.0
---
# Transaction Category Classifier - LoRA Version
This is a **LoRA adapter** for DistilBERT that classifies bank transactions into 10 categories with **98.53% accuracy**.
## Model Details
- **Base Model:** [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased)
- **Fine-tuned Model:** [finmigodeveloper/distilbert-transaction-classifier](https://huggingface.co/finmigodeveloper/distilbert-transaction-classifier)
- **Adapter Size:** ~2.5 MB (98.7% smaller than the full fine-tuned model)
- **Categories:** 10 transaction types
## Performance
| Metric | Value |
|--------|-------|
| Accuracy | 98.53% |
| Loss | 0.0221 |
| Training Samples | 80,000 |
| Validation Samples | 20,000 |
## Categories
- Charity & Donations
- Entertainment & Recreation
- Financial Services
- Food & Dining
- Government & Legal
- Healthcare & Medical
- Income
- Shopping & Retail
- Transportation
- Utilities & Services
## How to Use
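```python
from transformers import pipeline

# Load the adapter directly by repository ID.
# Loading an adapter repo this way relies on the transformers PEFT
# integration, so the `peft` package needs to be installed.
classifier = pipeline(
    "text-classification",
    model="finmigodeveloper/distilbert-transaction-classifier-lora",
)

# Classify a few example transaction descriptions
transactions = [
    "Starbucks coffee",
    "Monthly salary deposit",
    "Uber ride to airport",
]
for text in transactions:
    # The pipeline returns a list with one dict per input text
    result = classifier(text)[0]
    print(f"{text}: {result['label']} ({result['score']:.2%})")
```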
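The pipeline returns one `{"label": ..., "score": ...}` dict per input, so its output is easy to post-process. The sketch below shows one possible way to flag low-confidence predictions for manual review. The helper `flag_low_confidence` and the `REVIEW_THRESHOLD` value are hypothetical and not part of this repository, and the scores in the example are made-up stand-ins rather than real model outputs.

```python
from typing import Dict, List, Union

Prediction = Dict[str, Union[str, float]]

# Hypothetical cut-off chosen for illustration; tune it on your own data.
REVIEW_THRESHOLD = 0.80


def flag_low_confidence(
    texts: List[str],
    predictions: List[Prediction],
    threshold: float = REVIEW_THRESHOLD,
) -> List[dict]:
    """Pair each text with its prediction and mark scores below the threshold."""
    rows = []
    for text, pred in zip(texts, predictions):
        rows.append(
            {
                "text": text,
                "label": pred["label"],
                "score": pred["score"],
                "needs_review": pred["score"] < threshold,
            }
        )
    return rows


# Stand-in predictions in the pipeline's output format.
# In real use this would be: predictions = classifier(texts)
texts = ["Starbucks coffee", "ACH TRANSFER 00123"]
predictions = [
    {"label": "Food & Dining", "score": 0.99},
    {"label": "Financial Services", "score": 0.55},
]

rows = flag_low_confidence(texts, predictions)
for row in rows:
    marker = " [review]" if row["needs_review"] else ""
    print(f"{row['text']}: {row['label']} ({row['score']:.2%}){marker}")
```

Passing the whole list to `classifier(texts)` in one call, rather than looping, lets the pipeline handle the inputs together.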
## Training Details
- **LoRA Rank (r):** 8
- **LoRA Alpha:** 16
- **Target Modules:** q_lin, k_lin, v_lin, out_lin
- **Dropout:** 0.1
- **Epochs:** 3
- **Batch Size:** 64
- **Learning Rate:** 2e-5
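To give a sense of scale for these settings, the following sketch counts the parameters that LoRA adds with rank 8 on the four target modules. It assumes the standard `distilbert-base-uncased` shapes (6 transformer layers, hidden size 768, each target module a 768-to-768 linear layer). It counts only the low-rank matrices, not the classification head or any other weights stored with the adapter, so it is not an estimate of the adapter file size.

```python
# Shapes of distilbert-base-uncased (assumed here, not read from this repo):
# 6 transformer layers, hidden size 768, 768x768 attention projections.
NUM_LAYERS = 6
HIDDEN_SIZE = 768
TARGET_MODULES = ["q_lin", "k_lin", "v_lin", "out_lin"]

# Values from the Training Details list.
LORA_RANK = 8
LORA_ALPHA = 16


def lora_params_per_module(in_features: int, out_features: int, rank: int) -> int:
    """LoRA adds A (rank x in_features) and B (out_features x rank) per wrapped layer."""
    return rank * in_features + out_features * rank


per_module = lora_params_per_module(HIDDEN_SIZE, HIDDEN_SIZE, LORA_RANK)
total_lora_params = per_module * len(TARGET_MODULES) * NUM_LAYERS

# With the default (non rank-stabilized) LoRA scaling, the low-rank
# update B @ A is multiplied by alpha / r before being added.
scaling = LORA_ALPHA / LORA_RANK

print(f"LoRA parameters per module: {per_module:,}")         # 12,288
print(f"LoRA parameters in total:   {total_lora_params:,}")  # 294,912
print(f"Scaling factor (alpha / r): {scaling}")              # 2.0
```

For comparison, `distilbert-base-uncased` has roughly 66 million parameters, so the low-rank matrices amount to well under 1% of the base model.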
## Why LoRA?
- **98.7% smaller** than the full model
- **Faster loading** of the adapter weights (~0.3 seconds vs. 2-3 seconds for the full model)
- **Same accuracy** as the full model
- Well suited to **mobile apps** and **edge deployment**, where download size matters (the base model must still be available at inference time)
## Files in this repository
- `adapter_model.safetensors`: The LoRA adapter weights (2.5 MB)
- `adapter_config.json`: LoRA configuration
- `training_stats.json`: Detailed training statistics
- `tokenizer.json` & `tokenizer_config.json`: Tokenizer files