---
language: en
tags:
- transaction-categorization
- distilbert
- lora
- peft
- finance
- text-classification
datasets:
- mitulshah/transaction-categorization
license: apache-2.0
---

# Transaction Category Classifier - LoRA Version

This is a **LoRA adapter** for DistilBERT that classifies bank transactions into 10 categories with **98.53% accuracy**.

## Model Details

- **Base Model:** [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased)
- **Fine-tuned Model:** [finmigodeveloper/distilbert-transaction-classifier](https://huggingface.co/finmigodeveloper/distilbert-transaction-classifier)
- **Adapter Size:** ~2.5 MB (98.7% smaller than the full model)
- **Categories:** 10 transaction types

## Performance

| Metric | Value |
|--------|-------|
| Accuracy | 98.53% |
| Loss | 0.0221 |
| Training Samples | 80,000 |
| Validation Samples | 20,000 |

## Categories

- Charity & Donations
- Entertainment & Recreation
- Financial Services
- Food & Dining
- Government & Legal
- Healthcare & Medical
- Income
- Shopping & Retail
- Transportation
- Utilities & Services

## How to Use

```python
from transformers import pipeline

# Load the adapter repo directly.
# Note: this relies on the `peft` package being installed alongside `transformers`.
classifier = pipeline(
    "text-classification",
    model="finmigodeveloper/distilbert-transaction-classifier-lora",
)

# Test it
transactions = [
    "Starbucks coffee",
    "Monthly salary deposit",
    "Uber ride to airport",
]

for text in transactions:
    result = classifier(text)[0]
    print(f"{text}: {result['label']} ({result['score']:.2%})")
```
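
If you prefer to attach the adapter explicitly rather than through `pipeline`, the following is a minimal sketch using the PEFT library. It assumes `torch`, `transformers`, and `peft` are installed, that the base model named in `adapter_config.json` is reachable, and that class indices follow the order of the Categories list in this card. The index order and the helper names (`softmax`, `top_category`, `classify`) are illustrative assumptions, not something this repository confirms.

```python
import math

ADAPTER_ID = "finmigodeveloper/distilbert-transaction-classifier-lora"

# ASSUMPTION: class indices follow the order of the Categories list in this
# card. Check the model config (id2label) before relying on this mapping.
CATEGORIES = [
    "Charity & Donations",
    "Entertainment & Recreation",
    "Financial Services",
    "Food & Dining",
    "Government & Legal",
    "Healthcare & Medical",
    "Income",
    "Shopping & Retail",
    "Transportation",
    "Utilities & Services",
]


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_category(logits):
    """Return (category, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CATEGORIES[best], probs[best]


def classify(texts, adapter_id=ADAPTER_ID):
    """Load the base model plus adapter with PEFT and classify a list of strings."""
    # Imported here so the helpers above work without the heavy dependencies.
    import torch
    from peft import PeftConfig, PeftModel
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    # The adapter config records which base checkpoint the adapter was trained on.
    peft_config = PeftConfig.from_pretrained(adapter_id)
    base_model = AutoModelForSequenceClassification.from_pretrained(
        peft_config.base_model_name_or_path, num_labels=len(CATEGORIES)
    )
    model = PeftModel.from_pretrained(base_model, adapter_id).eval()
    tokenizer = AutoTokenizer.from_pretrained(adapter_id)

    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**batch).logits
    return [top_category(row.tolist()) for row in logits]
```

Calling `classify(["Starbucks coffee", "Uber ride to airport"])` should return one `(category, probability)` tuple per input. Loading the pieces separately like this makes it easier to swap the base checkpoint or merge the adapter later.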

## Training Details

- **LoRA Rank (r):** 8
- **LoRA Alpha:** 16
- **Target Modules:** q_lin, k_lin, v_lin, out_lin
- **Dropout:** 0.1
- **Epochs:** 3
- **Batch Size:** 64
- **Learning Rate:** 2e-5
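
To make these settings concrete, here is a rough sketch of how many parameters the LoRA matrices add under this configuration. It assumes the standard `distilbert-base-uncased` shape (6 transformer layers, hidden size 768) and 32-bit weights; neither is stated in this card, and the helper name is illustrative.

```python
# Rough LoRA parameter count for the configuration listed above.
# ASSUMPTIONS (not stated in this card): distilbert-base-uncased has
# 6 transformer layers with hidden size 768, and weights are stored as float32.

HIDDEN_SIZE = 768
NUM_LAYERS = 6
TARGET_MODULES = ["q_lin", "k_lin", "v_lin", "out_lin"]
LORA_RANK = 8
LORA_ALPHA = 16
BYTES_PER_PARAM = 4


def lora_param_count(d_in, d_out, rank):
    """Parameters in one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * d_in + d_out * rank


per_module = lora_param_count(HIDDEN_SIZE, HIDDEN_SIZE, LORA_RANK)
total_params = per_module * len(TARGET_MODULES) * NUM_LAYERS
size_mb = total_params * BYTES_PER_PARAM / 1e6
scaling = LORA_ALPHA / LORA_RANK  # factor applied to the low-rank update

print(f"LoRA parameters per module: {per_module:,}")
print(f"Total LoRA parameters:      {total_params:,}")
print(f"Approximate size (float32): {size_mb:.2f} MB")
print(f"Scaling factor (alpha / r): {scaling}")
```

Under these assumptions the LoRA matrices come to roughly 295k parameters, or about 1.2 MB. The adapter file may also contain other saved tensors, which this sketch does not attempt to account for.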

## Why LoRA?

- **98.7% smaller** than the full model (adapter weights only; the base model is still loaded at inference)
- **Faster loading** (~0.3 seconds vs 2-3 seconds)
- **Same accuracy** as the full model
- Well suited to **mobile apps** and **edge deployment**

## Files in This Repository

- `adapter_model.safetensors`: The LoRA adapter weights (2.5 MB)
- `adapter_config.json`: LoRA configuration
- `training_stats.json`: Detailed training statistics
- `tokenizer.json` & `tokenizer_config.json`: Tokenizer files
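
To check the adapter settings without loading any weights, you can read `adapter_config.json` directly. The sketch below assumes the file follows the usual PEFT layout (keys such as `r`, `lora_alpha`, `lora_dropout`, `target_modules`, and `base_model_name_or_path`) and that you have downloaded it to a local path. The function name and path are hypothetical.

```python
import json
from pathlib import Path


def summarize_adapter_config(path):
    """Return the main LoRA settings from a PEFT-style adapter_config.json."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    keys = [
        "base_model_name_or_path",
        "r",
        "lora_alpha",
        "lora_dropout",
        "target_modules",
    ]
    # Missing keys come back as None rather than raising an error.
    return {key: config.get(key) for key in keys}


if __name__ == "__main__":
    # Hypothetical local path; download the file from the repository first.
    config_path = Path("adapter_config.json")
    if config_path.exists():
        for key, value in summarize_adapter_config(config_path).items():
            print(f"{key}: {value}")
```

The printed values should line up with the Training Details section; a mismatch would suggest the card and the uploaded adapter are out of sync.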