---
license: apache-2.0
language:
- en
metrics:
- character
base_model:
- google-bert/bert-base-uncased
pipeline_tag: text-classification
---

# Fine-Tuned BERT for Transaction Categorization

This is a fine-tuned [BERT model](https://huggingface.co/transformers/model_doc/bert.html) trained to categorize financial transactions into predefined categories. It was trained on a dataset of English transaction descriptions and classifies them into categories such as "Groceries", "Transportation", and "Entertainment".

## Model Details

- **Base Model**: [bert-base-uncased](https://huggingface.co/bert-base-uncased).
- **Fine-Tuning Task**: Transaction Categorization (multi-class classification).
- **Languages**: English.

### Example Categories

The model classifies transactions into categories such as:

```python
CATEGORIES = {
    0: "Utilities",
    1: "Health",
    2: "Dining",
    3: "Travel",
    4: "Education",
    5: "Subscription",
    6: "Family",
    7: "Food",
    8: "Festivals",
    9: "Culture",
    10: "Apparel",
    11: "Transportation",
    12: "Investment",
    13: "Shopping",
    14: "Groceries",
    15: "Documents",
    16: "Grooming",
    17: "Entertainment",
    18: "Social Life",
    19: "Beauty",
    20: "Rent",
    21: "Money transfer",
    22: "Salary",
    23: "Tourism",
    24: "Household",
}
```

---

## How to Use the Model

To use this model, load it directly with Hugging Face's `transformers` library:

```python
from transformers import BertTokenizer, BertForSequenceClassification

# Load the model
model_name = "kuro-08/bert-transaction-categorization"
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForSequenceClassification.from_pretrained(model_name)

# Sample transaction description
transaction = "Transaction: Payment at Starbucks for coffee - Type: income/expense"
inputs = tokenizer(transaction, return_tensors="pt", truncation=True, padding=True)

# Predict the category
outputs = model(**inputs)
logits = outputs.logits
predicted_category = logits.argmax(-1).item()

print(f"Predicted category: {predicted_category}")
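
# The value printed above is an integer index, not a category name.
# A minimal sketch of turning it into a readable label, assuming the model's
# output indices follow the CATEGORIES mapping listed under "Example Categories"
# (an assumption; the model's own config is not consulted here).
# `category_name` is a hypothetical helper introduced only for illustration.
def category_name(index, categories):
    """Return the label for `index`, or a placeholder if the index is not in the mapping."""
    return categories.get(index, f"unknown ({index})")

# Only a few entries of the mapping are repeated here to keep the sketch short.
example_categories = {2: "Dining", 7: "Food", 14: "Groceries"}
print(category_name(14, example_categories))

# With the full mapping in scope, the prediction could be decoded as:
#   category_name(predicted_category, CATEGORIES)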