
	---
	license: apache-2.0
	language:
	- en
	metrics:
	- character
	base_model:
	- google-bert/bert-base-uncased
	pipeline_tag: text-classification
	---

	# Fine-Tuned BERT for Transaction Categorization

	This is a fine-tuned [BERT model](https://huggingface.co/transformers/model_doc/bert.html) that categorizes financial transactions into predefined categories. It was trained on a dataset of English transaction descriptions and assigns each description a single label such as "Groceries," "Transportation," or "Entertainment."

	## Model Details

	- Base Model: [bert-base-uncased](https://huggingface.co/bert-base-uncased).
	- Fine-Tuning Task: Transaction Categorization (multi-class classification).
	- Languages: English.

	### Example Categories
	The model classifies transactions into categories such as:
	```python
	CATEGORIES = {
	    0: "Utilities",
	    1: "Health",
	    2: "Dining",
	    3: "Travel",
	    4: "Education",
	    5: "Subscription",
	    6: "Family",
	    7: "Food",
	    8: "Festivals",
	    9: "Culture",
	    10: "Apparel",
	    11: "Transportation",
	    12: "Investment",
	    13: "Shopping",
	    14: "Groceries",
	    15: "Documents",
	    16: "Grooming",
	    17: "Entertainment",
	    18: "Social Life",
	    19: "Beauty",
	    20: "Rent",
	    21: "Money transfer",
	    22: "Salary",
	    23: "Tourism",
	    24: "Household",
	}
	```
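
The model returns one score (logit) per class, so a lookup in this mapping turns a predicted index into a readable label. The sketch below is illustrative and not part of the released model: it assumes the indices above line up with the model's output positions, and the helper name `top_categories`, the three-entry excerpt, and the sample logits are all made up for the example. It applies a softmax to the scores and returns the most likely labels with their probabilities, which can be more informative than a single argmax when two categories (say "Dining" and "Food") are close.

```python
import math


def top_categories(logits, id_to_label, k=3):
    """Return the k most likely (label, probability) pairs for one transaction."""
    # Numerically stable softmax: subtract the largest logit before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank class indices by probability, highest first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id_to_label[i], probs[i]) for i in ranked[:k]]


# Hypothetical data: a three-class excerpt of the mapping and invented logits.
excerpt = {0: "Utilities", 1: "Health", 2: "Dining"}
sample_logits = [0.2, -1.3, 2.5]

for label, prob in top_categories(sample_logits, excerpt, k=2):
    print(f"{label}: {prob:.3f}")
```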
	---

	## How to Use the Model

	To use this model, you can load it directly with Hugging Face's `transformers` library:

	```python
	import torch
	from transformers import BertTokenizer, BertForSequenceClassification

	# Load the fine-tuned tokenizer and model from the Hugging Face Hub
	model_name = "kuro-08/bert-transaction-categorization"
	tokenizer = BertTokenizer.from_pretrained(model_name)
	model = BertForSequenceClassification.from_pretrained(model_name)
	model.eval()  # inference mode (disables dropout)

	# Sample transaction description
	transaction = "Transaction: Payment at Starbucks for coffee - Type: income/expense"
	inputs = tokenizer(transaction, return_tensors="pt", truncation=True, padding=True)

	# Predict the category index (gradients are not needed for inference)
	with torch.no_grad():
	    outputs = model(**inputs)
	logits = outputs.logits
	predicted_category = logits.argmax(-1).item()

	# Integer class index; see the category mapping above for the corresponding name
	print(f"Predicted category: {predicted_category}")