
FinBERT-Multilingual-Intent-and-Sentiment

Overview

FinBERT-Multilingual-Intent-and-Sentiment is a BERT-based model fine-tuned for joint multi-label classification on financial customer service messages. It is designed to simultaneously classify the Intent (e.g., 'Account_Access_Issue', 'Mortgage_Inquiry') and the Sentiment/Urgency (e.g., 'Negative', 'Anxious', 'Positive') of a message. The model is trained on a multilingual corpus, supporting English (en), Spanish (es), French (fr), German (de), and Portuguese (pt).

The model leverages the robustness of a multilingual BERT base and adapts it specifically for the high-stakes financial domain, making it ideal for automating customer service routing and prioritization.

Model Architecture

The model is built upon the bert-base-multilingual-cased backbone.

  • Base Model: BERT (Bidirectional Encoder Representations from Transformers)
  • Task: Multi-label Sequence Classification (BertForSequenceClassification)
  • Input Languages: English, Spanish, French, German, Portuguese
  • Output: 8 classification labels (4 Intent labels, 4 Sentiment/Urgency labels). The model is optimized with a Sigmoid activation function and Binary Cross-Entropy Loss to handle independent multi-label prediction (see the configuration sketch after this list).
  • Max Sequence Length: 512 tokens
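
For reference, this output head matches the standard Hugging Face multi-label setup (problem_type="multi_label_classification", which pairs sigmoid outputs with BCEWithLogitsLoss). The sketch below shows how such a head could be configured on the bert-base-multilingual-cased backbone; the eight label names are illustrative placeholders inferred from the examples in this card, and the released checkpoint already ships with its own label mapping.

from transformers import AutoConfig, AutoModelForSequenceClassification

# Illustrative label set: 4 intent + 4 sentiment/urgency labels (placeholders, not the official mapping)
labels = [
    "intent: Account_Access_Issue", "intent: Mortgage_Inquiry",
    "intent: Transfer_Delay_Complaint", "intent: Credit_Limit_Increase_Request",
    "sentiment: Negative", "sentiment: Anxious", "sentiment: Positive", "sentiment: Neutral",
]

config = AutoConfig.from_pretrained(
    "bert-base-multilingual-cased",
    num_labels=len(labels),
    problem_type="multi_label_classification",  # sigmoid outputs + BCEWithLogitsLoss
    id2label=dict(enumerate(labels)),
    label2id={label: i for i, label in enumerate(labels)},
)
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", config=config
)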

Intended Use

  • Automated Customer Service Routing: Directing messages based on intent (e.g., Fraud alerts go to the security team, Mortgage inquiries go to the loan department); a hypothetical routing sketch follows this list.
  • Prioritization: Flagging messages with 'Negative' or 'Anxious' sentiment for urgent human intervention.
  • Financial Market Monitoring: Analyzing sentiment in multilingual financial news or social media snippets.
  • Research: As a strong baseline for transfer learning in related low-resource financial NLP tasks.
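
As a concrete illustration of the routing and prioritization use cases, the snippet below maps predicted labels to a destination queue and an urgency flag. The INTENT_ROUTES table and the route_message helper are hypothetical examples, not part of the model or its API.

# Hypothetical routing table: intent label -> destination queue (illustrative only)
INTENT_ROUTES = {
    "intent: Account_Access_Issue": "security_team",
    "intent: Mortgage_Inquiry": "loan_department",
    "intent: Transfer_Delay_Complaint": "payments_team",
    "intent: Credit_Limit_Increase_Request": "cards_team",
}
URGENT_SENTIMENTS = {"sentiment: Negative", "sentiment: Anxious"}

def route_message(predicted_labels):
    # Pick the first recognized intent; fall back to a general inbox
    queue = next((INTENT_ROUTES[l] for l in predicted_labels if l in INTENT_ROUTES), "general_inbox")
    # Flag for urgent human intervention if a negative/anxious sentiment was predicted
    urgent = any(l in URGENT_SENTIMENTS for l in predicted_labels)
    return queue, urgent

print(route_message(["intent: Transfer_Delay_Complaint", "sentiment: Negative"]))
# ('payments_team', True)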

Limitations

  • Domain Specificity: The model performs best on messages related to banking, trading, loans, and financial services. Performance degrades on general-domain text and on highly specialized material (e.g., deep quantitative finance).
  • Language Scope: While multilingual, it is limited to the five languages specified in the training set. Performance on other languages is not guaranteed.
  • Multi-label Ambiguity: Although the model is trained for multi-label prediction, messages with mixed intents or sarcasm may produce lower confidence scores.

Example Code (PyTorch)

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "Finance/FinBERT-Multilingual-Intent-and-Sentiment"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example 1: Spanish Complaint
text_es = "La transferencia que hice no ha llegado. Es un servicio terrible."
# Example 2: English Inquiry
text_en = "I need to increase my credit card limit before I travel next week."

# Process both texts
inputs = tokenizer([text_es, text_en], return_tensors="pt", padding=True, truncation=True)
with torch.no_grad():
    outputs = model(**inputs)

# Get predictions
logits = outputs.logits
probabilities = torch.sigmoid(logits)

# Apply threshold (e.g., 0.5) to get multi-label predictions
threshold = 0.5
predictions = (probabilities > threshold).int()

# Map predictions back to labels
labels = model.config.id2label
predicted_labels = []
for p in predictions:
    active_labels = [labels[i] for i, val in enumerate(p) if val == 1]
    predicted_labels.append(active_labels)

print(f"Spanish Message Labels: {predicted_labels[0]}")
# Expected output: ['intent: Transfer_Delay_Complaint', 'sentiment: Negative']
print(f"English Message Labels: {predicted_labels[1]}")
# Expected output: ['intent: Credit_Limit_Increase_Request', 'sentiment: Neutral']
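
For quick experiments, the same multi-label scoring can also be run through the transformers text-classification pipeline, continuing from the model_name, text_es, and text_en variables defined above. Passing top_k=None returns a score for every label, and function_to_apply="sigmoid" keeps the per-label probabilities independent, so the same 0.5 threshold applies; this is a convenience sketch rather than a second official usage path.

from transformers import pipeline

# Multi-label scoring via the text-classification pipeline (convenience sketch)
classifier = pipeline(
    "text-classification",
    model=model_name,
    top_k=None,                   # return scores for all labels
    function_to_apply="sigmoid",  # independent per-label probabilities
)

for result in classifier([text_es, text_en]):
    active = [r["label"] for r in result if r["score"] > 0.5]
    print(active)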