---
license: apache-2.0
language:
  - en
  - zh
  - ja
  - de
  - fr
  - es
tags:
  - finance
  - sentiment-analysis
  - multilingual
  - xlm-roberta
  - finbert
datasets:
  - Kenpache/multilingual-financial-sentiment
metrics:
  - accuracy
  - f1
pipeline_tag: text-classification
model-index:
  - name: FinBERT-Multilingual
    results:
      - task:
          type: text-classification
          name: Financial Sentiment Analysis
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.8103
          - name: F1 (weighted)
            type: f1
            value: 0.8102
---

# FinBERT-Multilingual

A multilingual extension of the FinBERT paradigm: a domain-adapted transformer for financial sentiment classification across six languages (EN, ZH, JA, DE, FR, ES).

While the original [FinBERT](https://arxiv.org/abs/1908.10063) demonstrated the effectiveness of domain-specific pre-training for English financial NLP, this model extends that approach to a multilingual setting using XLM-RoBERTa-base as the backbone, enabling cross-lingual financial sentiment analysis without language-specific models.

## Model Architecture

- **Base model:** `xlm-roberta-base` (278M parameters)
- **Task:** 3-class sequence classification (Negative / Neutral / Positive)
- **Domain adaptation:** Task-Adaptive Pre-Training (TAPT) via Masked Language Modeling on 35K+ financial texts
- **Languages:** English, Chinese, Japanese, German, French, Spanish

## Training Pipeline

### Stage 1: Task-Adaptive Pre-Training (TAPT)

Following [Gururangan et al. (2020)](https://arxiv.org/abs/2004.10964), we perform continued MLM pre-training on the unlabeled financial corpus to adapt the model's representations to the financial domain. This stage exposes the model to domain-specific vocabulary and discourse patterns across all six target languages using approximately 35,000 financial text samples.
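
The MLM objective at the heart of TAPT can be illustrated with the standard BERT-style dynamic masking rule (15% of tokens become prediction targets; of those, 80% are replaced by the mask token, 10% by a random token, and 10% left unchanged). The exact masking recipe used for this model is not documented, so this is a minimal sketch in plain PyTorch:

```python
import torch

def mask_tokens(input_ids, mask_token_id, vocab_size, mlm_prob=0.15):
    """BERT-style dynamic masking: select ~15% of positions as MLM targets;
    of those, 80% -> [MASK], 10% -> random token, 10% -> unchanged."""
    input_ids = input_ids.clone()
    labels = input_ids.clone()

    # Choose which positions the model must predict
    selected = torch.bernoulli(torch.full(labels.shape, mlm_prob)).bool()
    labels[~selected] = -100  # -100 is ignored by cross-entropy loss

    # 80% of selected positions are replaced with the mask token
    to_mask = torch.bernoulli(torch.full(labels.shape, 0.8)).bool() & selected
    input_ids[to_mask] = mask_token_id

    # Half of the remaining 20% get a random token; the rest stay unchanged
    to_random = (torch.bernoulli(torch.full(labels.shape, 0.5)).bool()
                 & selected & ~to_mask)
    input_ids[to_random] = torch.randint(vocab_size, labels.shape)[to_random]
    return input_ids, labels

# Toy batch; real usage would tokenize financial text with the model tokenizer
ids = torch.randint(5, 100, (2, 16))
corrupted, labels = mask_tokens(ids, mask_token_id=4, vocab_size=100)
```

In practice this is what `transformers`' `DataCollatorForLanguageModeling` does per batch, so each epoch sees a different masking of the same corpus.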

### Stage 2: Supervised Fine-Tuning

The domain-adapted model is then fine-tuned on the labeled sentiment classification task.

**Hyperparameters:**

| Parameter | Value |
|---|---|
| Learning rate | 2e-5 |
| LR scheduler | Cosine annealing |
| Label smoothing | 0.1 |
| Checkpoint selection | SWA (top-3 checkpoints) |
| Base model | xlm-roberta-base |

**Stochastic Weight Averaging (SWA):** Rather than selecting a single best checkpoint, we average the weights of the top-3 performing checkpoints. This produces a flatter loss minimum and more robust generalization, particularly beneficial for multilingual settings where overfitting to dominant languages is a risk.
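
Averaging the top-3 checkpoints reduces to an element-wise mean over their state dicts. A minimal sketch (the checkpoint paths and the metric used to rank them are assumptions, not documented details of this model's training run):

```python
import torch

def average_checkpoints(state_dicts):
    """Element-wise mean of several state dicts sharing identical keys/shapes."""
    avg = {}
    for key in state_dicts[0]:
        stacked = torch.stack([sd[key].float() for sd in state_dicts])
        avg[key] = stacked.mean(dim=0)
    return avg

# Toy demonstration: three "checkpoints" of a tiny linear layer stand in for
# the top-3 fine-tuning checkpoints ranked by dev-set F1
ckpts = [torch.nn.Linear(4, 3).state_dict() for _ in range(3)]
averaged = average_checkpoints(ckpts)
# model.load_state_dict(averaged)  # then re-evaluate the averaged model
```

The averaged weights should always be re-validated: averaging is only beneficial when the checkpoints sit in the same loss basin, which is typically the case for checkpoints from one fine-tuning run.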

**Label smoothing (0.1):** Prevents overconfident predictions and improves calibration, which is important for financial applications where prediction confidence informs downstream decisions.
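
With smoothing of 0.1, each target redistributes a tenth of its probability mass uniformly across the classes instead of placing everything on the gold label. PyTorch supports this directly; the snippet below illustrates the effect rather than reproducing the actual training code:

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[4.0, 0.5, -1.0]])  # confident prediction for class 0
target = torch.tensor([0])

hard = F.cross_entropy(logits, target)                         # standard CE
smooth = F.cross_entropy(logits, target, label_smoothing=0.1)  # smoothed CE

# Smoothing penalizes overconfidence: for a confidently correct prediction,
# the smoothed loss is strictly higher than the hard-label loss.
```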

## Evaluation Results

### Overall Metrics

| Metric | Score |
|---|---|
| Accuracy | 0.8103 |
| F1 (weighted) | 0.8102 |
| Precision (weighted) | 0.8111 |
| Recall (weighted) | 0.8103 |

### Per-Class Performance

| Class | Precision | Recall | F1-Score |
|---|---|---|---|
| Negative | 0.78 | 0.83 | 0.81 |
| Neutral | 0.83 | 0.79 | 0.81 |
| Positive | 0.80 | 0.82 | 0.81 |

The balanced per-class performance (all F1 scores at 0.81) indicates that the model does not exhibit significant class bias, despite the imbalanced training distribution (Neutral: 45.5%, Positive: 30.8%, Negative: 23.7%).
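
As a sanity check on what "weighted" means here: the weighted F1 is the support-weighted mean of the per-class F1 scores, using each class's share of the evaluation set as its weight. With the shares above it reproduces the reported overall figure to rounding:

```python
# Per-class F1 from the table and class shares from the label distribution
f1 = {"negative": 0.81, "neutral": 0.81, "positive": 0.81}
share = {"negative": 0.237, "neutral": 0.455, "positive": 0.308}

weighted_f1 = sum(f1[c] * share[c] for c in f1)
print(round(weighted_f1, 2))  # 0.81, consistent with the reported 0.8102
```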

## Usage

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="Kenpache/finbert-multilingual")

# English
classifier("The company reported record quarterly earnings, driven by strong demand.")
# [{'label': 'positive', 'score': 0.95}]

# German
classifier("Die Aktie verlor nach der Gewinnwarnung deutlich an Wert.")
# [{'label': 'negative', 'score': 0.92}]

# Japanese
classifier("同社の売上高は前年同期比で横ばいとなった。")
# [{'label': 'neutral', 'score': 0.88}]

# Chinese
classifier("该公司宣布大规模裁员计划,股价应声下跌。")
# [{'label': 'negative', 'score': 0.91}]
```

### Direct Model Loading

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("Kenpache/finbert-multilingual")
model = AutoModelForSequenceClassification.from_pretrained("Kenpache/finbert-multilingual")

text = "Les bénéfices du groupe ont augmenté de 15% au premier trimestre."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=-1)
    pred = torch.argmax(probs, dim=-1).item()

labels = {0: "negative", 1: "neutral", 2: "positive"}
print(f"Prediction: {labels[pred]} ({probs[0][pred]:.4f})")
```

## Training Data

The model was trained on [Kenpache/multilingual-financial-sentiment](https://huggingface.co/datasets/Kenpache/multilingual-financial-sentiment), a curated dataset of ~39K financial news sentences from 80+ sources across six languages.

| Language | Samples | Sources |
|---|---|---|
| Japanese | 8,287 | Nikkei, Nikkan Kogyo, Reuters JP, Minkabu, etc. |
| Chinese | 7,930 | Sina Finance, EastMoney, 10jqka, etc. |
| Spanish | 7,125 | Expansión, Cinco Días, Bloomberg Línea, etc. |
| English | 6,887 | CNBC, Yahoo Finance, Fortune, Benzinga, etc. |
| German | 5,023 | Börse.de, FAZ, NTV Börse, Handelsblatt, etc. |
| French | 3,935 | Boursorama, Tradingsat, BFM Business, etc. |

## Comparison with FinBERT

| Feature | FinBERT | FinBERT-Multilingual |
|---|---|---|
| Base model | BERT-base | XLM-RoBERTa-base |
| Languages | English only | 6 languages |
| Domain adaptation | Financial corpus pre-training | TAPT on multilingual financial texts |
| Classes | 3 (Pos/Neg/Neu) | 3 (Pos/Neg/Neu) |
| Checkpoint selection | Single best | SWA (top-3) |

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{finbert-multilingual-2025,
  title={FinBERT-Multilingual: Cross-Lingual Financial Sentiment Analysis with Domain-Adapted XLM-RoBERTa},
  author={Kenpache},
  year={2025},
  url={https://huggingface.co/Kenpache/finbert-multilingual}
}
```

## License

Apache 2.0