---
license: apache-2.0
language:
  - en
  - zh
  - ja
  - de
  - fr
  - es
tags:
  - finance
  - sentiment-analysis
  - multilingual
  - xlm-roberta
  - financial-nlp
  - stock-market
  - trading
datasets:
  - Kenpache/multilingual-financial-sentiment
metrics:
  - accuracy
  - f1
pipeline_tag: text-classification
model-index:
  - name: FLAME
    results:
      - task:
          type: text-classification
          name: Financial Sentiment Analysis
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.8103
          - name: F1 (weighted)
            type: f1
            value: 0.8102
---

# FLAME — Financial Language Analysis for Multilingual Economics

**One model. Six languages. Real financial sentiment.**

FLAME classifies financial text as **Negative**, **Neutral**, or **Positive** across English, Chinese, Japanese, German, French, and Spanish — in a single model, no language detection needed.

Built on XLM-RoBERTa with domain-adaptive pretraining on 35K+ financial texts, then fine-tuned on ~39K real financial news samples from 80+ sources worldwide.

## Quick Start

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="Kenpache/flame")

# English
classifier("Apple reported record quarterly revenue of $124 billion, up 11% year over year.")
# [{'label': 'Positive', 'score': 0.96}]

# Chinese
classifier("该公司季度亏损扩大至5亿美元,远超市场预期。")
# [{'label': 'Negative', 'score': 0.94}]

# Japanese
classifier("トヨタ自動車の営業利益は前年同期比30%増の1兆円を突破した。")
# [{'label': 'Positive', 'score': 0.95}]

# German
classifier("Die Aktie verlor nach der Gewinnwarnung deutlich an Wert.")
# [{'label': 'Negative', 'score': 0.92}]

# French
classifier("Le chiffre d'affaires du groupe a progressé de 8% au premier semestre.")
# [{'label': 'Positive', 'score': 0.93}]

# Spanish
classifier("Las acciones de la empresa se mantuvieron estables tras la publicación de resultados.")
# [{'label': 'Neutral', 'score': 0.89}]
```

## Batch Processing

```python
from transformers import pipeline

# device=0 selects the first CUDA GPU; use device=-1 (or omit the argument) to run on CPU.
classifier = pipeline("text-classification", model="Kenpache/flame", device=0)

texts = [
    "Stocks rallied after the Fed signaled a pause in rate hikes.",
    "The company filed for Chapter 11 bankruptcy protection.",
    "Q3 earnings were in line with analyst expectations.",
    "日経平均株価が3万円台を回復した。",
    "Les marchés européens ont clôturé en forte baisse.",
    "El beneficio neto de la compañía creció un 25% interanual.",
]

results = classifier(texts, batch_size=32)
for text, result in zip(texts, results):
    print(f"{result['label']:>8} ({result['score']:.2f})  {text[:70]}")
```
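
Once a batch is classified, a per-label tally is often the first thing needed. The sketch below assumes the output format shown above (a list of dicts with `label` and `score`); the `summarize` helper, its `min_score` threshold, and the sample values are hypothetical illustrations, not model output.

```python
from collections import Counter

def summarize(results, min_score=0.5):
    """Count labels in pipeline output, skipping low-confidence predictions."""
    counts = Counter(r["label"] for r in results if r["score"] >= min_score)
    total = sum(counts.values())
    # Share of each label among the predictions that passed the threshold.
    shares = {label: n / total for label, n in counts.items()} if total else {}
    return counts, shares

# Placeholder values in the pipeline's output format, NOT real model output.
example_results = [
    {"label": "Positive", "score": 0.91},
    {"label": "Negative", "score": 0.88},
    {"label": "Neutral", "score": 0.42},   # dropped: below min_score
    {"label": "Positive", "score": 0.77},
]

counts, shares = summarize(example_results)
for label, n in counts.most_common():
    print(f"{label:>8}: {n} ({shares[label]:.0%})")
```

In practice `example_results` would be replaced by the `results` list returned by `classifier(texts, batch_size=32)`.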

## Results

| Metric | Score |
|---|---|
| **Accuracy** | **0.8103** |
| **F1 (weighted)** | **0.8102** |
| **Precision (weighted)** | **0.8111** |
| **Recall (weighted)** | **0.8103** |

### Per-Class Performance

| Class | Precision | Recall | F1 | Support |
|---|---|---|---|---|
| Negative | 0.78 | 0.83 | 0.81 | 917 |
| Neutral | 0.83 | 0.79 | 0.81 | 1,779 |
| Positive | 0.80 | 0.82 | 0.81 | 1,225 |

All three classes reach the same F1 of 0.81 despite class imbalance in the training data (Neutral 45%, Positive 31%, Negative 24%).
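
As a sanity check, the weighted metrics can be approximately recomputed from the per-class table. The sketch below uses only the rounded precision, recall, and support values printed above, so it lands near the reported 0.81 rather than reproducing it to four decimals. Support-weighted recall is the same quantity as accuracy, which is why those two reported numbers match.

```python
# Per-class (precision, recall, support), copied from the table above.
per_class = {
    "Negative": (0.78, 0.83, 917),
    "Neutral": (0.83, 0.79, 1779),
    "Positive": (0.80, 0.82, 1225),
}

def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

total = sum(support for _, _, support in per_class.values())

# Support-weighted averages over the three classes.
weighted_f1 = sum(f1(p, r) * s for p, r, s in per_class.values()) / total
weighted_recall = sum(r * s for _, r, s in per_class.values()) / total

print(f"evaluation samples: {total}")
print(f"weighted F1:        {weighted_f1:.3f}")
print(f"weighted recall:    {weighted_recall:.3f}")
```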

## Labels

| Label | ID | What it captures |
|---|---|---|
| **Negative** | 0 | Losses, decline, bearish signals, layoffs, bankruptcy |
| **Neutral** | 1 | Factual statements, announcements, no clear sentiment |
| **Positive** | 2 | Growth, gains, bullish signals, record earnings, upgrades |
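
When working with raw model outputs instead of the pipeline, the IDs above are what map logits back to label names. The following is a minimal sketch in plain Python, assuming the model's `id2label` config matches this table; the `decode` helper and the example logits are hypothetical.

```python
import math

# Assumed to mirror the model's id2label mapping, per the table above.
ID2LABEL = {0: "Negative", 1: "Neutral", 2: "Positive"}

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def decode(logits):
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": ID2LABEL[best], "score": probs[best]}

# Hypothetical logits for one text, ordered by label ID.
print(decode([-1.2, 0.3, 2.5]))
```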

## Supported Languages

| Language | Code | Training Samples | Key Sources |
|---|---|---|---|
| Japanese | JA | 8,287 | Nikkei, Nikkan Kogyo, Reuters JP |
| Chinese | ZH | 7,930 | Sina Finance, EastMoney, 10jqka |
| Spanish | ES | 7,125 | Expansión, Cinco Días, Bloomberg Línea |
| English | EN | 6,887 | CNBC, Yahoo Finance, Fortune, Reuters |
| German | DE | 5,023 | Börse.de, FAZ, NTV Börse |
| French | FR | 3,935 | Boursorama, Tradingsat, BFM Business |
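
The sample counts above sum to 39,187, which matches the "~39K" figure quoted elsewhere in this card. A short calculation of each language's share makes the imbalance visible (Japanese has roughly twice as many samples as French), which may be worth keeping in mind when comparing per-language behavior.

```python
# Training sample counts per language, copied from the table above.
samples = {"JA": 8287, "ZH": 7930, "ES": 7125, "EN": 6887, "DE": 5023, "FR": 3935}

total = sum(samples.values())

# Print languages from largest to smallest share of the training set.
for code, n in sorted(samples.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{code}  {n:>5}  {n / total:6.1%}")
print(f"total {total}")
```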

## Use Cases

- **News Monitoring** — classify sentiment of financial headlines across global markets in real time
- **Trading Signals** — feed sentiment scores into quantitative trading strategies
- **Portfolio Risk** — monitor sentiment shifts across international holdings
- **Earnings Analysis** — analyze tone of corporate press releases and earnings calls
- **Social Media** — track financial discussions on multilingual platforms
- **Research** — cross-language sentiment studies in financial NLP
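
For the trading-signal and portfolio-risk use cases, one common way to turn three class probabilities into a single number is P(Positive) minus P(Negative). The sketch below is an illustration of that idea under stated assumptions, not a method prescribed by this model: it assumes per-label scores in the list-of-dicts format that the `transformers` pipeline returns when called with `top_k=None`, and the `signed_score` and `mean_signal` helpers and all sample values are hypothetical.

```python
def signed_score(all_scores):
    """Map one text's per-label scores to a value in [-1, 1]."""
    by_label = {d["label"]: d["score"] for d in all_scores}
    return by_label.get("Positive", 0.0) - by_label.get("Negative", 0.0)

def mean_signal(batch):
    """Average signed score over a batch of texts (0.0 for an empty batch)."""
    scores = [signed_score(s) for s in batch]
    return sum(scores) / len(scores) if scores else 0.0

# Placeholder per-label scores for two texts, NOT real model output.
batch = [
    [
        {"label": "Positive", "score": 0.80},
        {"label": "Neutral", "score": 0.15},
        {"label": "Negative", "score": 0.05},
    ],
    [
        {"label": "Positive", "score": 0.10},
        {"label": "Neutral", "score": 0.30},
        {"label": "Negative", "score": 0.60},
    ],
]

print(f"mean signal: {mean_signal(batch):+.3f}")
```

A Neutral-heavy prediction yields a score near zero, so confident neutral headlines neither add to nor subtract from the aggregate.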

## How It Was Built

1. **Domain Adaptation (TAPT):** Masked Language Modeling on 35K+ financial texts across 6 languages — the model learns financial vocabulary and patterns before seeing any labels.

2. **Fine-Tuning:** Supervised classification with label smoothing (0.1), a learning rate of 2e-5 on a cosine schedule, and Stochastic Weight Averaging (SWA) over the top-3 checkpoints for more robust generalization.

| Parameter | Value |
|---|---|
| Base model | xlm-roberta-base (278M params) |
| Learning rate | 2e-5 |
| Scheduler | Cosine |
| Label smoothing | 0.1 |
| Effective batch size | 64 |
| Precision | FP16 |
| Post-processing | SWA (top-3 checkpoints) |
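
The three fine-tuning ingredients in the table can be sketched in a few lines of plain Python. This is a simplified illustration, not the actual training code: plain lists stand in for tensors, the schedule assumes no warmup and decay to zero (neither is stated here), label smoothing follows the common convention of mixing the one-hot target with a uniform distribution, and how the top-3 checkpoints are selected is not specified. All function names are hypothetical.

```python
import math

def cosine_lr(step, total_steps, peak_lr=2e-5):
    """Cosine decay from peak_lr to 0 (assumes no warmup)."""
    progress = min(step / total_steps, 1.0)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))

def smoothed_targets(label_id, num_classes=3, eps=0.1):
    """Mix the one-hot target with a uniform distribution of weight eps."""
    base = eps / num_classes
    return [base + (1.0 - eps if i == label_id else 0.0) for i in range(num_classes)]

def smoothed_cross_entropy(probs, label_id, eps=0.1):
    """Cross-entropy between smoothed targets and predicted probabilities."""
    targets = smoothed_targets(label_id, len(probs), eps)
    return -sum(t * math.log(p) for t, p in zip(targets, probs))

def average_checkpoints(checkpoints):
    """Element-wise mean of parameter dicts (name -> list of floats)."""
    n = len(checkpoints)
    return {
        name: [sum(vals) / n for vals in zip(*(c[name] for c in checkpoints))]
        for name in checkpoints[0]
    }

# Toy checkpoints with a single two-element parameter each.
top3 = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}, {"w": [5.0, 6.0]}]
print(average_checkpoints(top3))
print(smoothed_targets(2))
print(cosine_lr(50, 100))
```

With smoothing, the loss never rewards pushing a class probability all the way to 1, which tends to keep the confidence scores less extreme.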

## Dataset

Trained on [Kenpache/multilingual-financial-sentiment](https://huggingface.co/datasets/Kenpache/multilingual-financial-sentiment) — ~39K curated financial news samples from 80+ real sources worldwide.

## Citation

```bibtex
@misc{flame2025,
  title={FLAME: Financial Language Analysis for Multilingual Economics},
  author={Kenpache},
  year={2025},
  url={https://huggingface.co/Kenpache/flame}
}
```

## License

Apache 2.0