---
license: apache-2.0
language:
- de
widget:
- text: "STS Group AG erhält Großauftrag von führendem Nutzfahrzeughersteller in Nordamerika und plant Bau eines ersten US-Werks"
- text: "Zukünftig soll jedoch je Geschäftsjahr eine Mindestdividende in Höhe von EUR 2,00 je dividendenberechtigter Aktie an die Aktionärinnen und Aktionäre ausgeschüttet werden."
- text: "Comet passt Jahresprognose nach Q3 unter Erwartungen an"
---
# German FinBERT For Sentiment Analysis (Pre-trained From Scratch Version, Fine-Tuned for Financial Sentiment Analysis)
<img src="https://github.com/mscherrmann/mscherrmann.github.io/blob/master/assets/img/publication_preview/germanBert.png?raw=true" alt="German FinBERT logo" width="500" height="300"/>


German FinBERT is a BERT language model for the financial domain in the German language. In my [paper](https://arxiv.org/pdf/2311.08793.pdf), I describe in more detail the steps taken to train the model and show that it outperforms generic benchmark models on finance-specific downstream tasks.

This model is the [pre-trained from scratch version of German FinBERT](https://huggingface.co/scherrmann/GermanFinBert_SC) after fine-tuning on a German translation of the [financial news phrase bank](https://arxiv.org/abs/1307.5336) of Malo et al. (2013). The data is available [here](https://huggingface.co/datasets/scherrmann/financial_phrasebank_75agree_german).
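For a quick start, the model can be queried with the `transformers` pipeline API. This is a minimal sketch: the repository id used below is an assumption (this card does not state the model's own id), and the input sentence is reused from the widget examples above.

```python
# Minimal inference sketch using the transformers pipeline API.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="scherrmann/GermanFinBert_SC_Sentiment",  # assumed repository id; replace with this model's actual id
)

result = classifier("Comet passt Jahresprognose nach Q3 unter Erwartungen an")
print(result)  # e.g. [{'label': '...', 'score': 0.99}]; label names depend on the model config
```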

## Overview  
- **Author:** Moritz Scherrmann
- **Paper:** [here](https://arxiv.org/pdf/2311.08793.pdf)
- **Architecture:** BERT base
- **Language:** German
- **Specialization:** Financial sentiment
- **Base model:** [German_FinBert_SC](https://huggingface.co/scherrmann/GermanFinBert_SC)


### Fine-tuning

I fine-tune the model using the 1cycle policy of [Smith and Topin (2019)](https://arxiv.org/abs/1708.07120) and the Adam optimization method of [Kingma and Ba (2014)](https://arxiv.org/abs/1412.6980) with standard parameters. I run a grid search on the evaluation set to find the best hyper-parameter setup, testing different values for learning rate, batch size, and number of epochs, following the suggestions of [Chalkidis et al. (2020)](https://aclanthology.org/2020.findings-emnlp.261/). I repeat the fine-tuning for each setup five times with different seeds to avoid getting good results by chance.
After finding the best model w.r.t. the evaluation set, I report the mean result across seeds for that model on the test set.
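To make the procedure concrete, here is a schematic PyTorch sketch of one such fine-tuning run, assuming a `transformers` sequence-classification model and a `DataLoader` that yields batches containing labels. The hyper-parameter values and the helper name `fine_tune` are illustrative placeholders, not the values selected by the grid search.

```python
# Schematic fine-tuning loop: Adam with standard parameters plus the
# 1cycle learning-rate schedule of Smith and Topin (2019).
import torch
from torch.optim import Adam
from torch.optim.lr_scheduler import OneCycleLR

def fine_tune(model, train_loader, max_lr=2e-5, epochs=3, device="cuda"):
    model.to(device).train()
    optimizer = Adam(model.parameters(), lr=max_lr)  # standard parameters
    scheduler = OneCycleLR(                          # 1cycle policy
        optimizer, max_lr=max_lr,
        steps_per_epoch=len(train_loader), epochs=epochs,
    )
    for _ in range(epochs):
        for batch in train_loader:
            batch = {k: v.to(device) for k, v in batch.items()}
            loss = model(**batch).loss               # labels must be in the batch
            loss.backward()
            optimizer.step()
            scheduler.step()
            optimizer.zero_grad()
    return model
```

In a grid search as described above, this function would be called once per combination of learning rate, batch size, number of epochs, and seed, with the seed fixed before each run.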

### Results

Translated [financial news phrase bank](https://arxiv.org/abs/1307.5336) of Malo et al. (2013); see [here](https://huggingface.co/datasets/scherrmann/financial_phrasebank_75agree_german) for the data:
- Accuracy: 95.95%
- Macro F1: 92.70%
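
Both metrics can be computed with standard tooling. A minimal sketch, assuming integer-encoded gold labels and model predictions (the label values below are placeholders):

```python
# Computing accuracy and macro F1 with scikit-learn.
from sklearn.metrics import accuracy_score, f1_score

gold = [0, 1, 2, 1, 0]  # placeholder gold labels (e.g. negative/neutral/positive)
pred = [0, 1, 2, 2, 0]  # placeholder model predictions

print(f"Accuracy: {accuracy_score(gold, pred):.2%}")
print(f"Macro F1: {f1_score(gold, pred, average='macro'):.2%}")
```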


## Authors  
Moritz Scherrmann: `scherrmann [at] lmu.de`


For additional details regarding the performance on the fine-tuning datasets and the benchmark results, please refer to the full documentation provided in the study.

See also:  
- scherrmann/GermanFinBERT_SC
- scherrmann/GermanFinBERT_FP
- scherrmann/GermanFinBERT_FP_QuAD