---
library_name: peft
base_model: LSX-UniWue/ModernGBERT_1B
tags:
- base_model:adapter:LSX-UniWue/ModernGBERT_1B
- lora
- transformers
- token-classification
---

# ModernGBERT Redewiedergabe Tagger

This model is a token classifier that recognizes German speech, thought and writing representation (STWR) and is used in [LLpro](https://github.com/cophi-wue/LLpro). Besides the *medium* (speech, thought, writing), the model also predicts the *type* (direct, free indirect, indirect, reported), yielding 36 classification outputs (3 media × 4 types × {B, I, O}).

| STWR type                      | Example                                                                     | Translation                                                 |
|--------------------------------|-----------------------------------------------------------------------------|-------------------------------------------------------------|
| direct                         | Dann schrieb er: **"Ich habe Hunger."**                                      | Then he wrote: **"I'm hungry."**                            |
| free indirect ('erlebte Rede') | Er war ratlos. **Woher sollte er denn hier bloß ein Mittagessen bekommen?** | He was at a loss. **Where should he ever find lunch here?** |
| indirect                       | Sie fragte, **wo das Essen sei.**                                            | She asked **where the food was.**                           |
| reported                       | **Sie dachte über das Mittagessen.**                                         | **She thought about lunch.**                                |

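The 36 outputs can be read as 12 independent BIO taggers, one per type–medium pair. The following is a minimal sketch of this label space; treating the head as type–medium pairs × {O, B, I} matches the demo code further down, but the exact ordering of the output head is an assumption here:

```python
# Enumerate the 12 type-medium pairs and their O/B/I tags.
stwr_types = ["direct", "indirect", "freeIndirect", "reported"]
media = ["speech", "thought", "writing"]
bio_tags = ["O", "B", "I"]

pair_labels = [f"{t}.{m}" for t in stwr_types for m in media]
outputs = [f"{pair}:{tag}" for pair in pair_labels for tag in bio_tags]

print(len(pair_labels), len(outputs))  # 12 36
```
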
This model is a fine-tuned version of [LSX-UniWue/ModernGBERT_1B](https://huggingface.co/LSX-UniWue/ModernGBERT_1B) on the [REDEWIEDERGABE corpus](https://github.com/redewiedergabe/corpus) ([annotation guidelines](http://redewiedergabe.de/richtlinien/richtlinien.html)).

The training script is available in the LLpro repository: [train_redewiedergabe.py](https://github.com/cophi-wue/LLpro/blob/main/contrib/train_redewiedergabe.py).

### Performance

We report simplified F1 scores on a binarized variant (O vs. B/I) for each combination of type and medium.

| Type, Medium         | F1 Score | Support |
|:---------------------|---------:|--------:|
| direct.speech        |     0.96 |   13598 |
| direct.thought       |     0.79 |     715 |
| direct.writing       |     0.19 |     996 |
| indirect.speech      |     0.77 |    1226 |
| indirect.thought     |     0.71 |     802 |
| indirect.writing     |     0.00 |      11 |
| freeIndirect.speech  |     0.73 |     198 |
| freeIndirect.thought |     0.45 |     251 |
| freeIndirect.writing |        – |       – |
| reported.speech      |     0.69 |    1684 |
| reported.thought     |     0.59 |     799 |
| reported.writing     |     0.56 |     135 |
| *micro avg*          |   *0.86* |   20415 |
| *macro avg*          |   *0.54* |   20415 |

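For illustration, the binarized score for one type–medium pair can be computed by collapsing B and I into a single positive class. This is a hypothetical helper sketching that metric, not the evaluation code actually used (which lives in the training script):

```python
def binarized_f1(gold, pred):
    """Token-level F1 after binarizing BIO tags: B/I -> positive, O -> negative."""
    g = [t != "O" for t in gold]
    p = [t != "O" for t in pred]
    tp = sum(a and b for a, b in zip(g, p))
    fp = sum(b and not a for a, b in zip(g, p))
    fn = sum(a and not b for a, b in zip(g, p))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

gold = ["O", "B", "I", "I", "O", "B"]
pred = ["O", "B", "I", "O", "O", "B"]
print(round(binarized_f1(gold, pred), 3))  # 0.857
```
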
### Demo Usage

```python
from transformers import AutoModelForTokenClassification, AutoTokenizer

speech_labels = [
    "direct.speech",
    "direct.thought",
    "direct.writing",
    "indirect.speech",
    "indirect.thought",
    "indirect.writing",
    "freeIndirect.speech",
    "freeIndirect.thought",
    "freeIndirect.writing",
    "reported.speech",
    "reported.thought",
    "reported.writing",
]

text = """
Dann schrieb er: "Ich habe Hunger."
Er war ratlos. Woher sollte er denn hier bloß ein Mittagessen bekommen?
Sie fragte, wo das Essen sei.
Sie dachte über das Mittagessen."""

model_id = 'aehrm/moderngbert-redewiedergabe'
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id, num_labels=3*len(speech_labels))

inputs = tokenizer(text, return_tensors='pt')
out = model(**inputs)

# Reshape the 36 logits into 12 independent O/B/I decisions and take the argmax.
batch_size, seq_len, _ = out.logits.shape
prediction = out.logits.reshape(batch_size, seq_len, 12, 3).argmax(-1)

for i, speech_label in enumerate(speech_labels):
    for tok, pred in zip(tokenizer.convert_ids_to_tokens(inputs['input_ids'][0]), prediction[0, :, i]):
        pred = 'OBI'[pred]
        print(tok, pred, speech_label if pred != 'O' else '')
```

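To turn the per-token tags into contiguous spans (e.g. for highlighting passages of one type–medium pair), B/I runs can be merged. A small hypothetical helper, not part of the model or LLpro:

```python
def bio_to_spans(tags):
    """Group token-level B/I tags into (start, end) token spans, end exclusive."""
    spans, start = [], None
    for i, t in enumerate(tags):
        if t == "B":
            if start is not None:      # a B closes any open span and opens a new one
                spans.append((start, i))
            start = i
        elif t == "I":
            if start is None:          # tolerate an I without a preceding B
                start = i
        else:                          # "O" closes any open span
            if start is not None:
                spans.append((start, i))
                start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans

print(bio_to_spans(["O", "B", "I", "O", "B", "O"]))  # [(1, 3), (4, 5)]
```
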
### Training results

The reported F1 score is the micro average over all 36 classification outputs.

| Training Loss | Epoch | Step | Validation Loss | F1 Score |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 0.1951        | 1.0   | 193  | 0.1332          | 0.1180   |
| 0.0885        | 2.0   | 386  | 0.2474          | 0.2724   |
| 0.0417        | 3.0   | 579  | 0.1455          | 0.4604   |
| 0.0753        | 4.0   | 772  | 0.1399          | 0.5522   |
| 0.0277        | 5.0   | 965  | 0.1447          | 0.6170   |
| 0.0238        | 6.0   | 1158 | 0.1770          | 0.6200   |
| 0.0153        | 7.0   | 1351 | 0.2257          | 0.6930   |
| 0.009         | 8.0   | 1544 | 0.6031          | 0.7336   |
| 0.0108        | 9.0   | 1737 | 0.4965          | 0.7428   |
| 0.0066        | 10.0  | 1930 | 0.4575          | 0.7492   |
| 0.0058        | 11.0  | 2123 | 0.7781          | 0.7983   |
| 0.006         | 12.0  | 2316 | 0.8648          | 0.8062   |
| 0.0043        | 13.0  | 2509 | 1.0377          | 0.8148   |
| 0.0033        | 14.0  | 2702 | 1.3040          | 0.8217   |
| 0.0025        | 15.0  | 2895 | 1.2637          | 0.8359   |
| 0.003         | 16.0  | 3088 | 1.3230          | 0.8477   |
| 0.0019        | 17.0  | 3281 | 1.9811          | 0.8439   |
| 0.0014        | 18.0  | 3474 | 2.1191          | 0.8482   |
| 0.0011        | 19.0  | 3667 | 2.3599          | 0.8510   |
| 0.0009        | 20.0  | 3860 | 2.4453          | 0.8528   |

### Framework versions

- PEFT 0.17.0
- Transformers 4.55.2
- Pytorch 2.8.0+cu128
- Datasets 2.21.0
- Tokenizers 0.21.4