Model Card for German-Austrian Historical NER

This token-classification model aims to perform Named Entity Recognition on German-Austrian historical documents.
The model was trained on 10319 samples with tagged entities, provided by https://nerdpool-api.acdh-dev.oeaw.ac.at/.
It has been trained to identify entities in the Minutes of the Austrian Council of Ministers.

Developed by: Dimitra Grigoriou
Shared by: Dimitra Grigoriou
Model type: token classification
Language(s) (NLP): German, Austrian German
License: CC BY 4.0
Finetuned from model: google-bert/bert-base-cased

Uses

Named Entity Recognition

Direct Use

from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = AutoModelForTokenClassification.from_pretrained("demigrigo/mpr_bert_german_ner")
tokenizer = AutoTokenizer.from_pretrained("demigrigo/mpr_bert_german_ner")

# aggregation_strategy="average" groups subword tokens into whole-word entities and averages their scores
nlp = pipeline("token-classification", model=model, tokenizer=tokenizer, aggregation_strategy="average")

text = "Ernennung FML. Peter Zaninis zum Kriegsminister"  # example sentence
print(nlp(text))
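
The pipeline call above returns a list of dictionaries, one per detected entity, with the keys entity_group, score, word, start and end. As a minimal sketch of how that output might be post-processed, the helper below groups entity strings by label and drops low-confidence predictions. The function name group_entities, the min_score parameter, the label "PER" and the sample scores are all assumptions made for illustration; the model's actual label set is not documented here.

```python
from collections import defaultdict


def group_entities(predictions, min_score=0.0):
    """Group aggregated pipeline predictions by entity label.

    `predictions` is a list of dicts shaped like the output of a
    token-classification pipeline that uses an aggregation strategy.
    """
    grouped = defaultdict(list)
    for pred in predictions:
        # Keep only predictions at or above the confidence threshold
        if pred["score"] >= min_score:
            grouped[pred["entity_group"]].append(pred["word"])
    return dict(grouped)


# Hypothetical predictions, NOT real model output: the label "PER"
# and both scores are invented to show the expected data shape.
sample_predictions = [
    {"entity_group": "PER", "score": 0.98, "word": "Peter Zaninis", "start": 15, "end": 28},
    {"entity_group": "PER", "score": 0.41, "word": "FML", "start": 10, "end": 13},
]

print(group_entities(sample_predictions, min_score=0.5))
```

With a loaded pipeline, the same helper could be called as group_entities(nlp(text), min_score=0.5). The threshold of 0.5 is arbitrary and would need tuning on real data.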

Training Details

Training Data

Training data from: https://nerdpool-api.acdh-dev.oeaw.ac.at/
The data were transformed into the BIO tagging scheme required by the original model.
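
In the BIO scheme, the first token of an entity is tagged B-<label>, each following token of the same entity is tagged I-<label>, and every other token is tagged O. The sketch below illustrates such a conversion under stated assumptions: the input format (token-index spans with an exclusive end), the function name spans_to_bio and the label "PER" are hypothetical and are not taken from the nerdpool export or from the actual preprocessing used for this model.

```python
def spans_to_bio(tokens, spans):
    """Convert token-index entity spans into one BIO tag per token.

    `spans` is an iterable of (start, end, label) tuples, where `start`
    is the index of the first entity token and `end` is exclusive.
    """
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        if not 0 <= start < end <= len(tokens):
            raise ValueError(f"span ({start}, {end}) is out of range")
        tags[start] = f"B-{label}"       # first token of the entity
        for i in range(start + 1, end):  # remaining tokens of the entity
            tags[i] = f"I-{label}"
    return tags


tokens = ["Ernennung", "FML.", "Peter", "Zaninis", "zum", "Kriegsminister"]
spans = [(2, 4, "PER")]  # hypothetical annotation covering "Peter Zaninis"

for token, tag in zip(tokens, spans_to_bio(tokens, spans)):
    print(f"{token}\t{tag}")
```

Overlapping spans are not handled here; a later span would simply overwrite the tags of an earlier one.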

Model size: 0.1B params (Safetensors)
Tensor type: F32
Base model: google-bert/bert-base-cased
