---
license: cc-by-4.0
language:
- de
base_model:
- google-bert/bert-base-cased
pipeline_tag: token-classification
tags:
- ner
- german
- historical_texts
- history
- deutsch
---

# Model Card for German-Austrian Historical NER

This token-classification model aims to perform Named Entity Recognition on German-Austrian historical documents.
The model was trained on 10,319 entity-tagged samples provided by https://nerdpool-api.acdh-dev.oeaw.ac.at/.
The model has been trained to identify entities in the Minutes of the Austrian Council of Ministers.

- **Developed by:** Dimitra Grigoriou
- **Shared by:** Dimitra Grigoriou
- **Model type:** token classification
- **Language(s) (NLP):** German, Austrian German
- **License:** CC BY 4.0
- **Finetuned from model:** google-bert/bert-base-cased

## Uses

- Named Entity Recognition

### Direct Use

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = AutoModelForTokenClassification.from_pretrained("demigrigo/mpr_bert_german_ner")
tokenizer = AutoTokenizer.from_pretrained("demigrigo/mpr_bert_german_ner")

# aggregation_strategy="average" merges sub-word tokens into whole-word entities
nlp = pipeline("token-classification", model=model, tokenizer=tokenizer, aggregation_strategy="average")

text = "Ernennung FML. Peter Zaninis zum Kriegsminister"  # example sentence
print(nlp(text))
```

## Training Details

### Training Data

Training data from: https://nerdpool-api.acdh-dev.oeaw.ac.at/
The data was converted to the BIO tagging scheme required by the original model.
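
As a rough illustration of what a BIO conversion involves, the sketch below turns character-offset entity spans into token-level tags. It is not the preprocessing code used for this model: the function name `spans_to_bio`, the `(start, end, label)` span format, the whitespace tokenization, and the `PER` label are all assumptions made for the example.

```python
import re


def spans_to_bio(text, spans):
    """Convert character-offset entity spans to token-level BIO tags.

    text  : raw sentence
    spans : list of (start_char, end_char, label), end exclusive
            (hypothetical input format, assumed for this sketch)
    """
    tokens, tags = [], []
    # Whitespace tokenization, keeping each token's character offsets
    for match in re.finditer(r"\S+", text):
        tok_start, tok_end = match.span()
        tag = "O"  # default: token is outside any entity
        for start, end, label in spans:
            # A token belongs to an entity if the two ranges overlap
            if tok_start < end and tok_end > start:
                # The first token of the entity gets B-, later ones get I-
                prefix = "B-" if tok_start <= start else "I-"
                tag = prefix + label
                break
        tokens.append(match.group())
        tags.append(tag)
    return tokens, tags


text = "Ernennung FML. Peter Zaninis zum Kriegsminister"
entity = "Peter Zaninis"
start = text.index(entity)
spans = [(start, start + len(entity), "PER")]  # "PER" is an assumed label

tokens, tags = spans_to_bio(text, spans)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Each token receives exactly one tag, which is the shape a token-classification model expects before the labels are aligned to sub-word pieces.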