---
library_name: transformers
tags:
  - peft
  - lora
  - ottomanturkish
datasets:
  - BUCOLIN/HisTR
language:
  - tr
metrics:
  - f1
  - precision
  - recall
base_model:
  - cihanunlu/BerTurk_Ottoman_Full_DAPT
pipeline_tag: token-classification
---

# LoRA Adapter for Ottoman Turkish NER

## 1 Overview

| | |
|---|---|
| **Base model** | `cihanunlu/BerTurk_Ottoman_Full_DAPT` |
| **Adapter type** | LoRA (Low-Rank Adaptation) built with HF PEFT |
| **Task** | Named-entity recognition • BIO tags `PER` / `LOC` / `O` |
| **Language** | Ottoman / late-Ottoman Turkish (Latin transliteration) |
| **Repo contents** | ≈ 2 MB LoRA weights (`adapter_model.bin`, `adapter_config.json`) + tokenizer files |

Attach it to the `BerTurk_Ottoman_Full_DAPT` checkpoint to obtain a lightweight NER model fine-tuned on the HisTR corpus.

## 2 Intended use & limitations

- Suitable for historical/Ottoman Turkish NER focused on PERSON and LOCATION entities.
- Performance drops on modern Turkish or domain-specific jargon.
- The adapter inherits the ethical constraints and biases of the base BerTurk model.

## 3 Evaluation

| Split | Precision | Recall | F1 |
|---|---|---|---|
| Dev (HisTR), best checkpoint (epoch 4) | 77.3 % | 84.9 % | 80.9 % |
| Test (Rûznâmçe) | 54.4 % | 52.8 % | 53.6 % |
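The scores above are span-level precision/recall/F1: an entity counts as correct only on an exact boundary-and-type match, as in `seqeval`. A minimal sketch of that computation (the span tuples below are illustrative data, not from the corpus):

```python
# Span-level precision / recall / F1 over (type, start, end) entity tuples.
def prf(gold, pred):
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)                      # exact span + type matches
    p = tp / len(pred) if pred else 0.0        # precision
    r = tp / len(gold) if gold else 0.0        # recall
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

gold = [("PER", 0, 2), ("LOC", 5, 6)]
pred = [("PER", 0, 2), ("LOC", 4, 6)]          # LOC boundary is off by one
print(prf(gold, pred))                         # -> (0.5, 0.5, 0.5)
```

Note that the near-miss `LOC` span earns no partial credit, which is why span-level F1 is stricter than token-level accuracy.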

## 4 Training hyper-parameters

  • LoRA rank r: 16
  • LoRA α: 16
  • Dropout: 0.10
  • Peak learning rate: 5 × 10⁻⁴
  • Effective batch size: 16
  • Epochs: 5
  • Mixed precision: FP16