---
library_name: transformers
tags:
- peft
- lora
- ottomanturkish
datasets:
- BUCOLIN/HisTR
language:
- tr
metrics:
- f1
- precision
- recall
base_model:
- cihanunlu/BerTurk_Ottoman_Full_DAPT
pipeline_tag: token-classification
---

# Ottoman Turkish NER: LoRA Adapter for BerTurk_Ottoman_Full_DAPT

A LoRA adapter that adds person/location named-entity recognition for Latin-transliterated Ottoman Turkish on top of [`cihanunlu/BerTurk_Ottoman_Full_DAPT`](https://huggingface.co/cihanunlu/BerTurk_Ottoman_Full_DAPT), fine-tuned on the HisTR corpus.

### 1 Overview

| | |
|---|---|
| **Base model** | [`cihanunlu/BerTurk_Ottoman_Full_DAPT`](https://huggingface.co/cihanunlu/BerTurk_Ottoman_Full_DAPT) |
| **Adapter type** | LoRA (Low-Rank Adaptation) built with HF **PEFT** |
| **Task** | Named-entity recognition • BIO tags (**B-/I-PER, B-/I-LOC, O**) |
| **Language** | Ottoman / late-Ottoman Turkish (Latin transliteration) |
| **Repo contents** | ≈ 2 MB LoRA weights (`adapter_model.bin`, `adapter_config.json`) + tokenizer files |

Attach this adapter to the `BerTurk_Ottoman_Full_DAPT` checkpoint to obtain a lightweight NER model fine-tuned on the HisTR corpus; a loading sketch follows in section 2.

* Suitable for historical/Ottoman Turkish NER focused on person and location entities.
* Performance drops on modern Turkish and on domain-specific jargon.
* The adapter inherits the ethical constraints and biases of the base BerTurk model.

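### 2 Usage

The snippet below is a minimal loading sketch, not a verified recipe: the `<this-repo-id>` placeholder, the label order, and the `merge_and_unload()` step are assumptions, so check `adapter_config.json` in this repo for the label mapping actually saved with the adapter.

```python
# Minimal sketch: attach the LoRA adapter to the base checkpoint and run NER.
from peft import PeftModel
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

BASE_ID = "cihanunlu/BerTurk_Ottoman_Full_DAPT"
ADAPTER_ID = "<this-repo-id>"  # placeholder: replace with this adapter's Hub id

# Assumed label order; verify against the adapter's saved config.
labels = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC"]

tokenizer = AutoTokenizer.from_pretrained(ADAPTER_ID)
base = AutoModelForTokenClassification.from_pretrained(
    BASE_ID,
    num_labels=len(labels),
    id2label=dict(enumerate(labels)),
    label2id={label: i for i, label in enumerate(labels)},
)

# Merge the LoRA weights into the base model so a plain transformers
# pipeline can consume it.
model = PeftModel.from_pretrained(base, ADAPTER_ID).merge_and_unload()

ner = pipeline(
    "token-classification",
    model=model,
    tokenizer=tokenizer,
    aggregation_strategy="simple",  # group B-/I- pieces into whole entities
)
print(ner("Sultan Süleyman Han İstanbul'a hareket eyledi."))
```
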
### 3 Evaluation

| Split | Precision | Recall | F1 |
|---|---|---|---|
| Dev (HisTR), best checkpoint (epoch 4) | 77.3 % | 84.9 % | 80.9 % |
| Test (Rûznâmçe) | 54.4 % | 52.8 % | 53.6 % |

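The card does not name the scorer; `seqeval` is the standard choice for BIO-tagged NER, and the worked micro-example below shows how such entity-level scores come about (the toy sequences are illustrative, not from HisTR).

```python
# Hypothetical scoring sketch with seqeval: entity-level precision/recall/F1
# over BIO-tagged sequences.
from seqeval.metrics import f1_score, precision_score, recall_score

y_true = [["B-PER", "I-PER", "O", "B-LOC"]]  # gold: one PER, one LOC
y_pred = [["B-PER", "I-PER", "O", "O"]]      # predicted: one PER

print(precision_score(y_true, y_pred))  # 1.0  (1 of 1 predicted entity correct)
print(recall_score(y_true, y_pred))     # 0.5  (1 of 2 gold entities found)
print(f1_score(y_true, y_pred))         # ~0.667
```
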
### 4 Training hyper-parameters

* LoRA rank **r**: 16
* LoRA α: 16
* Dropout: 0.10
* Peak learning rate: 5 × 10⁻⁴
* Effective batch size: 16
* Epochs: 5
* Mixed precision: FP16
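
A PEFT/transformers setup matching these values could look like the sketch below; the `target_modules` and `modules_to_save` choices are typical for BERT-style token classification and are assumptions, since the card does not list them.

```python
# Sketch reproducing the hyper-parameters above. target_modules and
# modules_to_save are assumed (not stated in this card).
from peft import LoraConfig, TaskType
from transformers import TrainingArguments

lora_config = LoraConfig(
    task_type=TaskType.TOKEN_CLS,
    r=16,                               # LoRA rank r
    lora_alpha=16,                      # LoRA α
    lora_dropout=0.10,
    target_modules=["query", "value"],  # assumed attention projections
    modules_to_save=["classifier"],     # also train/save the NER head
)

training_args = TrainingArguments(
    output_dir="ottoman-ner-lora",      # hypothetical output path
    learning_rate=5e-4,                 # peak LR 5 × 10⁻⁴
    per_device_train_batch_size=16,     # effective batch size 16
    num_train_epochs=5,
    fp16=True,                          # mixed precision
)
```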