---
library_name: transformers
tags:
- peft
- lora
- ottomanturkish
datasets:
- BUCOLIN/HisTR
language:
- tr
metrics:
- f1
- precision
- recall
base_model:
- cihanunlu/BerTurk_Ottoman_Full_DAPT
pipeline_tag: token-classification
---
# LoRA NER Adapter for Ottoman Turkish (BerTurk_Ottoman_Full_DAPT)
<!-- LoRA adapter for named-entity recognition on Ottoman Turkish, trained on HisTR. -->
#### Overview
| | |
|---|---|
| **Base model** | [`cihanunlu/BerTurk_Ottoman_Full_DAPT`](https://huggingface.co/cihanunlu/BerTurk_Ottoman_Full_DAPT) |
| **Adapter type** | LoRA (Low-Rank Adaptation) built with HF **PEFT** |
| **Task** | Named-Entity Recognition &nbsp;•&nbsp; BIO tags **PER / LOC / O** |
| **Language** | Ottoman / Late-Ottoman Turkish (Latin transliteration) |
| **Repo contents** | ≈ 2 MB LoRA weights (`adapter_model.bin`, `adapter_config.json`) + tokenizer files |
Attach this adapter to the `BerTurk_Ottoman_Full_DAPT` checkpoint to obtain a lightweight NER model fine-tuned on the HisTR corpus.

* Suitable for NER on historical/Ottoman Turkish text, focusing on PERSON and LOCATION entities.
* Performance drops on modern Turkish and on domain-specific jargon.
* The adapter inherits the ethical constraints and biases of the base BerTurk model.
### Evaluation
* **Dev (HisTR)** — best checkpoint (epoch 4)
* Precision – 77.3 %
* Recall – 84.9 %
* F1 – 80.9 %
* **Test (Rûznâmçe)**
* Precision – 54.4 %
* Recall – 52.8 %
* F1 – 53.6 %
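
As a sanity check, each F1 score above is the harmonic mean of the corresponding precision and recall:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (values given in percent)."""
    return 2 * precision * recall / (precision + recall)

print(round(f1(77.3, 84.9), 1))  # dev (HisTR): 80.9
print(round(f1(54.4, 52.8), 1))  # test (Rûznâmçe): 53.6
```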
### Training hyperparameters
* LoRA rank **r**: 16
* LoRA α: 16
* Dropout: 0.10
* Peak learning rate: 5 × 10⁻⁴
* Effective batch size: 16
* Epochs: 5
* Mixed precision: FP16