---
library_name: transformers
tags:
- peft
- lora
- ottomanturkish
datasets:
- BUCOLIN/HisTR
language:
- tr
metrics:
- f1
- precision
- recall
base_model:
- cihanunlu/BerTurk_Ottoman_Full_DAPT
pipeline_tag: token-classification
---

# Ottoman Turkish NER: LoRA Adapter for BerTurk_Ottoman_Full_DAPT

A LoRA adapter that adds person/location named-entity recognition for Latin-transliterated Ottoman Turkish on top of [`cihanunlu/BerTurk_Ottoman_Full_DAPT`](https://huggingface.co/cihanunlu/BerTurk_Ottoman_Full_DAPT), fine-tuned on the HisTR corpus.

### 1 Overview

| | |
|---|---|
| **Base model** | [`cihanunlu/BerTurk_Ottoman_Full_DAPT`](https://huggingface.co/cihanunlu/BerTurk_Ottoman_Full_DAPT) |
| **Adapter type** | LoRA (Low-Rank Adaptation) built with HF **PEFT** |
| **Task** | Named-entity recognition • BIO tags (**B-/I-PER, B-/I-LOC, O**) |
| **Language** | Ottoman / late-Ottoman Turkish (Latin transliteration) |
| **Repo contents** | ≈ 2 MB LoRA weights (`adapter_model.bin`, `adapter_config.json`) + tokenizer files |

Attach this adapter to the `BerTurk_Ottoman_Full_DAPT` checkpoint to obtain a lightweight NER model fine-tuned on the HisTR corpus; a loading sketch follows in section 2.

* Suitable for historical/Ottoman Turkish NER focused on person and location entities.
* Performance drops on modern Turkish and on domain-specific jargon.
* The adapter inherits the ethical constraints and biases of the base BerTurk model.

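### 2 Usage

The snippet below is a minimal loading sketch, not a verified recipe: the `<this-repo-id>` placeholder, the label order, and the `merge_and_unload()` step are assumptions, so check `adapter_config.json` in this repo for the label mapping actually saved with the adapter.

```python
# Minimal sketch: attach the LoRA adapter to the base checkpoint and run NER.
from peft import PeftModel
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

BASE_ID = "cihanunlu/BerTurk_Ottoman_Full_DAPT"
ADAPTER_ID = "<this-repo-id>"  # placeholder: replace with this adapter's Hub id

# Assumed label order; verify against the adapter's saved config.
labels = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC"]

tokenizer = AutoTokenizer.from_pretrained(ADAPTER_ID)
base = AutoModelForTokenClassification.from_pretrained(
    BASE_ID,
    num_labels=len(labels),
    id2label=dict(enumerate(labels)),
    label2id={label: i for i, label in enumerate(labels)},
)

# Merge the LoRA weights into the base model so a plain transformers
# pipeline can consume it.
model = PeftModel.from_pretrained(base, ADAPTER_ID).merge_and_unload()

ner = pipeline(
    "token-classification",
    model=model,
    tokenizer=tokenizer,
    aggregation_strategy="simple",  # group B-/I- pieces into whole entities
)
print(ner("Sultan Süleyman Han İstanbul'a hareket eyledi."))
```
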
### 3 Evaluation

| Split | Precision | Recall | F1 |
|---|---|---|---|
| Dev (HisTR), best checkpoint (epoch 4) | 77.3 % | 84.9 % | 80.9 % |
| Test (Rûznâmçe) | 54.4 % | 52.8 % | 53.6 % |

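The card does not name the scorer; `seqeval` is the standard choice for BIO-tagged NER, and the worked micro-example below shows how such entity-level scores come about (the toy sequences are illustrative, not from HisTR).

```python
# Hypothetical scoring sketch with seqeval: entity-level precision/recall/F1
# over BIO-tagged sequences.
from seqeval.metrics import f1_score, precision_score, recall_score

y_true = [["B-PER", "I-PER", "O", "B-LOC"]]  # gold: one PER, one LOC
y_pred = [["B-PER", "I-PER", "O", "O"]]      # predicted: one PER

print(precision_score(y_true, y_pred))  # 1.0  (1 of 1 predicted entity correct)
print(recall_score(y_true, y_pred))     # 0.5  (1 of 2 gold entities found)
print(f1_score(y_true, y_pred))         # ~0.667
```
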
### 4 Training hyper-parameters

* LoRA rank **r**: 16
* LoRA α: 16
* Dropout: 0.10
* Peak learning rate: 5 × 10⁻⁴
* Effective batch size: 16
* Epochs: 5
* Mixed precision: FP16
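
A PEFT/transformers setup matching these values could look like the sketch below; the `target_modules` and `modules_to_save` choices are typical for BERT-style token classification and are assumptions, since the card does not list them.

```python
# Sketch reproducing the hyper-parameters above. target_modules and
# modules_to_save are assumed (not stated in this card).
from peft import LoraConfig, TaskType
from transformers import TrainingArguments

lora_config = LoraConfig(
    task_type=TaskType.TOKEN_CLS,
    r=16,                               # LoRA rank r
    lora_alpha=16,                      # LoRA α
    lora_dropout=0.10,
    target_modules=["query", "value"],  # assumed attention projections
    modules_to_save=["classifier"],     # also train/save the NER head
)

training_args = TrainingArguments(
    output_dir="ottoman-ner-lora",      # hypothetical output path
    learning_rate=5e-4,                 # peak LR 5 × 10⁻⁴
    per_device_train_batch_size=16,     # effective batch size 16
    num_train_epochs=5,
    fp16=True,                          # mixed precision
)
```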