Karez/KHLR

@@ -1,79 +1,79 @@
----
-language:
-  - ar
-license: cc-by-nc-4.0
-tags:
-  - handwritten-text-recognition
-  - arabic
-  - khatt
-  - densenet
-  - transformer
-  - transfer-learning
-  - pytorch
-  - safetensors
-datasets:
-  - KHATT
-  - DASTNUS
-metrics:
-  - cer
-  - wer
-pipeline_tag: image-to-text
----
-# Arabic Handwritten Text Recognition: DenseNet121-Transformer (Fine-tuned on KHATT)
-## Model Description
-A lightweight DenseNet121-Transformer architecture for Arabic handwritten line recognition,
-pre-trained on the Kurdish DASTNUS dataset and fine-tuned on the KHATT Arabic handwritten dataset.
-Uses a triple unified vocabulary covering Kurdish, Arabic, and Urdu scripts (192 tokens).
-## Architecture
-- **CNN Backbone:** DenseNet-121 (pretrained on ImageNet)
-- **Encoder:** 3 Transformer encoder layers
-- **Decoder:** 3 Transformer decoder layers
-- **Attention Heads:** 8
-- **Hidden Size:** 256
-- **Parameters:** ~12.8M
-- **Vocabulary:** 192 tokens (Triple unified: Kurdish + Arabic + Urdu)
-## Transfer Learning Pipeline
-1. Pre-trained on Kurdish DASTNUS dataset (with unified vocabulary)
-2. Fine-tuned on KHATT Arabic handwritten line dataset
-## Performance on KHATT Test Set
-| Metric | Value |
-|--------|-------|
-| CER | 0.1135 |
-| WER | 0.4156 |
-| CRR | 88.65% |
-## Training Data
-- **Pre-training:** DASTNUS Kurdish handwritten dataset
-- **Fine-tuning:** KHATT Arabic handwritten dataset (5,166 training, 574 validation)
-## Usage
-```python
-from safetensors.torch import load_file
-import json
-# Load model weights
-state_dict = load_file("model.safetensors")
-# Load config
-with open("config.json", "r") as f:
-    config = json.load(f)
-# Load vocabulary
-with open("vocab.json", "r", encoding="utf-8") as f:
-    vocab = json.load(f)
-# Load full unified vocabulary info
-with open("unified_vocabulary.json", "r", encoding="utf-8") as f:
-    unified_vocab = json.load(f)
-```
-## Citation
-[]
-## License
-This model is released for non-commercial scientific research purposes only.

+---
+language:
+  - ar
+license: cc-by-nc-4.0
+tags:
+  - handwritten-text-recognition
+  - arabic
+  - khatt
+  - densenet
+  - transformer
+  - transfer-learning
+  - pytorch
+  - safetensors
+datasets:
+  - KHATT
+  - DASTNUS
+metrics:
+  - cer
+  - wer
+pipeline_tag: image-to-text
+---
+# Arabic Handwritten Text Recognition: DenseNet121-Transformer (Fine-tuned on KHATT)
+## Model Description
+A lightweight DenseNet121-Transformer architecture for Arabic handwritten line recognition,
+pre-trained on the Kurdish DASTNUS dataset and fine-tuned on the KHATT Arabic handwritten dataset.
+Uses a triple unified vocabulary covering Kurdish, Arabic, and Urdu scripts (192 tokens). The KHATT dataset is publicly available at https://www.kaggle.com/datasets/iraqyomar/khatt-arabic-hand-written-lines/code
+## Architecture
+- **CNN Backbone:** DenseNet-121 (pretrained on ImageNet)
+- **Encoder:** 3 Transformer encoder layers
+- **Decoder:** 3 Transformer decoder layers
+- **Attention Heads:** 8
+- **Hidden Size:** 256
+- **Parameters:** ~12.8M
+- **Vocabulary:** 192 tokens (Triple unified: Kurdish + Arabic + Urdu)
+## Transfer Learning Pipeline
+1. Pre-trained on Kurdish DASTNUS dataset (with unified vocabulary)
+2. Fine-tuned on KHATT Arabic handwritten line dataset
+## Performance on KHATT Test Set
+| Metric | Value |
+|--------|-------|
+| CER | 0.1135 |
+| WER | 0.4156 |
+| CRR | 88.65% |
+## Training Data
+- **Pre-training:** DASTNUS Kurdish handwritten dataset
+- **Fine-tuning:** KHATT Arabic handwritten dataset (5,166 training, 574 validation)
+## Usage
+```python
+from safetensors.torch import load_file
+import json
+# Load model weights
+state_dict = load_file("model.safetensors")
+# Load config
+with open("config.json", "r") as f:
+    config = json.load(f)
+# Load vocabulary
+with open("vocab.json", "r", encoding="utf-8") as f:
+    vocab = json.load(f)
+# Load full unified vocabulary info
+with open("unified_vocabulary.json", "r", encoding="utf-8") as f:
+    unified_vocab = json.load(f)
+```
+## Citation
+[]
+## License
+This model is released for non-commercial scientific research purposes only.
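
The three figures in the model card's performance table are related, and the sketch below shows one way they could be computed. It assumes the usual edit-distance definitions of CER and WER and takes CRR as 1 - CER, which is consistent with the reported 88.65% against a CER of 0.1135. The card does not publish its evaluation script, so the helper names (`edit_distance`, `error_rate`, `cer`, `wer`, `crr`) and the whitespace word tokenization are assumptions for illustration only.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(refs, hyps, tokenize):
    """Total edit distance divided by total reference length (corpus level)."""
    errors = sum(edit_distance(tokenize(r), tokenize(h)) for r, h in zip(refs, hyps))
    total = sum(len(tokenize(r)) for r in refs)
    return errors / total


def cer(refs, hyps):
    # Character error rate: each Unicode code point is one token.
    return error_rate(refs, hyps, list)


def wer(refs, hyps):
    # Word error rate: whitespace-separated words (an assumption).
    return error_rate(refs, hyps, str.split)


def crr(refs, hyps):
    # Character recognition rate, assumed here to be 1 - CER.
    return 1.0 - cer(refs, hyps)
```

Called on lists of reference and predicted line transcriptions, `cer(refs, hyps)` and `wer(refs, hyps)` return fractions like those in the table. Aggregating at corpus level rather than averaging per line is a design choice of this sketch, and the two approaches can give different numbers.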