![Artiwise ModernBERT](https://huggingface.co/artiwise-ai/modernbert-base-tr-uncased/resolve/main/trmodernbert.webp)

We present Artiwise ModernBERT for Turkish 🎉, a BERT model with a modernized architecture and an extended context size (512 → 8192 tokens).

This model is a Turkish adaptation of ModernBERT, fine-tuned from `answerdotai/ModernBERT-base` using only the Turkish part of CulturaX.

The benchmark results below demonstrate that Artiwise ModernBERT consistently outperforms existing Turkish BERT models across all datasets and masking levels.

| Dataset & Mask Level          | Artiwise ModernBERT | ytu-ce-cosmos/turkish-base-bert-uncased | dbmdz/bert-base-turkish-uncased |
|-------------------------------|---------------------|-----------------------------------------|---------------------------------|
| QA Dataset (5% mask)          | **74.50**           | 60.84                                   | 48.57                           |
| QA Dataset (10% mask)         | **72.18**           | 58.75                                   | 46.29                           |
| QA Dataset (15% mask)         | **69.46**           | 56.50                                   | 44.30                           |
| Review Dataset (5% mask)      | **62.67**           | 48.57                                   | 35.38                           |
| Review Dataset (10% mask)     | **59.60**           | 45.77                                   | 33.04                           |
| Review Dataset (15% mask)     | **56.51**           | 43.05                                   | 31.05                           |
| Biomedical Dataset (5% mask)  | **58.11**           | 50.78                                   | 40.82                           |
| Biomedical Dataset (10% mask) | **55.55**           | 48.37                                   | 38.51                           |
| Biomedical Dataset (15% mask) | **52.71**           | 45.82                                   | 36.44                           |

For each dataset (QA, Reviews, Biomedical) and each masking level (5%, 10%, 15%), we randomly masked the specified percentage of tokens in every input example and then measured each model's ability to correctly predict those masked tokens. All models were evaluated in bfloat16 precision.
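The evaluation procedure described above can be sketched roughly as follows. This is a simplified illustration under the stated setup (random masking at a fixed rate, then scoring recovery of the original tokens); `mask_tokens` and `masked_accuracy` are hypothetical helpers, not the authors' evaluation code:

```python
import random

def mask_tokens(token_ids, mask_id, mask_frac, seed=0):
    """Replace a random mask_frac of positions with mask_id.

    Returns the masked sequence and the chosen positions, so the
    model's predictions at exactly those positions can be scored."""
    rng = random.Random(seed)
    n_mask = max(1, int(len(token_ids) * mask_frac))
    positions = rng.sample(range(len(token_ids)), n_mask)
    masked = list(token_ids)
    for pos in positions:
        masked[pos] = mask_id
    return masked, positions

def masked_accuracy(original, predicted, positions):
    """Fraction of masked positions where the original token was recovered."""
    hits = sum(original[pos] == predicted[pos] for pos in positions)
    return hits / len(positions)
```

In the actual benchmark, `predicted` would come from each model's top-1 fill-mask output at the masked positions, averaged over all examples in the dataset.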

Our experiments used three datasets: the [Turkish Biomedical Corpus](https://huggingface.co/hazal/Turkish-Biomedical-corpus-trM), the [Turkish Product Reviews dataset](https://huggingface.co/fthbrmnby/turkish_product_reviews), and the general-domain QA corpus [turkish_v2](https://huggingface.co/blackerx/turkish_v2).

# Model Usage

Note: torch >= 2.6.0 and transformers >= 4.50.0 are required for the model to function properly.
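A minimal usage sketch with the Transformers `fill-mask` pipeline (the example sentence and top-k handling below are illustrative, not from the model card, and assume the tokenizer's default `[MASK]` token):

```python
from transformers import pipeline

# Load the model from the Hugging Face Hub
# (requires torch >= 2.6.0 and transformers >= 4.50.0, as noted above).
fill = pipeline("fill-mask", model="artiwise-ai/modernbert-base-tr-uncased")

# Turkish for: "The capital of Turkey is [MASK]."
preds = fill("Türkiye'nin başkenti [MASK].")
for p in preds[:3]:
    print(f"{p['token_str']}: {p['score']:.3f}")
```

Each prediction is a dict containing the candidate token (`token_str`), its score, and the filled-in sequence.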