Add library_name metadata and link to GitHub (#1)
- Add library_name metadata and link to GitHub (05c1634c0e0cdca526fd75f963a75331c192ff33)
Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>
README.md CHANGED

````diff
@@ -1,8 +1,11 @@
 ---
+base_model: Qwen/Qwen3-4B
 language:
 - tr
 - en
 license: apache-2.0
+library_name: transformers
+pipeline_tag: text-generation
 tags:
 - text-generation
 - turkish
@@ -14,14 +17,17 @@ tags:
 - continual-pretraining
 - TRUBA
 - MN5
-base_model: Qwen/Qwen3-4B
-pipeline_tag: text-generation
 ---
 
 # Mecellem-Qwen3-4B-TR
 
 [![License: Apache 2.0](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
 
+This repository contains the **Mecellem-Qwen3-4B-TR** model, as presented in the paper [Mecellem Models: Turkish Models Trained from Scratch and Continually Pre-trained for the Legal Domain](https://huggingface.co/papers/2601.16018).
+
+- **GitHub Repository:** [newmindai/mecellem-models](https://github.com/newmindai/mecellem-models)
+- **Paper:** [arXiv:2601.16018](https://arxiv.org/abs/2601.16018)
+
 ## Model Description
 
 Mecellem-Qwen3-4B-TR is a Turkish legal language model adapted through Continual Pre-training (CPT) on Turkish legal and official texts. The model is based on Qwen3-4B decoder architecture (4B parameters) and trained using a single-phase, large-scale CPT process. Unlike the 1.7B model's four-phase curriculum learning, this model employs a single-phase training strategy on a comprehensive dataset, demonstrating that larger model capacity can benefit from direct large-scale domain adaptation.
@@ -177,7 +183,7 @@ If you use this model, please cite our paper:
 ```bibtex
 @article{mecellem2026,
 title={Mecellem Models: Turkish Models Trained from Scratch and Continually Pre-trained for the Legal Domain},
-author={Uğur, Özgür and Göksu, Mahmut and Çimen, Mahmut and Yılmaz, Musa and Şavirdi, Esra and Demir, Alp Talha and Güllüce, Rumeysa and
+author={Uğur, Özgür and Göksu, Mahmut and Çimen, Mahmut and Yılmaz, Musa and Şavirdi, Esra and Demir, Alp Talha and Güllüce, Rumeysa and İclal Çetin, Ömer Can Sağbaş},
 journal={arXiv preprint arXiv:2601.16018},
 year={2026},
 month={January},
@@ -188,6 +194,7 @@ If you use this model, please cite our paper:
 primaryClass={cs.CL}
 }
 ```
+
 ### Base Model References
 
 ```bibtex
@@ -197,4 +204,4 @@ If you use this model, please cite our paper:
 journal={arXiv preprint arXiv:2409.00000},
 year={2024}
 }
-```
+```
````
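
Note that the diff only reorders `base_model` and `pipeline_tag`, which were already present lower in the frontmatter; the genuinely new key is `library_name: transformers`, which tells the Hub which library loads the checkpoint and which "Use this model" snippet to render, while `pipeline_tag: text-generation` places the model under the text-generation task. A minimal usage sketch of what that metadata advertises, assuming the model is published under the Hub repo id `newmindai/Mecellem-Qwen3-4B-TR` (inferred from the model name and the newmindai GitHub organization; the diff does not state the repo id):

```python
# Minimal sketch: load the model through the transformers text-generation
# pipeline, the entry point advertised by `library_name` and `pipeline_tag`.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="newmindai/Mecellem-Qwen3-4B-TR",  # assumed repo id, not stated in the diff
)

# A Turkish legal prompt, matching the model's continual-pretraining domain.
result = generator(
    "Türk Borçlar Kanunu'na göre sözleşmenin kurulması için",
    max_new_tokens=64,
    do_sample=False,
)
print(result[0]["generated_text"])
```

Once the commit is merged, the metadata can be checked programmatically with `huggingface_hub` (same assumed repo id as above):

```python
# Sketch: confirm the new card metadata is live on the Hub.
from huggingface_hub import model_info

info = model_info("newmindai/Mecellem-Qwen3-4B-TR")  # assumed repo id
print(info.library_name)   # expected: "transformers"
print(info.pipeline_tag)   # expected: "text-generation"
print(info.tags)           # should include "turkish", "continual-pretraining"
```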
|