base_model:
- answerdotai/ModernBERT-base
---

# GEM-ModernBERT HQ Legal: A Greek Legal Language Model with Advanced Optimization

## Model Description
**GEM-ModernBERT HQ Legal** is a ModernBERT-base model pre-trained from scratch on a strategically curated 21GB corpus of Greek legal, parliamentary, and governmental text. The model leverages ModernBERT's architectural and training innovations, including **Flash Attention 2**, the **StableAdamW optimizer**, a **1024-token context length**, and **advanced memory optimization** techniques, to deliver strong performance on Greek legal document understanding tasks.
Building upon our proven **quality-based data repetition strategy**, this model incorporates ModernBERT's state-of-the-art training methodology with **30% masking probability**, **trapezoidal learning rate scheduling**, and **optimized batch sizing** for enhanced convergence and performance. The model is specifically designed to handle longer legal documents with its extended 1024-token context window while maintaining computational efficiency through advanced optimization techniques.
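The 30% masking rate mentioned above can be illustrated with a minimal, self-contained sketch of BERT-style token corruption for masked language modeling. This is a generic illustration, not this model's actual data collator: the 80/10/10 mask/random/keep split is the standard BERT recipe and is assumed here, and the `mask_token_id` and `vocab_size` values are placeholders.

```python
import random

def mask_tokens(token_ids, mask_token_id, vocab_size, mlm_probability=0.3, seed=0):
    """BERT-style MLM corruption: select ~mlm_probability of positions;
    of those, 80% become [MASK], 10% become a random token, 10% are kept."""
    rng = random.Random(seed)
    inputs = list(token_ids)
    labels = [-100] * len(token_ids)  # -100 = position ignored by the MLM loss
    for i, tok in enumerate(token_ids):
        if rng.random() < mlm_probability:
            labels[i] = tok  # the model must predict the original token here
            r = rng.random()
            if r < 0.8:
                inputs[i] = mask_token_id        # replace with [MASK]
            elif r < 0.9:
                inputs[i] = rng.randrange(vocab_size)  # replace with random token
            # else: keep the original token (but still predict it)
    return inputs, labels

# Hypothetical token ids; mask_token_id=4 and vocab_size=50000 are placeholders.
ids = list(range(100, 132))
corrupted, labels = mask_tokens(ids, mask_token_id=4, vocab_size=50000)
```

Only positions with a label other than -100 contribute to the loss; all other positions pass through unchanged.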
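The trapezoidal (warmup-stable-decay) learning rate schedule mentioned in the description can be sketched as a plain function of the training step: a linear warmup to the peak rate, a long constant plateau, then a linear decay to zero. The peak learning rate and the 10% warmup/decay fractions below are illustrative assumptions, not values reported for this model.

```python
def trapezoidal_lr(step, total_steps, peak_lr=8e-4, warmup_frac=0.1, decay_frac=0.1):
    """Trapezoidal schedule: linear warmup, constant plateau, linear decay.
    peak_lr and the warmup/decay fractions are placeholder hyperparameters."""
    warmup_steps = int(total_steps * warmup_frac)
    decay_steps = int(total_steps * decay_frac)
    decay_start = total_steps - decay_steps
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)      # linear warmup
    if step < decay_start:
        return peak_lr                                    # constant plateau
    return peak_lr * max(0, total_steps - step) / max(1, decay_steps)  # linear decay
```

Compared with a cosine or linear schedule, the plateau keeps the model at the peak rate for most of training, which is the property the trapezoidal shape is chosen for.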