alexaapo committed on
Commit 515d4d3 · verified · Parent(s): 85f9445

Update README.md

Files changed (1): README.md +2 -2
README.md CHANGED

```diff
@@ -17,11 +17,11 @@ base_model:
   - answerdotai/ModernBERT-base
 ---
 
-# GEM-ModernBERT Legal: A Greek Legal Language Model with Advanced Optimization
+# GEM-ModernBERT HQ Legal: A Greek Legal Language Model with Advanced Optimization
 
 ## Model Description
 
-**GEM-ModernBERT Legal** is a ModernBERT-base model pre-trained from scratch on a strategically curated 21GB corpus of Greek legal, parliamentary, and governmental text. This model leverages ModernBERT's cutting-edge architectural innovations including **Flash Attention 2**, **StableAdamW optimizer**, **1024-token context length**, and **advanced memory optimization** techniques to deliver superior performance on Greek legal document understanding tasks.
+**GEM-ModernBERT HQ Legal** is a ModernBERT-base model pre-trained from scratch on a strategically curated 21GB corpus of Greek legal, parliamentary, and governmental text. This model leverages ModernBERT's cutting-edge architectural innovations including **Flash Attention 2**, **StableAdamW optimizer**, **1024-token context length**, and **advanced memory optimization** techniques to deliver superior performance on Greek legal document understanding tasks.
 
 Building upon our proven **quality-based data repetition strategy**, this model incorporates ModernBERT's state-of-the-art training methodology with **30% masking probability**, **trapezoidal learning rate scheduling**, and **optimized batch sizing** for enhanced convergence and performance. The model is specifically designed to handle longer legal documents with its extended 1024-token context window while maintaining computational efficiency through advanced optimization techniques.
```
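The model card mentions **trapezoidal learning rate scheduling** (linear warmup, a long constant plateau, then a decay phase). The README does not give the exact schedule parameters, so the fractions and the linear decay shape below are illustrative assumptions, not the model's actual training configuration; this is a minimal sketch of how such a schedule can be computed per step:

```python
def trapezoidal_lr(step: int, total_steps: int, peak_lr: float,
                   warmup_frac: float = 0.1, decay_frac: float = 0.1) -> float:
    """Trapezoidal (warmup-stable-decay) LR schedule sketch.

    Phases: linear warmup to peak_lr, constant plateau, linear decay to 0.
    warmup_frac/decay_frac are assumed values, not from the model card.
    """
    warmup_steps = int(total_steps * warmup_frac)
    decay_steps = int(total_steps * decay_frac)
    plateau_end = total_steps - decay_steps

    if step < warmup_steps:                      # ramp up
        return peak_lr * step / max(warmup_steps, 1)
    if step < plateau_end:                       # flat top of the trapezoid
        return peak_lr
    return peak_lr * (total_steps - step) / max(decay_steps, 1)  # ramp down


if __name__ == "__main__":
    # Trace the three phases over a 100-step run.
    for s in (0, 5, 10, 50, 90, 95, 100):
        print(s, trapezoidal_lr(s, total_steps=100, peak_lr=5e-4))
```

The plateau phase is what distinguishes this schedule from cosine or one-cycle decay: most of training runs at the peak rate, and checkpoints taken before the decay phase can be re-decayed later for continued pre-training.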