neavo committed on
Commit 6c867b4 · verified · 1 Parent(s): e99e560

Update README.md

Files changed (1):
1. README.md +10 -8
README.md CHANGED
@@ -9,12 +9,13 @@ license: apache-2.0
 ---
 
 ### Overview
-- ModernBertMultilingual is a multilingual model trained from scratch, using the [ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) architecture.
+- ModernBertMultilingual is a multilingual model trained from scratch, using the [ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) architecture
 - It supports four languages and their variants, including `Simplified Chinese`, `Traditional Chinese`, `English`, `Japanese`, and `Korean`
-- And can effectively handle mixed-text tasks in East Asian languages.
+- It can effectively handle mixed-text tasks in East Asian languages
 
 ### Technical Metrics
-- Trained for approximately `100` hours on `L40*7` devices, with a training volume of about `60B` tokens.
+- Uses a slightly modified vocabulary from the `Qwen2.5` series to support multilingual capabilities
+- Trained for approximately `100` hours on `L40*7` devices, with a training volume of about `60B` tokens
 - Main training parameters:
   - Batch Size: 1792
   - Learning Rate: 5e-04
@@ -22,13 +23,13 @@ license: apache-2.0
   - Optimizer: adamw_torch
   - LR Scheduler: warmup_stable_decay
   - Train Precision: bf16 mix
-- For additional technical metrics, please refer to the original release information and papers of [ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base).
+- For additional technical metrics, please refer to the original release information and papers of [ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base)
 
 ### Release Versions
 - Three different weight versions are provided:
-  - base: The version trained with general base data, suitable for various domain texts (default).
-  - nodecay: The checkpoint before the annealing phase begins, which allows you to add domain-specific data for annealing to better adapt to the target domain.
-  - keyword_gacha_multilingual: The version annealed with ACGN-related texts (e.g., `light novels`, `game scripts`, `comic scripts`, etc.).
+  - base: The version trained on general base data, suitable for texts from a wide range of domains (default)
+  - nodecay: The checkpoint taken before the annealing phase begins, so you can anneal on domain-specific data to better adapt the model to a target domain
+  - keyword_gacha_multilingual: The version annealed on ACGN-related texts (e.g., `light novels`, `game scripts`, `comic scripts`)
 
 | Model | Version | Description |
 | :--: | :--: | :--: |
@@ -37,7 +38,7 @@ license: apache-2.0
 | [keyword_gacha_base_multilingual](https://huggingface.co/neavo/keyword_gacha_base_multilingual) | 20250128 | keyword_gacha_multilingual |
 
 ### Others
-- Training script available on [Github](https://github.com/neavo/KeywordGachaModel).
+- The training script is available on [GitHub](https://github.com/neavo/KeywordGachaModel)
 
 ### Overview
 - ModernBertMultilingual is a multilingual model trained from scratch
@@ -46,6 +47,7 @@ license: apache-2.0
 - It can effectively handle mixed-text tasks in East Asian languages
 
 ### Technical Metrics
+- Uses a slightly modified vocabulary from the `Qwen2.5` series to support multiple languages
 - Trained on `L40*7` devices for about `100` hours, with a training volume of about `60B` tokens
 - Main training parameters
   - Batch Size: 1792
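The stated metrics (~60B tokens in ~100 hours on seven L40s) imply an overall token throughput that is easy to cross-check. A quick sanity-check sketch, using only the README's approximate figures, so the result is a rough estimate rather than a measured number:

```python
# Rough sanity check of the stated training volume: ~60B tokens
# in ~100 hours on an L40*7 setup implies the throughput below.
# All inputs are the approximate figures quoted in the README.
TOKENS = 60e9   # ~60B training tokens
HOURS = 100     # ~100 hours wall-clock
GPUS = 7        # L40*7

tokens_per_sec = TOKENS / (HOURS * 3600)  # aggregate across all GPUs
tokens_per_gpu = tokens_per_sec / GPUS    # per-device throughput
```

This works out to roughly 167,000 tokens/s in aggregate, or about 24,000 tokens/s per device.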
 
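The `warmup_stable_decay` scheduler and the `nodecay` release are easiest to understand together: a WSD schedule holds the learning rate flat after warmup and only anneals it in a final decay segment, so a checkpoint saved just before decay can later be annealed on domain-specific data. A minimal sketch of the schedule shape; the phase lengths are illustrative assumptions, and only the peak learning rate (5e-04) comes from the README:

```python
def wsd_lr(step, peak_lr=5e-4, warmup=1_000, stable=8_000, decay=1_000):
    """Warmup-stable-decay: linear warmup, flat plateau, linear decay.

    Phase lengths are made up for illustration; only the peak learning
    rate (5e-04) is taken from the README's training parameters.
    """
    if step < warmup:              # linear warmup from 0 to the peak
        return peak_lr * step / warmup
    if step < warmup + stable:     # stable phase: hold the peak
        return peak_lr             # (a "nodecay"-style checkpoint is saved here)
    done = step - warmup - stable  # decay phase: anneal linearly to zero
    return peak_lr * max(0.0, 1.0 - done / decay)
```

Under this reading, annealing the `nodecay` weights on your own domain text amounts to resuming from the end of the stable phase and running only the decay segment on that data.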