mpolacek committed (verified) · Commit ca8eb11 · Parent: ae99253

Update README.md

Files changed (1): README.md (+3, −2)

README.md CHANGED
@@ -12,6 +12,7 @@ language:
 - en
 - da
 - de
+- sp
 license: cc-by-4.0
 tags:
 - pretraining
@@ -19,7 +20,7 @@ tags:
 
 # mELECTRA (Multilingual ELECTRA)
 
-mELECTRA is an [Electra](https://arxiv.org/abs/2003.10555)-based model pretrained on a diverse multilingual corpus. It supports multiple languages, including **Swedish (SE), Slovenian (SL), Slovak (SK), Portuguese (PT), Polish (PL), Norwegian (NO), Italian (IT), Croatian (HR), French (FR), English (EN), Danish (DK), German (DE), and Czech (CZ)**. The model can be fine-tuned for various NLP tasks such as text classification, named entity recognition, and masked token prediction.
+mELECTRA is an [Electra](https://arxiv.org/abs/2003.10555)-based model pretrained on a diverse multilingual corpus. It supports multiple languages, including **Swedish (SE), Slovenian (SL), Slovak (SK), Spanish (SP), Portuguese (PT), Polish (PL), Norwegian (NO), Italian (IT), Croatian (HR), French (FR), English (EN), Danish (DK), German (DE), and Czech (CZ)**. The model can be fine-tuned for various NLP tasks such as text classification, named entity recognition, and masked token prediction.
 
 This model is released under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/), allowing commercial use. If you encounter any issues, please visit our [GitHub repository](https://github.com/your-repo/mELECTRA).
 
@@ -28,7 +29,7 @@ This model is released under the [CC BY 4.0 license](https://creativecommons.org
 ## Model Details
 
 - **Architecture:** ELECTRA-Small
-- **Languages Supported:** Swedish, Slovenian, Slovak, Portuguese, Polish, Norwegian, Italian, Croatian, French, English, Danish, German, Czech
+- **Languages Supported:** Swedish, Slovenian, Slovak, Portuguese, Spanish, Polish, Norwegian, Italian, Croatian, French, English, Danish, German, Czech
 - **Pretraining Data:** Multilingual corpus (news articles, Wikipedia, and web texts)
 - **Vocabulary:** SentencePiece-based tokenizer (`m.model`)
 
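The first hunk edits the `language` list in the model card's YAML front matter. A minimal sketch of how such a list parses after this commit — using a hand-rolled parser for illustration only (the Hugging Face Hub actually runs a full YAML parser over model-card metadata):

```python
# Front matter as it reads after this commit's change (`sp` appended).
front_matter = """\
language:
- en
- da
- de
- sp
license: cc-by-4.0
tags:
- pretraining
"""

def parse_languages(text: str) -> list[str]:
    """Collect the '- xx' items that directly follow the 'language:' key.

    Illustrative only: a real YAML parser handles indentation, quoting,
    and nested structures that this sketch ignores.
    """
    langs, in_langs = [], False
    for line in text.splitlines():
        if line.startswith("language:"):
            in_langs = True            # start collecting list items
        elif in_langs and line.startswith("- "):
            langs.append(line[2:].strip())
        else:
            in_langs = False           # another key ends the list

    return langs

print(parse_languages(front_matter))  # ['en', 'da', 'de', 'sp']
```

Note that the commit adds the tag `sp` verbatim; the parse simply reflects whatever codes the front matter lists.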