Update README.md
README.md
CHANGED
|
@@ -9,46 +9,21 @@ pipeline_tag: fill-mask
|
|
| 9 |
This is a BERT-base model trained with romanized Manchu data from scratch.
|
| 10 |
|
| 11 |
|
| 12 |
-
|
| 13 |
-
- **Model type:** [More Information Needed]
|
| 14 |
-
- **Language(s) (NLP):** [More Information Needed]
|
| 15 |
-
- **License:** [More Information Needed]
|
| 16 |
-
- **Finetuned from model [optional]:** [More Information Needed]
|
| 17 |
|
| 18 |
|
| 19 |
|
| 20 |
|
| 21 |
-
## Bias, Risks, and Limitations
|
| 22 |
-
|
| 23 |
-
<!-- This section is meant to convey both technical and sociotechnical limitations. -->
|
| 24 |
-
|
| 25 |
-
|
| 26 |
-
|
| 27 |
-
|
| 28 |
-
## Training Details
|
| 29 |
-
|
| 30 |
-
### Training Data
|
| 31 |
-
|
| 32 |
-
<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
|
| 33 |
-
|
| 34 |
-
[More Information Needed]
|
| 35 |
-
|
| 36 |
-
### Training Procedure
|
| 37 |
-
|
| 38 |
-
<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
|
| 39 |
-
|
| 40 |
-
|
| 41 |
-
#### Training Hyperparameters
|
| 42 |
-
|
| 43 |
-
- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
|
| 44 |
-
|
| 45 |
-
|
| 46 |
-
[More Information Needed]
|
| 47 |
-
|
| 48 |
-
## Evaluation
|
| 49 |
-
|
| 50 |
-
<!-- This section describes the evaluation protocols and provides the results. -->
|
| 51 |
-
|
| 52 |
|
| 53 |
## Citation [optional]
|
| 54 |
|
| 9 |
This is a BERT-base model trained with romanized Manchu data from scratch.
|
| 10 |
|
| 11 |
|
| 12 |
+
# Data
|
| 13 |
|
| 14 |
+
manchuBERT is trained on data augmented with the method described in https://arxiv.org/pdf/2311.17492.pdf
|
| 15 |
|
| 16 |
+
| **Data**                    | **Number of Sentences** |
|
| 17 |
+
|:---------------------------:|:-----------------------:|
|
| 18 |
+
| Manwén Lǎodàng–Taizong      | 2,220                   |
|
| 19 |
+
| Ilan gurun i bithe | 41,904 |
|
| 20 |
+
| Gin ping mei bithe | 21,376 |
|
| 21 |
+
| Yùzhì Qīngwénjiàn           | 11,954                  |
|
| 22 |
+
| Yùzhì Zengdìng Qīngwénjiàn  | 18,420                  |
|
| 23 |
+
| Manwén Lǎodàng–Taizu        | 22,578                  |
|
| 24 |
+
| Manchu-Korean Dictionary | 40,583 |
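
To get a sense of the overall corpus size, the per-source counts in the table can be totaled with a short script. This is only an illustrative sketch: the name `sentence_counts` is hypothetical, and the numbers are copied by hand from the table rather than read from any released dataset.

```python
# Sentence counts copied by hand from the table above.
# `sentence_counts` is a hypothetical name used only for this illustration.
sentence_counts = {
    "Manwén Lǎodàng–Taizong": 2_220,
    "Ilan gurun i bithe": 41_904,
    "Gin ping mei bithe": 21_376,
    "Yùzhì Qīngwénjiàn": 11_954,
    "Yùzhì Zengdìng Qīngwénjiàn": 18_420,
    "Manwén Lǎodàng–Taizu": 22_578,
    "Manchu-Korean Dictionary": 40_583,
}

# Total number of sentences across all sources.
total = sum(sentence_counts.values())

# Print each source with its count and share of the total, largest first.
for name, count in sorted(sentence_counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<30} {count:>7,} {count / total:6.1%}")

print(f"{'Total':<30} {total:>7,}")
```

By this count the table lists 159,035 sentences in total, with Ilan gurun i bithe and the Manchu-Korean Dictionary as the two largest sources.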
|
| 25 |
|
| 26 |
|
| 27 |
|
| 28 |
## Citation [optional]
|
| 29 |
|