Hi!👋
This PR adds some information to the model card, based on the format we are using as part of our effort to standardise model cards at Hugging Face. Feel free to merge if you are OK with the changes! (cc @Marissa @Meg)
README.md
CHANGED
@@ -1,3 +1,71 @@
---
language: zh
---

# Bert-base-chinese

## Table of Contents
- [Model Details](#model-details)
- [Uses](#uses)
- [Risks, Limitations and Biases](#risks-limitations-and-biases)
- [Training](#training)
- [Evaluation](#evaluation)
- [How to Get Started With the Model](#how-to-get-started-with-the-model)

## Model Details
- **Model Description:** This model has been pre-trained for Chinese. During training, random input masking was applied independently to word pieces (as in the original BERT paper).
- **Developed by:** Hugging Face team
- **Model Type:** Fill-Mask
- **Language(s):** Chinese
- **License:** [More Information Needed]
- **Parent Model:** See the [BERT base uncased model](https://huggingface.co/bert-base-uncased) for more information about the BERT base model.

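
The random input masking mentioned in the model description can be illustrated with a short, simplified sketch. It assumes the scheme from the original BERT paper: roughly 15% of word pieces are selected, and each selected piece is replaced by `[MASK]` 80% of the time, by a random vocabulary piece 10% of the time, and left unchanged 10% of the time. The function name `mask_word_pieces`, the toy vocabulary, and the example sentence are illustrative assumptions, not the code used to train this checkpoint.

```python
import random

MASK_TOKEN = "[MASK]"


def mask_word_pieces(pieces, vocab, mask_prob=0.15, rng=None):
    """Return (masked_pieces, labels) for a list of word pieces.

    labels[i] holds the original piece where a prediction is required,
    and None everywhere else.
    """
    rng = rng or random.Random()
    masked, labels = [], []
    for piece in pieces:
        # Each piece is selected independently of its neighbours.
        if rng.random() >= mask_prob:
            masked.append(piece)
            labels.append(None)
            continue
        labels.append(piece)
        roll = rng.random()
        if roll < 0.8:
            masked.append(MASK_TOKEN)         # 80%: replace with [MASK]
        elif roll < 0.9:
            masked.append(rng.choice(vocab))  # 10%: replace with a random piece
        else:
            masked.append(piece)              # 10%: keep the original piece
    return masked, labels


# Toy example: one character per word piece, hypothetical five-piece vocabulary.
pieces = list("今天天气很好")
vocab = ["我", "你", "好", "天", "气"]
masked, labels = mask_word_pieces(pieces, vocab, rng=random.Random(0))
print(masked)
print(labels)
```

Passing a seeded `random.Random` makes the masking reproducible, which is convenient when inspecting examples by hand.
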
## Uses

#### Direct Use

This model can be used for masked language modeling.

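
As a minimal sketch of masked language modeling with this checkpoint, the snippet below wraps the Transformers `fill-mask` pipeline. It assumes that `transformers` and a backend such as PyTorch are installed and that the checkpoint can be downloaded. The helper name `predict_masked`, the `top_k` default, and the single-mask restriction are illustrative choices rather than part of the model card.

```python
MASK_TOKEN = "[MASK]"            # mask token used by BERT tokenizers
MODEL_ID = "bert-base-chinese"   # checkpoint described in this card


def predict_masked(text, top_k=5):
    """Return (token, score) candidates for the single [MASK] in `text`."""
    # Validate early so mistakes surface before the model is downloaded.
    if text.count(MASK_TOKEN) != 1:
        raise ValueError(f"expected exactly one {MASK_TOKEN} in the input")

    # Imported lazily: loading transformers and the checkpoint is slow.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model=MODEL_ID)
    results = fill_mask(text, top_k=top_k)
    return [(r["token_str"], r["score"]) for r in results]
```

Calling `predict_masked("巴黎是[MASK]国的首都。")` should return up to five candidate tokens with their scores. The helper rebuilds the pipeline on every call to stay short; for repeated use, it would make sense to create the pipeline once and reuse it.
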
## Risks, Limitations and Biases
**CONTENT WARNING: Readers should be aware this section contains content that is disturbing, offensive, and can propagate historical and current stereotypes.**

Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)).

## Training

#### Training Procedure
* **type_vocab_size:** 2
* **vocab_size:** 21128
* **num_hidden_layers:** 12

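
The three values above correspond to fields of the checkpoint's configuration. As a rough sketch, assuming the configuration is available as a plain dictionary (for example, parsed from the repository's `config.json`), a small check can confirm that a loaded checkpoint matches this card. The helper `config_mismatches` and the `EXPECTED` table are hypothetical names introduced for illustration.

```python
import json

# Values listed in this model card.
EXPECTED = {"type_vocab_size": 2, "vocab_size": 21128, "num_hidden_layers": 12}


def config_mismatches(config):
    """Return {field: (expected, found)} for every field that differs."""
    return {
        key: (want, config.get(key))
        for key, want in EXPECTED.items()
        if config.get(key) != want
    }


# Stand-in for the contents of a config.json file.
raw = '{"type_vocab_size": 2, "vocab_size": 21128, "num_hidden_layers": 12}'
print(config_mismatches(json.loads(raw)))  # prints {}
```

With Transformers installed, `AutoConfig.from_pretrained("bert-base-chinese").to_dict()` should yield a dictionary that can be passed to the same helper.
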
#### Training Data
[More Information Needed]

## Evaluation

#### Results

[More Information Needed]

## How to Get Started With the Model
```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Load the WordPiece tokenizer that ships with the checkpoint.
tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")

# Load BERT with its masked-language-modeling head.
model = AutoModelForMaskedLM.from_pretrained("bert-base-chinese")
```