How to use from the
Use from the
Transformers library
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("fill-mask", model="Geotrend/bert-base-zh-cased")
# Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("Geotrend/bert-base-zh-cased")
model = AutoModelForMaskedLM.from_pretrained("Geotrend/bert-base-zh-cased")
Quick Links

bert-base-zh-cased

We are sharing smaller versions of bert-base-multilingual-cased that handle a custom number of languages.

Unlike distilbert-base-multilingual-cased, our versions give exactly the same representations produced by the original model which preserves the original accuracy.

For more information please visit our paper: Load What You Need: Smaller Versions of Multilingual BERT.

How to use

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("Geotrend/bert-base-zh-cased")
model = AutoModel.from_pretrained("Geotrend/bert-base-zh-cased")

To generate other smaller versions of multilingual transformers please visit our Github repo.

How to cite

@inproceedings{smallermbert,
  title={Load What You Need: Smaller Versions of Mutlilingual BERT},
  author={Abdaoui, Amine and Pradel, Camille and Sigel, GrΓ©goire},
  booktitle={SustaiNLP / EMNLP},
  year={2020}
}

Contact

Please contact amine@geotrend.fr for any question, feedback or request.

Downloads last month
13
Safetensors
Model size
96M params
Tensor type
I64
Β·
F32
Β·
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Dataset used to train Geotrend/bert-base-zh-cased

Space using Geotrend/bert-base-zh-cased 1