Romanization system of the dataset
#1
by Comet0322 - opened
Hello, I am curious about which Romanization system is used for Manchu in your dataset. I use the Möllendorff system, but I found that characters like ū, š, and ž cannot be tokenized properly.
Abkai Latin transliteration was used. Please refer to our paper for more details.
https://arxiv.org/pdf/2311.17492
Thank you. I will check it out.
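For anyone reproducing the issue described above, here is a minimal sketch of how the problem characters could be spotted before text is passed to the tokenizer. The helper names `non_ascii_chars` and `tokens_per_word` and the sample string are hypothetical and used only for illustration. The second helper assumes the `transformers` library is installed and that the seemdog/manchuBERT tokenizer can be downloaded. It does not convert Möllendorff text to Abkai; the mapping between the two systems should be taken from the paper.

```python
import unicodedata


def non_ascii_chars(text: str) -> dict[str, str]:
    """Map each distinct non-ASCII character in `text` to its Unicode name.

    The text is NFC-normalized first, so a base letter followed by a
    combining mark (e.g. "u" + U+0304) is reported as one composed character.
    """
    text = unicodedata.normalize("NFC", text)
    return {
        ch: unicodedata.name(ch, "UNKNOWN")
        for ch in sorted(set(text))
        if ord(ch) > 127
    }


def tokens_per_word(text: str, model_id: str = "seemdog/manchuBERT") -> dict[str, list[str]]:
    """Return the tokens produced for each whitespace-separated word.

    Hypothetical helper: it needs `transformers` and network access, so the
    import is kept local and the function is not called by default.
    """
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    return {word: tokenizer.tokenize(word) for word in text.split()}


if __name__ == "__main__":
    # Illustrative Möllendorff-style input containing ū and š.
    sample = "abka šun ūlen"
    for ch, name in non_ascii_chars(sample).items():
        print(f"{ch!r}: {name}")
    # Uncomment to inspect the actual tokenizer output (downloads the tokenizer):
    # print(tokens_per_word(sample))
```

Words whose token list contains the tokenizer's unknown token, or whose diacritics have silently disappeared, are the ones that would need to be rewritten in the transliteration the model was trained on.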