Update modeling_ltgbert.py
When initializing LtgbertForTokenClassification, several LayerNorms don't have a weight or bias.
Also, with transformers>=4.40, the two Metaspace entries in tokenizer.json need "prepend_scheme", as follows:
{
  "type": "Metaspace",
  "replacement": "β",
  "add_prefix_space": false,
  "prepend_scheme": "never"
},
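For anyone who wants to apply the same change to a local copy, here is a minimal sketch that adds the field to every Metaspace entry it finds. The helper names (add_prepend_scheme, patch_tokenizer_file) are hypothetical, and the script assumes only that each entry is a JSON object with "type": "Metaspace" somewhere in the file. It is not necessarily how the repository itself was updated.

```python
import json


def add_prepend_scheme(node, scheme="never"):
    """Recursively add "prepend_scheme" to every Metaspace entry lacking it.

    Hypothetical helper: walks nested dicts/lists and returns the number
    of entries that were changed.
    """
    changed = 0
    if isinstance(node, dict):
        if node.get("type") == "Metaspace" and "prepend_scheme" not in node:
            node["prepend_scheme"] = scheme
            changed += 1
        for value in node.values():
            changed += add_prepend_scheme(value, scheme)
    elif isinstance(node, list):
        for item in node:
            changed += add_prepend_scheme(item, scheme)
    return changed


def patch_tokenizer_file(path="tokenizer.json"):
    """Patch a local tokenizer.json in place; returns the number of edits."""
    with open(path, encoding="utf-8") as f:
        config = json.load(f)
    changed = add_prepend_scheme(config)
    if changed:
        # ensure_ascii=False keeps non-ASCII replacement characters readable.
        with open(path, "w", encoding="utf-8") as f:
            json.dump(config, f, ensure_ascii=False, indent=2)
    return changed
```

Calling patch_tokenizer_file() on a downloaded tokenizer.json should report how many entries were touched; running it a second time changes nothing, because entries that already carry "prepend_scheme" are skipped.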
Hi, thank you very much for reporting these issues! I will look into them more next week. We're still discussing what to do about the Metaspace pretokenizer, since its new behavior might silently break more things: https://huggingface.co/HPLT/hplt_bert_base_en/discussions/1
Thank you @davda54 for the new tokenizer.json in https://huggingface.co/HPLT/hplt_bert_base_ja/commit/3ba81b4d5b8885c06c3a0c8f4c7feb79fefee1cb . How about modeling_ltgbert.py?
Hi, I'm really sorry that it took me so long! Thank you once again for your fix. It's now applied to the Japanese BERT as well as to the other HPLT-BERT models :)