benjamin/compoundpiece
How to use benjamin/compoundpiece-stage1 with Transformers:
# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("benjamin/compoundpiece-stage1")
model = AutoModelForSeq2SeqLM.from_pretrained("benjamin/compoundpiece-stage1")

CompoundPiece model trained only on Stage 1 training data (self-supervised training on hyphenated and non-hyphenated words scraped from the web). See the paper "CompoundPiece: Evaluating and Improving Decompounding Performance of Language Models".
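
Once the tokenizer and model are loaded, running a word through the model might look like the sketch below. The helper name `decompound` and the `max_new_tokens` default are hypothetical, and the card does not document the expected input or output format: passing the bare word as input and reading the generated string as the segmentation are assumptions here, so check the paper before relying on it.

```python
def decompound(word, tokenizer, model, max_new_tokens=64):
    """Feed one word to a seq2seq model and return the decoded output string.

    Assumption: the model takes the raw word as input and generates its
    segmentation as text. The card itself does not confirm this format.
    """
    # Tokenize the word into model inputs (PyTorch tensors).
    inputs = tokenizer(word, return_tensors="pt")
    # Generate output token ids; max_new_tokens caps the output length.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode the first (only) sequence, dropping special tokens.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

With `tokenizer` and `model` loaded as shown above, a call would be `decompound("Bundesfinanzministerium", tokenizer, model)`; the example word is illustrative only.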
@article{minixhofer2023compoundpiece,
title={CompoundPiece: Evaluating and Improving Decompounding Performance of Language Models},
author={Minixhofer, Benjamin and Pfeiffer, Jonas and Vuli{\'c}, Ivan},
journal={arXiv preprint arXiv:2305.14214},
year={2023}
}
License: MIT