Utilities for Tokenizers

This page lists the utility functions used by the tokenizers, mainly the class [`~tokenization_utils_base.PreTrainedTokenizerBase`], which implements the methods shared by [`PreTrainedTokenizer`] and [`PreTrainedTokenizerFast`].

Most of these are only useful if you are studying the tokenizer code in the library.

PreTrainedTokenizerBase

[[autodoc]] tokenization_utils_base.PreTrainedTokenizerBase
    - __call__
    - all

Enums and namedtuples

[[autodoc]] tokenization_utils_base.TruncationStrategy
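
As a rough illustration of how the truncation strategy enum is typically used, the sketch below defines a simplified stand-in with the same member names and string values as `TruncationStrategy`, so that it runs without the library installed. The helper `resolve_truncation` is hypothetical and only approximates how a user-facing `truncation` argument (a boolean or a string) might be normalized to an enum member; the library's own resolution logic handles more cases.

```python
from enum import Enum


# Simplified stand-in (assumption): mirrors the member names and values of
# transformers.tokenization_utils_base.TruncationStrategy. In real code, use:
#   from transformers.tokenization_utils_base import TruncationStrategy
class TruncationStrategy(str, Enum):
    ONLY_FIRST = "only_first"
    ONLY_SECOND = "only_second"
    LONGEST_FIRST = "longest_first"
    DO_NOT_TRUNCATE = "do_not_truncate"


def resolve_truncation(truncation):
    """Hypothetical helper: normalize a bool/str/None argument to a strategy."""
    if truncation is True:
        # `truncation=True` is treated as the longest-first strategy.
        return TruncationStrategy.LONGEST_FIRST
    if truncation is False or truncation is None:
        return TruncationStrategy.DO_NOT_TRUNCATE
    # Strings are looked up by value; an unknown string raises ValueError.
    return TruncationStrategy(truncation)


strategy = resolve_truncation("only_second")
print(strategy.name, strategy.value)
```

Looking members up by value keeps the string form and the enum form interchangeable, which is why an unknown string fails early with a `ValueError` instead of being silently ignored.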

[[autodoc]] tokenization_utils_base.CharSpan

[[autodoc]] tokenization_utils_base.TokenSpan
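
The two span types are plain named tuples with `start` and `end` fields, where `end` is exclusive. The sketch below uses stand-in definitions with the same fields so that it runs without the library installed. The helpers `token_to_chars` and `chars_to_tokens` and the hand-written `offsets` list are illustrative assumptions, not the library's implementation; with a fast tokenizer, comparable lookups are available as methods on the returned encoding.

```python
from typing import List, NamedTuple, Tuple


# Stand-ins (assumption): same field names as the library's CharSpan and
# TokenSpan. In real code, use:
#   from transformers.tokenization_utils_base import CharSpan, TokenSpan
class CharSpan(NamedTuple):
    start: int  # index of the first character
    end: int    # index of the character after the last one


class TokenSpan(NamedTuple):
    start: int  # index of the first token
    end: int    # index of the token after the last one


def token_to_chars(offsets: List[Tuple[int, int]], token_index: int) -> CharSpan:
    """Hypothetical helper: character span covered by one token."""
    start, end = offsets[token_index]
    return CharSpan(start, end)


def chars_to_tokens(offsets: List[Tuple[int, int]], span: CharSpan) -> TokenSpan:
    """Hypothetical helper: tokens overlapping a character span."""
    covered = [
        i for i, (s, e) in enumerate(offsets)
        if s < span.end and e > span.start  # half-open interval overlap
    ]
    if not covered:
        raise ValueError(f"no token overlaps {span}")
    return TokenSpan(covered[0], covered[-1] + 1)


text = "Hello world"
offsets = [(0, 5), (6, 11)]  # hand-written offsets for "Hello" and "world"

chars = token_to_chars(offsets, 1)
print(text[chars.start:chars.end])
print(chars_to_tokens(offsets, CharSpan(0, 11)))
```

Because both types follow the half-open convention, a span can be passed directly to Python slicing, as in `text[chars.start:chars.end]`.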