Encode Inputs

These types represent all the different kinds of input that a Tokenizer accepts when using encode_batch().

TextEncodeInput

tokenizers.TextEncodeInput

Represents a textual input for encoding. Can be either a single sequence (str) or a pair of sequences (a Tuple[str, str] or a List[str]).

Alias of Union[str, Tuple[str, str], List[str]].
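
As a minimal sketch of the shapes this alias covers, the snippet below rebuilds the Union locally with the standard typing module rather than importing it from tokenizers, so it runs without the library installed. The variable names and sample strings are illustrative only.

```python
from typing import List, Tuple, Union

# Local mirror of the alias documented above (assumption: defined here only
# for illustration; the real alias is tokenizers.TextEncodeInput).
TextEncodeInput = Union[str, Tuple[str, str], List[str]]

# A single sequence
single: TextEncodeInput = "Hello, world!"

# A pair of sequences, given as a tuple
pair_tuple: TextEncodeInput = ("What is this?", "A tokenizer.")

# A pair of sequences, given as a list
pair_list: TextEncodeInput = ["What is this?", "A tokenizer."]

# A batch is simply a list of such inputs
text_batch: List[TextEncodeInput] = [single, pair_tuple, pair_list]
```

A list like text_batch has the shape one would pass to encode_batch() for raw text.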

PreTokenizedEncodeInput

tokenizers.PreTokenizedEncodeInput

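Represents a pre-tokenized input for encoding. Can be either a single pre-tokenized sequence (a List[str] or a Tuple[str]) or a pair of such sequences, given as a Tuple or a List.

Alias of Union[List[str], Tuple[str], Tuple[Union[List[str], Tuple[str]], Union[List[str], Tuple[str]]], List[Union[List[str], Tuple[str]]]].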
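
The nested Union is easier to read as a runtime check. The sketch below uses two hypothetical helpers (is_pretokenized_sequence and is_pretokenized_encode_input are not part of tokenizers) and assumes that a pair always has exactly two members.

```python
def is_pretokenized_sequence(value) -> bool:
    """True for a list or tuple whose items are all strings (already-split words)."""
    return isinstance(value, (list, tuple)) and all(
        isinstance(word, str) for word in value
    )


def is_pretokenized_encode_input(value) -> bool:
    """True for one pre-tokenized sequence, or a pair of them (assumed size 2)."""
    if is_pretokenized_sequence(value):
        return True
    return (
        isinstance(value, (list, tuple))
        and len(value) == 2
        and all(is_pretokenized_sequence(seq) for seq in value)
    )


# One sequence, as a list or as a tuple of words
single_list = ["Hello", ",", "world", "!"]
single_tuple = ("Hello", "world")

# A pair of sequences, as a tuple or as a list
pair_tuple = (["What", "is", "this", "?"], ["A", "tokenizer", "."])
pair_list = [["What", "is", "this", "?"], ("A", "tokenizer", ".")]
```

A raw string such as "Hello world" fails this check, since it has not been split into words.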

EncodeInput

tokenizers.EncodeInput

Represents all the possible types of input for encoding. Can be any TextEncodeInput or any PreTokenizedEncodeInput.

Alias of Union[str, Tuple[str, str], List[str], Tuple[str], Tuple[Union[List[str], Tuple[str]], Union[List[str], Tuple[str]]], List[Union[List[str], Tuple[str]]]].
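
Because List[str] appears in both the textual and the pre-tokenized forms, a value such as ["Hello", "world"] is ambiguous by type alone. The library resolves this, as far as I recall, with an is_pretokenized flag on encode_batch(); that flag is not described on this page, so treat it as an assumption. The sketch below uses a hypothetical describe helper (not part of tokenizers) to show how the same value reads under each setting.

```python
def describe(value, is_pretokenized: bool = False) -> str:
    """Name the interpretation of an encode input (assumes the input is valid)."""
    if is_pretokenized:
        if isinstance(value, str):
            raise TypeError("a raw string is not a pre-tokenized input")
        # All items are words: one sequence. Otherwise: a pair of sequences.
        if all(isinstance(word, str) for word in value):
            return "single pre-tokenized sequence"
        return "pair of pre-tokenized sequences"
    # Textual input: a str is one sequence, anything else is a pair.
    if isinstance(value, str):
        return "single text sequence"
    return "pair of text sequences"


ambiguous = ["Hello", "world"]
as_text = describe(ambiguous)                        # read as a pair of texts
as_words = describe(ambiguous, is_pretokenized=True) # read as one split sequence
```

The design point is that the flag, not the Python type, decides which half of the EncodeInput union applies.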

The Rust API Reference is available directly on the Docs.rs website.

The node API has not been documented yet.
