.. entities:: python

    :global:

    class
        class
    classmethod
        class method
    Tokenizer
        :class:`~tokenizers.Tokenizer`
    Tokenizer.train
        :meth:`~tokenizers.Tokenizer.train`
    Tokenizer.save
        :meth:`~tokenizers.Tokenizer.save`
    Tokenizer.from_file
        :meth:`~tokenizers.Tokenizer.from_file`
    Tokenizer.encode
        :meth:`~tokenizers.Tokenizer.encode`
    Tokenizer.encode_batch
        :meth:`~tokenizers.Tokenizer.encode_batch`
    Tokenizer.decode
        :meth:`~tokenizers.Tokenizer.decode`
    Tokenizer.decode_batch
        :meth:`~tokenizers.Tokenizer.decode_batch`
    Tokenizer.token_to_id
        :meth:`~tokenizers.Tokenizer.token_to_id`
    Tokenizer.enable_padding
        :meth:`~tokenizers.Tokenizer.enable_padding`
    Encoding
        :class:`~tokenizers.Encoding`
    TemplateProcessing
        :class:`~tokenizers.processors.TemplateProcessing`
    Normalizer
        :class:`~tokenizers.normalizers.Normalizer`
    normalizers.Sequence
        :class:`~tokenizers.normalizers.Sequence`
    pre_tokenizers.Whitespace
        :class:`~tokenizers.pre_tokenizers.Whitespace`
    PreTokenizer
        :class:`~tokenizers.pre_tokenizers.PreTokenizer`
    models.BPE
        :class:`~tokenizers.models.BPE`
    models.Unigram
        :class:`~tokenizers.models.Unigram`
    models.WordLevel
        :class:`~tokenizers.models.WordLevel`
    models.WordPiece
        :class:`~tokenizers.models.WordPiece`
    Decoder
        :class:`~tokenizers.decoders.Decoder`

.. entities:: rust

    :global:

    class
        struct
    classmethod
        static method
    Tokenizer
        :rust_struct:`~tokenizers::tokenizer::Tokenizer`
    Tokenizer.train
        :rust_meth:`~tokenizers::tokenizer::Tokenizer::train`
    Tokenizer.save
        :rust_meth:`~tokenizers::tokenizer::Tokenizer::save`
    Tokenizer.from_file
        :rust_meth:`~tokenizers::tokenizer::Tokenizer::from_file`
    Tokenizer.encode
        :rust_meth:`~tokenizers::tokenizer::Tokenizer::encode`
    Tokenizer.encode_batch
        :rust_meth:`~tokenizers::tokenizer::Tokenizer::encode_batch`
    Tokenizer.decode
        :rust_meth:`~tokenizers::tokenizer::Tokenizer::decode`
    Tokenizer.decode_batch
        :rust_meth:`~tokenizers::tokenizer::Tokenizer::decode_batch`
    Tokenizer.token_to_id
        :rust_meth:`~tokenizers::tokenizer::Tokenizer::token_to_id`
    Tokenizer.enable_padding
        :rust_meth:`~tokenizers::tokenizer::Tokenizer::enable_padding`
    Encoding
        :rust_struct:`~tokenizers::tokenizer::Encoding`
    TemplateProcessing
        :rust_struct:`~tokenizers::processors::template::TemplateProcessing`
    Normalizer
        :rust_trait:`~tokenizers::tokenizer::Normalizer`
    normalizers.Sequence
        :rust_struct:`~tokenizers::normalizers::utils::Sequence`
    pre_tokenizers.Whitespace
        :rust_struct:`~tokenizers::pre_tokenizers::whitespace::Whitespace`
    PreTokenizer
        :rust_trait:`~tokenizers::tokenizer::PreTokenizer`
    models.BPE
        :rust_struct:`~tokenizers::models::bpe::BPE`
    models.Unigram
        :rust_struct:`~tokenizers::models::unigram::Unigram`
    models.WordLevel
        :rust_struct:`~tokenizers::models::wordlevel::WordLevel`
    models.WordPiece
        :rust_struct:`~tokenizers::models::wordpiece::WordPiece`
    Decoder
        :rust_trait:`~tokenizers::tokenizer::Decoder`

.. entities:: node

    :global:

    class
        class
    classmethod
        static method
    Tokenizer
        :obj:`Tokenizer`
    Tokenizer.train
        :obj:`Tokenizer.train()`
    Tokenizer.save
        :obj:`Tokenizer.save()`
    Tokenizer.from_file
        :obj:`Tokenizer.fromFile()`
    Tokenizer.encode
        :obj:`Tokenizer.encode()`
    Tokenizer.encode_batch
        :obj:`Tokenizer.encodeBatch()`
    Tokenizer.decode
        :obj:`Tokenizer.decode()`
    Tokenizer.decode_batch
        :obj:`Tokenizer.decodeBatch()`
    Tokenizer.token_to_id
        :obj:`Tokenizer.tokenToId()`
    Tokenizer.enable_padding
        :obj:`Tokenizer.setPadding()`
    Encoding
        :obj:`Encoding`
    TemplateProcessing
        :obj:`TemplateProcessing`
    Normalizer
        :obj:`Normalizer`
    normalizers.Sequence
        :obj:`Sequence`
    pre_tokenizers.Whitespace
        :obj:`Whitespace`
    PreTokenizer
        :obj:`PreTokenizer`
    models.BPE
        :obj:`BPE`
    models.Unigram
        :obj:`Unigram`
    models.WordLevel
        :obj:`WordLevel`
    models.WordPiece
        :obj:`WordPiece`
    Decoder
        :obj:`Decoder`