JinghuiLuAstronaut committed
Commit 9566c86 · verified · 1 Parent(s): 102c73a

Add files using upload-large-folder tool

Files changed (20)
  1. LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/huggingface_hub-1.16.4.dist-info/licenses/LICENSE +201 -0
  2. LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/safetensors/__init__.py +10 -0
  3. LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/safetensors/__init__.pyi +164 -0
  4. LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/safetensors/paddle.py +290 -0
  5. LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/safetensors/py.typed +0 -0
  6. LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/transformers/models/code_llama/__init__.py +26 -0
  7. LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/transformers/models/code_llama/tokenization_code_llama.py +358 -0
  8. LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/transformers/models/mbart/__init__.py +28 -0
  9. LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/transformers/models/mbart/tokenization_mbart.py +208 -0
  10. LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/typer-0.25.1.dist-info/INSTALLER +1 -0
  11. LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/typer-0.25.1.dist-info/METADATA +408 -0
  12. LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/typer-0.25.1.dist-info/RECORD +26 -0
  13. LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/typer-0.25.1.dist-info/REQUESTED +0 -0
  14. LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/typer-0.25.1.dist-info/WHEEL +4 -0
  15. LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/typer-0.25.1.dist-info/entry_points.txt +5 -0
  16. LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/typing_extensions-4.15.0.dist-info/METADATA +72 -0
  17. LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/typing_extensions-4.15.0.dist-info/RECORD +7 -0
  18. LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/typing_extensions-4.15.0.dist-info/REQUESTED +0 -0
  19. LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/typing_extensions-4.15.0.dist-info/WHEEL +4 -0
  20. LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/typing_extensions-4.15.0.dist-info/licenses/LICENSE +279 -0
LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/huggingface_hub-1.16.4.dist-info/licenses/LICENSE ADDED
@@ -0,0 +1,201 @@
+ Apache License
+ Version 2.0, January 2004
+ http://www.apache.org/licenses/
+
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+ 1. Definitions.
+
+ "License" shall mean the terms and conditions for use, reproduction,
+ and distribution as defined by Sections 1 through 9 of this document.
+
+ "Licensor" shall mean the copyright owner or entity authorized by
+ the copyright owner that is granting the License.
+
+ "Legal Entity" shall mean the union of the acting entity and all
+ other entities that control, are controlled by, or are under common
+ control with that entity. For the purposes of this definition,
+ "control" means (i) the power, direct or indirect, to cause the
+ direction or management of such entity, whether by contract or
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
+ outstanding shares, or (iii) beneficial ownership of such entity.
+
+ "You" (or "Your") shall mean an individual or Legal Entity
+ exercising permissions granted by this License.
+
+ "Source" form shall mean the preferred form for making modifications,
+ including but not limited to software source code, documentation
+ source, and configuration files.
+
+ "Object" form shall mean any form resulting from mechanical
+ transformation or translation of a Source form, including but
+ not limited to compiled object code, generated documentation,
+ and conversions to other media types.
+
+ "Work" shall mean the work of authorship, whether in Source or
+ Object form, made available under the License, as indicated by a
+ copyright notice that is included in or attached to the work
+ (an example is provided in the Appendix below).
+
+ "Derivative Works" shall mean any work, whether in Source or Object
+ form, that is based on (or derived from) the Work and for which the
+ editorial revisions, annotations, elaborations, or other modifications
+ represent, as a whole, an original work of authorship. For the purposes
+ of this License, Derivative Works shall not include works that remain
+ separable from, or merely link (or bind by name) to the interfaces of,
+ the Work and Derivative Works thereof.
+
+ "Contribution" shall mean any work of authorship, including
+ the original version of the Work and any modifications or additions
+ to that Work or Derivative Works thereof, that is intentionally
+ submitted to Licensor for inclusion in the Work by the copyright owner
+ or by an individual or Legal Entity authorized to submit on behalf of
+ the copyright owner. For the purposes of this definition, "submitted"
+ means any form of electronic, verbal, or written communication sent
+ to the Licensor or its representatives, including but not limited to
+ communication on electronic mailing lists, source code control systems,
+ and issue tracking systems that are managed by, or on behalf of, the
+ Licensor for the purpose of discussing and improving the Work, but
+ excluding communication that is conspicuously marked or otherwise
+ designated in writing by the copyright owner as "Not a Contribution."
+
+ "Contributor" shall mean Licensor and any individual or Legal Entity
+ on behalf of whom a Contribution has been received by Licensor and
+ subsequently incorporated within the Work.
+
+ 2. Grant of Copyright License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ copyright license to reproduce, prepare Derivative Works of,
+ publicly display, publicly perform, sublicense, and distribute the
+ Work and such Derivative Works in Source or Object form.
+
+ 3. Grant of Patent License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ (except as stated in this section) patent license to make, have made,
+ use, offer to sell, sell, import, and otherwise transfer the Work,
+ where such license applies only to those patent claims licensable
+ by such Contributor that are necessarily infringed by their
+ Contribution(s) alone or by combination of their Contribution(s)
+ with the Work to which such Contribution(s) was submitted. If You
+ institute patent litigation against any entity (including a
+ cross-claim or counterclaim in a lawsuit) alleging that the Work
+ or a Contribution incorporated within the Work constitutes direct
+ or contributory patent infringement, then any patent licenses
+ granted to You under this License for that Work shall terminate
+ as of the date such litigation is filed.
+
+ 4. Redistribution. You may reproduce and distribute copies of the
+ Work or Derivative Works thereof in any medium, with or without
+ modifications, and in Source or Object form, provided that You
+ meet the following conditions:
+
+ (a) You must give any other recipients of the Work or
+ Derivative Works a copy of this License; and
+
+ (b) You must cause any modified files to carry prominent notices
+ stating that You changed the files; and
+
+ (c) You must retain, in the Source form of any Derivative Works
+ that You distribute, all copyright, patent, trademark, and
+ attribution notices from the Source form of the Work,
+ excluding those notices that do not pertain to any part of
+ the Derivative Works; and
+
+ (d) If the Work includes a "NOTICE" text file as part of its
+ distribution, then any Derivative Works that You distribute must
+ include a readable copy of the attribution notices contained
+ within such NOTICE file, excluding those notices that do not
+ pertain to any part of the Derivative Works, in at least one
+ of the following places: within a NOTICE text file distributed
+ as part of the Derivative Works; within the Source form or
+ documentation, if provided along with the Derivative Works; or,
+ within a display generated by the Derivative Works, if and
+ wherever such third-party notices normally appear. The contents
+ of the NOTICE file are for informational purposes only and
+ do not modify the License. You may add Your own attribution
+ notices within Derivative Works that You distribute, alongside
+ or as an addendum to the NOTICE text from the Work, provided
+ that such additional attribution notices cannot be construed
+ as modifying the License.
+
+ You may add Your own copyright statement to Your modifications and
+ may provide additional or different license terms and conditions
+ for use, reproduction, or distribution of Your modifications, or
+ for any such Derivative Works as a whole, provided Your use,
+ reproduction, and distribution of the Work otherwise complies with
+ the conditions stated in this License.
+
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
+ any Contribution intentionally submitted for inclusion in the Work
+ by You to the Licensor shall be under the terms and conditions of
+ this License, without any additional terms or conditions.
+ Notwithstanding the above, nothing herein shall supersede or modify
+ the terms of any separate license agreement you may have executed
+ with Licensor regarding such Contributions.
+
+ 6. Trademarks. This License does not grant permission to use the trade
+ names, trademarks, service marks, or product names of the Licensor,
+ except as required for reasonable and customary use in describing the
+ origin of the Work and reproducing the content of the NOTICE file.
+
+ 7. Disclaimer of Warranty. Unless required by applicable law or
+ agreed to in writing, Licensor provides the Work (and each
+ Contributor provides its Contributions) on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied, including, without limitation, any warranties or conditions
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+ PARTICULAR PURPOSE. You are solely responsible for determining the
+ appropriateness of using or redistributing the Work and assume any
+ risks associated with Your exercise of permissions under this License.
+
+ 8. Limitation of Liability. In no event and under no legal theory,
+ whether in tort (including negligence), contract, or otherwise,
+ unless required by applicable law (such as deliberate and grossly
+ negligent acts) or agreed to in writing, shall any Contributor be
+ liable to You for damages, including any direct, indirect, special,
+ incidental, or consequential damages of any character arising as a
+ result of this License or out of the use or inability to use the
+ Work (including but not limited to damages for loss of goodwill,
+ work stoppage, computer failure or malfunction, or any and all
+ other commercial damages or losses), even if such Contributor
+ has been advised of the possibility of such damages.
+
+ 9. Accepting Warranty or Additional Liability. While redistributing
+ the Work or Derivative Works thereof, You may choose to offer,
+ and charge a fee for, acceptance of support, warranty, indemnity,
+ or other liability obligations and/or rights consistent with this
+ License. However, in accepting such obligations, You may act only
+ on Your own behalf and on Your sole responsibility, not on behalf
+ of any other Contributor, and only if You agree to indemnify,
+ defend, and hold each Contributor harmless for any liability
+ incurred by, or claims asserted against, such Contributor by reason
+ of your accepting any such warranty or additional liability.
+
+ END OF TERMS AND CONDITIONS
+
+ APPENDIX: How to apply the Apache License to your work.
+
+ To apply the Apache License to your work, attach the following
+ boilerplate notice, with the fields enclosed by brackets "[]"
+ replaced with your own identifying information. (Don't include
+ the brackets!) The text should be enclosed in the appropriate
+ comment syntax for the file format. We also recommend that a
+ file or class name and description of purpose be included on the
+ same "printed page" as the copyright notice for easier
+ identification within third-party archives.
+
+ Copyright [yyyy] [name of copyright owner]
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/safetensors/__init__.py ADDED
@@ -0,0 +1,10 @@
+ # Re-export this
+ from ._safetensors_rust import (  # noqa: F401
+     SafetensorError,
+     __version__,
+     deserialize,
+     safe_open,
+     _safe_open_handle,
+     serialize,
+     serialize_file,
+ )
LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/safetensors/__init__.pyi ADDED
@@ -0,0 +1,164 @@
+ # Generated content DO NOT EDIT
+ @staticmethod
+ def deserialize(bytes):
+     """
+     Opens a safetensors lazily and returns tensors as asked
+
+     Args:
+         data (`bytes`):
+             The byte content of a file
+
+     Returns:
+         (`List[str, Dict[str, Dict[str, any]]]`):
+             The deserialized content is like:
+                 [("tensor_name", {"shape": [2, 3], "dtype": "F32", "data": b"\0\0.." }), (...)]
+     """
+     pass
+
+ @staticmethod
+ def serialize(tensor_dict, metadata=None):
+     """
+     Serializes raw data.
+
+     Args:
+         tensor_dict (`Dict[str, Dict[Any]]`):
+             The tensor dict is like:
+                 {"tensor_name": {"dtype": "F32", "shape": [2, 3], "data": b"\0\0"}}
+         metadata (`Dict[str, str]`, *optional*):
+             The optional purely text annotations
+
+     Returns:
+         (`bytes`):
+             The serialized content.
+     """
+     pass
+
+ @staticmethod
+ def serialize_file(tensor_dict, filename, metadata=None):
+     """
+     Serializes raw data into file.
+
+     Args:
+         tensor_dict (`Dict[str, Dict[Any]]`):
+             The tensor dict is like:
+                 {"tensor_name": {"dtype": "F32", "shape": [2, 3], "data": b"\0\0"}}
+         filename (`str`, or `os.PathLike`):
+             The name of the file to write into.
+         metadata (`Dict[str, str]`, *optional*):
+             The optional purely text annotations
+
+     Returns:
+         (`NoneType`):
+             On success return None
+     """
+     pass
+
+ class safe_open:
+     """
+     Opens a safetensors lazily and returns tensors as asked
+
+     Args:
+         filename (`str`, or `os.PathLike`):
+             The filename to open
+
+         framework (`str`):
+             The framework you want you tensors in. Supported values:
+             `pt`, `tf`, `flax`, `numpy`.
+
+         device (`str`, defaults to `"cpu"`):
+             The device on which you want the tensors.
+     """
+     def __init__(self, filename, framework, device=...):
+         pass
+
+     def __enter__(self):
+         """
+         Start the context manager
+         """
+         pass
+
+     def __exit__(self, _exc_type, _exc_value, _traceback):
+         """
+         Exits the context manager
+         """
+         pass
+
+     def get_slice(self, name):
+         """
+         Returns a full slice view object
+
+         Args:
+             name (`str`):
+                 The name of the tensor you want
+
+         Returns:
+             (`PySafeSlice`):
+                 A dummy object you can slice into to get a real tensor
+         Example:
+         ```python
+         from safetensors import safe_open
+
+         with safe_open("model.safetensors", framework="pt", device=0) as f:
+             tensor_part = f.get_slice("embedding")[:, ::8]
+
+         ```
+         """
+         pass
+
+     def get_tensor(self, name):
+         """
+         Returns a full tensor
+
+         Args:
+             name (`str`):
+                 The name of the tensor you want
+
+         Returns:
+             (`Tensor`):
+                 The tensor in the framework you opened the file for.
+
+         Example:
+         ```python
+         from safetensors import safe_open
+
+         with safe_open("model.safetensors", framework="pt", device=0) as f:
+             tensor = f.get_tensor("embedding")
+
+         ```
+         """
+         pass
+
+     def keys(self):
+         """
+         Returns the names of the tensors in the file.
+
+         Returns:
+             (`List[str]`):
+                 The name of the tensors contained in that file
+         """
+         pass
+
+     def metadata(self):
+         """
+         Return the special non tensor information in the header
+
+         Returns:
+             (`Dict[str, str]`):
+                 The freeform metadata.
+         """
+         pass
+
+     def offset_keys(self):
+         """
+         Returns the names of the tensors in the file, ordered by offset.
+
+         Returns:
+             (`List[str]`):
+                 The name of the tensors contained in that file
+         """
+         pass
+
+ class SafetensorError(Exception):
+     """
+     Custom Python Exception for Safetensor errors.
+     """
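The stub above describes `serialize` and `deserialize` in terms of plain dicts carrying `dtype`, `shape`, and `data`. As a rough illustration of the container those functions read and write, the sketch below packs and unpacks that structure in pure Python. It assumes the published safetensors layout (an 8-byte little-endian header length, a JSON header with `data_offsets` per tensor, then the raw byte buffer). The helper names `pack` and `unpack` are hypothetical; this is not the Rust implementation, and it skips validation and alignment padding.

```python
import json
import struct


def pack(tensor_dict, metadata=None):
    """Hypothetical helper: build safetensors-style bytes from raw tensor dicts."""
    header = {}
    if metadata is not None:
        # Free-form text annotations live under a reserved key in the header.
        header["__metadata__"] = metadata
    buffer = b""
    for name, info in tensor_dict.items():
        start = len(buffer)
        buffer += info["data"]
        header[name] = {
            "dtype": info["dtype"],
            "shape": list(info["shape"]),
            "data_offsets": [start, len(buffer)],  # relative to the end of the header
        }
    header_bytes = json.dumps(header, separators=(",", ":")).encode("utf-8")
    # Layout: u64 little-endian header size, JSON header, then tensor bytes.
    return struct.pack("<Q", len(header_bytes)) + header_bytes + buffer


def unpack(blob):
    """Hypothetical helper: return [(name, {"shape", "dtype", "data"}), ...]."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    header = json.loads(blob[8 : 8 + header_len])
    payload = blob[8 + header_len :]
    result = []
    for name, info in header.items():
        if name == "__metadata__":
            continue
        start, end = info["data_offsets"]
        result.append(
            (name, {"shape": info["shape"], "dtype": info["dtype"], "data": payload[start:end]})
        )
    return result
```

The return shape of `unpack` mirrors the list of `(name, dict)` pairs shown in the `deserialize` docstring, which is also what `_view2paddle` iterates over in `paddle.py`.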
LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/safetensors/paddle.py ADDED
@@ -0,0 +1,290 @@
+ import os
+ import sys
+ from typing import Any, Dict, Optional, Union
+
+ import numpy as np
+ import paddle
+
+ from safetensors import numpy, deserialize, safe_open, serialize, serialize_file
+
+
+ def save(
+     tensors: Dict[str, paddle.Tensor], metadata: Optional[Dict[str, str]] = None
+ ) -> bytes:
+     """
+     Saves a dictionary of tensors into raw bytes in safetensors format.
+
+     Args:
+         tensors (`Dict[str, paddle.Tensor]`):
+             The incoming tensors. Tensors need to be contiguous and dense.
+         metadata (`Dict[str, str]`, *optional*, defaults to `None`):
+             Optional text only metadata you might want to save in your header.
+             For instance it can be useful to specify more about the underlying
+             tensors. This is purely informative and does not affect tensor loading.
+
+     Returns:
+         `bytes`: The raw bytes representing the format
+
+     Example:
+
+     ```python
+     from safetensors.paddle import save
+     import paddle
+
+     tensors = {"embedding": paddle.zeros((512, 1024)), "attention": paddle.zeros((256, 256))}
+     byte_data = save(tensors)
+     ```
+     """
+     serialized = serialize(_flatten(tensors), metadata=metadata)
+     result = bytes(serialized)
+     return result
+
+
+ def save_file(
+     tensors: Dict[str, paddle.Tensor],
+     filename: Union[str, os.PathLike],
+     metadata: Optional[Dict[str, str]] = None,
+ ) -> None:
+     """
+     Saves a dictionary of tensors into raw bytes in safetensors format.
+
+     Args:
+         tensors (`Dict[str, paddle.Tensor]`):
+             The incoming tensors. Tensors need to be contiguous and dense.
+         filename (`str`, or `os.PathLike`)):
+             The filename we're saving into.
+         metadata (`Dict[str, str]`, *optional*, defaults to `None`):
+             Optional text only metadata you might want to save in your header.
+             For instance it can be useful to specify more about the underlying
+             tensors. This is purely informative and does not affect tensor loading.
+
+     Returns:
+         `None`
+
+     Example:
+
+     ```python
+     from safetensors.paddle import save_file
+     import paddle
+
+     tensors = {"embedding": paddle.zeros((512, 1024)), "attention": paddle.zeros((256, 256))}
+     save_file(tensors, "model.safetensors")
+     ```
+     """
+     serialize_file(_flatten(tensors), filename, metadata=metadata)
+
+
+ def load(data: bytes, device: str = "cpu") -> Dict[str, paddle.Tensor]:
+     """
+     Loads a safetensors file into paddle format from pure bytes.
+
+     Args:
+         data (`bytes`):
+             The content of a safetensors file
+
+     Returns:
+         `Dict[str, paddle.Tensor]`: dictionary that contains name as key, value as `paddle.Tensor` on cpu
+
+     Example:
+
+     ```python
+     from safetensors.paddle import load
+
+     file_path = "./my_folder/bert.safetensors"
+     with open(file_path, "rb") as f:
+         data = f.read()
+
+     loaded = load(data)
+     ```
+     """
+     if paddle.__version__ >= "3.2.0":
+         flat = deserialize(data)
+         return _view2paddle(flat, device)
+     else:
+         flat = numpy.load(data)
+         return _np2paddle(flat, device)
+
+
+ def load_file(
+     filename: Union[str, os.PathLike], device="cpu"
+ ) -> Dict[str, paddle.Tensor]:
+     """
+     Loads a safetensors file into paddle format.
+
+     Args:
+         filename (`str`, or `os.PathLike`)):
+             The name of the file which contains the tensors
+         device (`Union[Dict[str, any], str]`, *optional*, defaults to `cpu`):
+             The device where the tensors need to be located after load.
+             available options are all regular paddle device locations
+
+     Returns:
+         `Dict[str, paddle.Tensor]`: dictionary that contains name as key, value as `paddle.Tensor`
+
+     Example:
+
+     ```python
+     from safetensors.paddle import load_file
+
+     file_path = "./my_folder/bert.safetensors"
+     loaded = load_file(file_path)
+     ```
+     """
+     result = {}
+     if paddle.__version__ >= "3.2.0":
+         with safe_open(filename, framework="paddle", device=device) as f:
+             for k in f.offset_keys():
+                 result[k] = f.get_tensor(k)
+     else:
+         flat = numpy.load_file(filename)
+         result = _np2paddle(flat, device)
+     return result
+
+
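Both `load` and `load_file` branch on `paddle.__version__ >= "3.2.0"`, which compares the two strings character by character. Under that rule a hypothetical release numbered `3.10.0` would sort before `3.2.0` and fall into the NumPy fallback branch. The sketch below shows the pitfall and one stdlib-only alternative; `version_tuple` is an illustrative helper that is not part of safetensors or paddle, and it keeps only the leading numeric part of each component.

```python
import re


def version_tuple(version: str) -> tuple:
    """Hypothetical helper: turn '3.2.0' or '3.2.0rc1' into a tuple of ints."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first component with no leading digits
        parts.append(int(match.group()))
    return tuple(parts)


# String comparison looks at characters, so "1" < "2" decides the result here.
lexicographic = "3.10.0" >= "3.2.0"

# Tuple comparison looks at integers, so 10 > 2 decides the result here.
numeric = version_tuple("3.10.0") >= version_tuple("3.2.0")

print(lexicographic, numeric)
```

Where the `packaging` distribution is available, `packaging.version.Version` is the more usual way to do this kind of comparison, since it also understands pre-release and post-release suffixes.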
+ def _np2paddle(
+     numpy_dict: Dict[str, np.ndarray], device: str = "cpu"
+ ) -> Dict[str, paddle.Tensor]:
+     for k, v in numpy_dict.items():
+         numpy_dict[k] = paddle.to_tensor(v, place=device)
+     return numpy_dict
+
+
+ def _paddle2np(paddle_dict: Dict[str, paddle.Tensor]) -> Dict[str, np.array]:
+     for k, v in paddle_dict.items():
+         paddle_dict[k] = v.detach().cpu().numpy()
+     return paddle_dict
+
+
+ _SIZE = {
+     paddle.int64: 8,
+     paddle.float32: 4,
+     paddle.int32: 4,
+     paddle.bfloat16: 2,
+     paddle.float16: 2,
+     paddle.int16: 2,
+     paddle.uint8: 1,
+     paddle.int8: 1,
+     paddle.bool: 1,
+     paddle.float64: 8,
+     paddle.float8_e4m3fn: 1,
+     paddle.float8_e5m2: 1,
+     paddle.complex64: 8,
+     # XXX: These are not supported yet in paddle
+     # paddle.uint64: 8,
+     # paddle.uint32: 4,
+     # paddle.uint16: 2,
+     # paddle.float8_e8m0: 1,
+     # paddle.float4_e2m1_x2: 1,
+ }
+
+ _TYPES = {
+     "F64": paddle.float64,
+     "F32": paddle.float32,
+     "F16": paddle.float16,
+     "BF16": paddle.bfloat16,
+     "I64": paddle.int64,
+     "I32": paddle.int32,
+     "I16": paddle.int16,
+     "I8": paddle.int8,
+     "U8": paddle.uint8,
+     "BOOL": paddle.bool,
+     "F8_E4M3": paddle.float8_e4m3fn,
+     "F8_E5M2": paddle.float8_e5m2,
+ }
+
+ NPDTYPES = {
+     paddle.int64: np.int64,
+     paddle.float32: np.float32,
+     paddle.int32: np.int32,
+     # XXX: This is ok because both have the same width
+     paddle.bfloat16: np.float16,
+     paddle.float16: np.float16,
+     paddle.int16: np.int16,
+     paddle.uint8: np.uint8,
+     paddle.int8: np.int8,
+     paddle.bool: bool,
+     paddle.float64: np.float64,
+     # XXX: This is ok because both have the same width and byteswap is a no-op anyway
+     paddle.float8_e4m3fn: np.uint8,
+     paddle.float8_e5m2: np.uint8,
+ }
+
+
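The `XXX` comments in `NPDTYPES` rely on the fact that a byte swap depends only on the element width, not on how the bits are interpreted. That is why `bfloat16` can borrow `np.float16` and the 8-bit float types can borrow `np.uint8`. A minimal pure-Python sketch of the idea follows; `byteswap` here is a hypothetical helper written for illustration, not the NumPy method of the same name.

```python
def byteswap(buf: bytes, width: int) -> bytes:
    """Hypothetical helper: reverse the byte order of each `width`-byte element."""
    if width <= 0 or len(buf) % width != 0:
        raise ValueError("buffer length must be a positive multiple of the element width")
    # Each element is reversed independently; the element order is unchanged.
    return b"".join(buf[i : i + width][::-1] for i in range(0, len(buf), width))


two_byte = byteswap(b"\x01\x02\x03\x04", 2)  # any 16-bit dtype swaps the same way
one_byte = byteswap(b"\x01\x02\x03\x04", 1)  # 8-bit dtypes are unaffected
```

Swapping twice returns the original buffer, so the same routine serves both the save path and the load path on a big-endian host.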
+ def _getdtype(dtype_str: str) -> paddle.dtype:
+     return _TYPES[dtype_str]
+
+
+ def _view2paddle(safeview, device) -> Dict[str, paddle.Tensor]:
+     result = {}
+     for k, v in safeview:
+         dtype = _getdtype(v["dtype"])
+         if len(v["data"]) == 0:
+             # Workaround because frombuffer doesn't accept zero-size tensors
+             assert any(x == 0 for x in v["shape"])
+             arr = paddle.empty(v["shape"], dtype=dtype)
+         else:
+             arr = paddle.base.core.frombuffer(v["data"], dtype).reshape(v["shape"])
+             if device != "cpu":
+                 arr = arr.to(device)
+         if sys.byteorder == "big":
+             arr = paddle.to_tensor(arr.numpy().byteswap(inplace=False), place=device)
+         result[k] = arr
+
+     return result
+
+
+ def _tobytes(tensor: paddle.Tensor, name: str) -> bytes:
+     if not tensor.is_contiguous():
+         raise ValueError(
+             f"You are trying to save a non contiguous tensor: `{name}` which is not allowed. It either means you"
+             " are trying to save tensors which are reference of each other in which case it's recommended to save"
+             " only the full tensors, and reslice at load time, or simply call `.contiguous()` on your tensor to"
+             " pack it before saving."
+         )
+     if not tensor.place.is_cpu_place():
+         # Moving tensor to cpu before saving
+         tensor = tensor.cpu()
+
+     import ctypes
+
+     import numpy as np
+
+     # When shape is empty (scalar), np.prod returns a float
+     # we need a int for the following calculations
+     length = int(np.prod(tensor.shape).item())
+     bytes_per_item = _SIZE[tensor.dtype]
+
+     total_bytes = length * bytes_per_item
+
+     ptr = tensor.data_ptr()
+     if ptr == 0:
+         return b""
+     newptr = ctypes.cast(ptr, ctypes.POINTER(ctypes.c_ubyte))
+     data = np.ctypeslib.as_array(newptr, (total_bytes,))  # no internal copy
+     if sys.byteorder == "big":
+         npdtype = NPDTYPES[tensor.dtype]
+         # Not in place as that would potentially modify a live running model
+         data = data.view(npdtype).byteswap(inplace=False)
+     return data.tobytes()
+
+
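`_tobytes` sizes its buffer as the element count times the bytes per element, and casts the `np.prod` result to `int` to cover the scalar case. The same arithmetic can be sketched with the standard library, where `math.prod` of an empty shape is already the integer `1`. The names `byte_size` and `ITEM_SIZE` are hypothetical; the sizes mirror a few `_SIZE` entries, keyed by the safetensors dtype strings from `_TYPES` so the sketch does not need paddle.

```python
import math

# A subset of the widths in _SIZE, keyed by safetensors dtype string (assumption:
# keyed this way only so the example runs without paddle installed).
ITEM_SIZE = {"F64": 8, "F32": 4, "BF16": 2, "F16": 2, "I64": 8, "U8": 1, "BOOL": 1}


def byte_size(shape, dtype: str) -> int:
    """Hypothetical helper: bytes needed for a dense tensor of this shape and dtype."""
    # math.prod(()) == 1, so a scalar (empty shape) counts as one element.
    return math.prod(shape) * ITEM_SIZE[dtype]


embedding_bytes = byte_size((512, 1024), "F32")
scalar_bytes = byte_size((), "F16")
empty_bytes = byte_size((0, 4), "F32")
```

A zero-sized dimension gives zero bytes, which is the case `_view2paddle` handles separately with `paddle.empty`.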
+ def _flatten(tensors: Dict[str, paddle.Tensor]) -> Dict[str, Dict[str, Any]]:
+     if not isinstance(tensors, dict):
+         raise ValueError(
+             f"Expected a dict of [str, paddle.Tensor] but received {type(tensors)}"
+         )
+
+     for k, v in tensors.items():
+         if not isinstance(v, paddle.Tensor):
+             raise ValueError(
+                 f"Key `{k}` is invalid, expected paddle.Tensor but received {type(v)}"
+             )
+
+     return {
+         k: {
+             "dtype": str(v.dtype).split(".")[-1],
+             "shape": v.shape,
+             "data": _tobytes(v, k),
+         }
+         for k, v in tensors.items()
+     }
LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/safetensors/py.typed ADDED
File without changes
LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/transformers/models/code_llama/__init__.py ADDED
@@ -0,0 +1,26 @@
+ # Copyright 2024 The HuggingFace Team. All rights reserved.
+ #
+ # Licensed under the Apache License, Version 2.0 (the "License");
+ # you may not use this file except in compliance with the License.
+ # You may obtain a copy of the License at
+ #
+ #     http://www.apache.org/licenses/LICENSE-2.0
+ #
+ # Unless required by applicable law or agreed to in writing, software
+ # distributed under the License is distributed on an "AS IS" BASIS,
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ # See the License for the specific language governing permissions and
+ # limitations under the License.
+ from typing import TYPE_CHECKING
+
+ from ...utils import _LazyModule
+ from ...utils.import_utils import define_import_structure
+
+
+ if TYPE_CHECKING:
+     from .tokenization_code_llama import *
+ else:
+     import sys
+
+     _file = globals()["__file__"]
+     sys.modules[__name__] = _LazyModule(__name__, _file, define_import_structure(_file), module_spec=__spec__)
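The `__init__.py` above replaces itself in `sys.modules` with a `_LazyModule`, so the tokenizer submodule is presumably imported only when one of its names is first accessed. The general pattern can be sketched with a `types.ModuleType` subclass. This is an illustration under stated assumptions, not the transformers implementation: `LazyModule` and its `import_structure` mapping are hypothetical, and the demo resolves names from stdlib modules rather than from package submodules.

```python
import importlib
import types


class LazyModule(types.ModuleType):
    """Hypothetical sketch: import the providing module on first attribute access."""

    def __init__(self, name, import_structure):
        super().__init__(name)
        # Maps an exported attribute name to the module that defines it.
        self._import_structure = dict(import_structure)

    def __getattr__(self, attr):
        # Python calls this only after normal attribute lookup has failed,
        # so it runs at most once per name thanks to the caching below.
        try:
            module_name = self._import_structure[attr]
        except KeyError:
            raise AttributeError(
                f"module {self.__name__!r} has no attribute {attr!r}"
            ) from None
        value = getattr(importlib.import_module(module_name), attr)
        setattr(self, attr, value)  # cache the resolved object on the module
        return value


# Demo mapping (assumption): names resolved lazily from stdlib modules.
lazy = LazyModule("demo_lazy", {"dumps": "json", "sqrt": "math"})
```

Accessing `lazy.dumps` triggers the import of `json` and stores the function in the module's namespace, so later lookups no longer pass through `__getattr__`.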
LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/transformers/models/code_llama/tokenization_code_llama.py ADDED
@@ -0,0 +1,358 @@
+ # Copyright 2023 The HuggingFace Inc. team.
+ #
+ # Licensed under the Apache License, Version 2.0 (the "License");
+ # you may not use this file except in compliance with the License.
+ # You may obtain a copy of the License at
+ #
+ #     http://www.apache.org/licenses/LICENSE-2.0
+ #
+ # Unless required by applicable law or agreed to in writing, software
+ # distributed under the License is distributed on an "AS IS" BASIS,
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ # See the License for the specific language governing permissions and
+ # limitations under the License.
+
+
+ from tokenizers import Tokenizer, decoders, normalizers, pre_tokenizers, processors
+ from tokenizers.models import BPE
+
+ from ...tokenization_utils_tokenizers import TokenizersBackend
+ from ...utils import logging
+
+
+ logger = logging.get_logger(__name__)
+ VOCAB_FILES_NAMES = {"vocab_file": "tokenizer.model", "tokenizer_file": "tokenizer.json"}
+
+ SPIECE_UNDERLINE = "▁"
+
+ B_INST, E_INST = "[INST]", "[/INST]"
+ B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"
+
+ # fmt: off
+ DEFAULT_SYSTEM_PROMPT = """You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your \
+ answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure\
+ that your responses are socially unbiased and positive in nature.
+
+ If a question does not make any sense, or is not factually coherent, explain why instead of answering something not \
+ correct. If you don't know the answer to a question, please don't share false information."""
+ # fmt: on
+
+
41
+ class CodeLlamaTokenizer(TokenizersBackend):
42
+ """
43
+ Construct a Llama tokenizer. Based on byte-level Byte-Pair-Encoding.
44
+
45
+ Notably, this uses ByteFallback and no normalization.
46
+
47
+ ```python
48
+ >>> from transformers import CodeLlamaTokenizer
49
+
50
+ >>> tokenizer = CodeLlamaTokenizer.from_pretrained("hf-internal-testing/llama-tokenizer")
51
+ >>> tokenizer.encode("Hello this is a test")
52
+ [1, 15043, 445, 338, 263, 1243]
53
+ ```
54
+
55
+ If you want to change the `bos_token` or the `eos_token`, make sure to specify them when initializing the tokenizer, or
56
+ call `tokenizer.update_post_processor()` to make sure that the post-processing is correctly done (otherwise the
57
+ values of the first token and final token of an encoded sequence will not be correct). For more details, check out
58
+ the [post-processors](https://huggingface.co/docs/tokenizers/api/post-processors) documentation.
59
+
60
+
61
+ This tokenizer inherits from [`TokenizersBackend`] which contains most of the main methods. Users should
62
+ refer to this superclass for more information regarding those methods. The default configuration matches that of
63
+ [meta-llama/CodeLlama-7b-Instruct-hf](https://huggingface.co/meta-llama/CodeLlama-7b-Instruct-hf/blob/main/tokenizer_config.json)
64
+ which supports prompt infilling.
65
+
66
+ Args:
67
+ clean_up_tokenization_spaces (`bool`, *optional*, defaults to `False`):
68
+ Whether to clean up spaces after decoding. Cleanup consists of removing potential artifacts such as extra
69
+ spaces.
70
+ unk_token (`str`, *optional*, defaults to `"<unk>"`):
71
+ The unknown token. A token that is not in the vocabulary cannot be converted to an ID and is set to be this
72
+ token instead.
73
+ bos_token (`str`, *optional*, defaults to `"<s>"`):
74
+ The beginning of sequence token that was used during pretraining. Can be used as a sequence classifier token.
75
+ eos_token (`str`, *optional*, defaults to `"</s>"`):
76
+ The end of sequence token.
77
+ prefix_token (`str`, *optional*, defaults to `"▁<PRE>"`):
78
+ Prefix token used for infilling.
79
+ middle_token (`str`, *optional*, defaults to `"▁<MID>"`):
80
+ Middle token used for infilling.
81
+ suffix_token (`str`, *optional*, defaults to `"▁<SUF>"`):
82
+ Suffix token used for infilling.
83
+ eot_token (`str`, *optional*, defaults to `"▁<EOT>"`):
84
+ End of text token used for infilling.
85
+ fill_token (`str`, *optional*, defaults to `"<FILL_ME>"`):
86
+ The token used to split the input between the prefix and suffix.
87
+ additional_special_tokens (`list[str]`, *optional*):
88
+ Additional special tokens used by the tokenizer.
89
+ add_bos_token (`bool`, *optional*, defaults to `True`):
90
+ Whether to add a beginning of sequence token at the start of sequences.
91
+ add_eos_token (`bool`, *optional*, defaults to `False`):
92
+ Whether to add an end of sequence token at the end of sequences.
93
+ use_default_system_prompt (`bool`, *optional*, defaults to `False`):
94
+ Whether or not the default system prompt for Llama should be used.
95
+ add_prefix_space (`bool`, *optional*):
96
+ Whether or not to add an initial space to the input. This allows the leading word to be treated like any
97
+ other word.
98
+ vocab (`str`, `dict` or `list`, *optional*):
99
+ Custom vocabulary dictionary. If not provided, vocabulary is loaded from vocab_file.
100
+ merges (`str` or `list`, *optional*):
101
+ Custom merges list. If not provided, merges are loaded from merges_file.
102
+ vocab_file (`str`, *optional*):
103
+ [SentencePiece](https://github.com/google/sentencepiece) file (generally has a .model extension) that
104
+ contains the vocabulary necessary to instantiate a tokenizer.
105
+ """
106
+
107
+ vocab_files_names = VOCAB_FILES_NAMES
108
+ padding_side = "left"
109
+ model_input_names = ["input_ids", "attention_mask"]
110
+ model = BPE
111
+
112
+ def __init__(
113
+ self,
114
+ vocab: str | dict[str, int] | None = None,
115
+ merges: str | list[str] | None = None,
116
+ clean_up_tokenization_spaces=False,
117
+ unk_token="<unk>",
118
+ bos_token="<s>",
119
+ eos_token="</s>",
120
+ prefix_token="▁<PRE>",
121
+ middle_token="▁<MID>",
122
+ suffix_token="▁<SUF>",
123
+ eot_token="▁<EOT>",
124
+ fill_token="<FILL_ME>",
125
+ additional_special_tokens=None,
126
+ use_default_system_prompt: bool = False,
127
+ add_prefix_space: bool | None = True,
128
+ add_bos_token: bool = True,
129
+ **kwargs,
130
+ ):
131
+ self.add_prefix_space = add_prefix_space if add_prefix_space is not None else True
132
+ self.use_default_system_prompt = use_default_system_prompt
133
+ additional_special_tokens = additional_special_tokens or []
134
+ for token in [prefix_token, middle_token, suffix_token, eot_token, fill_token]:
135
+ additional_special_tokens += [token] if token is not None else []
136
+
137
+ self._vocab = (
138
+ vocab
139
+ if vocab is not None
140
+ else {
141
+ str(unk_token): 0,
142
+ str(bos_token): 1,
143
+ str(eos_token): 2,
144
+ }
145
+ )
146
+
147
+ self._merges = merges or []
148
+ self._tokenizer = Tokenizer(
149
+ BPE(
150
+ vocab=self._vocab,
151
+ merges=self._merges,
152
+ fuse_unk=True,
153
+ byte_fallback=True,
154
+ dropout=None,
155
+ unk_token=str(unk_token),
156
+ )
157
+ )
158
+ prepend_scheme = "first" if self.add_prefix_space else "never"
159
+ self._tokenizer.pre_tokenizer = pre_tokenizers.Metaspace(
160
+ replacement="▁", prepend_scheme=prepend_scheme, split=False
161
+ )
162
+
163
+ self._tokenizer.decoder = decoders.Sequence(
164
+ [decoders.Replace("▁", " "), decoders.ByteFallback(), decoders.Fuse(), decoders.Strip(content=" ", left=1)]
165
+ )
166
+
167
+ super().__init__(
168
+ clean_up_tokenization_spaces=clean_up_tokenization_spaces,
169
+ unk_token=unk_token,
170
+ bos_token=bos_token,
171
+ eos_token=eos_token,
172
+ use_default_system_prompt=use_default_system_prompt,
173
+ add_prefix_space=add_prefix_space,
174
+ prefix_token=prefix_token,
175
+ middle_token=middle_token,
176
+ suffix_token=suffix_token,
177
+ eot_token=eot_token,
178
+ fill_token=fill_token,
179
+ add_bos_token=add_bos_token,
180
+ additional_special_tokens=additional_special_tokens,
181
+ **kwargs,
182
+ )
183
+ self._prefix_token = prefix_token
184
+ self._middle_token = middle_token
185
+ self._suffix_token = suffix_token
186
+ self._eot_token = eot_token
187
+ self.fill_token = fill_token
188
+
189
+ @property
190
+ def prefix_token(self):
191
+ return self._prefix_token
192
+
193
+ @property
194
+ def prefix_id(self):
195
+ if self._prefix_token is None:
196
+ return None
197
+ return self.convert_tokens_to_ids(self.prefix_token)
198
+
199
+ @property
200
+ def middle_token(self):
201
+ return self._middle_token
202
+
203
+ @property
204
+ def middle_id(self):
205
+ if self._middle_token is None:
206
+ return None
207
+ return self.convert_tokens_to_ids(self.middle_token)
208
+
209
+ @property
210
+ def suffix_token(self):
211
+ return self._suffix_token
212
+
213
+ @property
214
+ def suffix_id(self):
215
+ if self._suffix_token is None:
216
+ return None
217
+ return self.convert_tokens_to_ids(self.suffix_token)
218
+
219
+ @property
220
+ def eot_id(self):
221
+ if self._eot_token is None:
222
+ return None
223
+ return self.convert_tokens_to_ids(self.eot_token)
224
+
225
+ @property
226
+ def eot_token(self):
227
+ return self._eot_token
228
+
229
+ def set_infilling_processor(self, reset, suffix_first=False, add_special_tokens=True):
230
+ """
231
+ Updates the normalizer to make sure the prompt format for `infilling` is respected. The infilling format is the
232
+ following: if suffix_first
233
+ " <PRE> <SUF>{suf} <MID> {pre}"
234
+ else:
235
+ " <PRE> {pre} <SUF>{suf} <MID>"
236
+
237
+ If `reset` is set to `True`, the `normalizer` and `post_processor` are reset to their "normal" behaviour, which
238
+ is to add a prefix space for the normalizer, and add a `bos_token` to the input text for the `post_processor`.
239
+ """
240
+ if reset:
241
+ self._tokenizer.normalizer = normalizers.Sequence(
242
+ [
243
+ normalizers.Prepend(prepend="▁"),
244
+ normalizers.Replace(pattern=" ", content="▁"),
245
+ ]
246
+ )
247
+ self.update_post_processor()
248
+ return
249
+
250
+ self._tokenizer.normalizer = normalizers.Replace(pattern=" ", content="▁")
251
+ pair = [self.bos_token] if self.add_bos_token and add_special_tokens else []
252
+ special_tokens = [(self.bos_token, self.bos_token_id)] if self.add_bos_token and add_special_tokens else []
253
+ if suffix_first:
254
+ # format as " <PRE> <SUF>{suf} <MID> {pre}"
255
+ pair += [self.prefix_token, self.suffix_token, "$B", self.middle_token, "$A"]
256
+ special_tokens += [
257
+ (self.prefix_token, self.prefix_id),
258
+ (self.suffix_token, self.suffix_id),
259
+ (self.middle_token, self.middle_id),
260
+ ]
261
+ else:
262
+ # format as " <PRE> {pre} <SUF>{suf} <MID>"
263
+ pair += [self.prefix_token, "$A", self.suffix_token, "$B", self.middle_token]
264
+ special_tokens += [
265
+ (self.prefix_token, self.prefix_id),
266
+ (self.suffix_token, self.suffix_id),
267
+ (self.middle_token, self.middle_id),
268
+ ]
269
+
270
+ if self.add_eos_token and add_special_tokens:
271
+ pair += [self.eos_token]
272
+ special_tokens += [(self.eos_token, self.eos_token_id)]
273
+ self._tokenizer.post_processor = processors.TemplateProcessing(
274
+ single="$A", pair=pair, special_tokens=special_tokens
275
+ )
276
+
277
+ def tokenize(self, text, suffix=None, suffix_first=False, **kwargs):
278
+ # Handle fill_token splitting
279
+ if self.fill_token is not None and self.fill_token in text and suffix is None:
280
+ text, suffix = text.split(self.fill_token)
281
+
282
+ # If no suffix, use standard tokenization
283
+ if suffix is None or len(suffix) < 1:
284
+ return super().tokenize(text, **kwargs)
285
+
286
+ # Check that infilling tokens are available
287
+ if None in (self.prefix_id, self.middle_id, self.suffix_id):
288
+ raise ValueError(
289
+ "The input either includes a `prefix` and a `suffix` used for the infilling task,"
290
+ f" or can be split on the {self.fill_token} token, creating a suffix and prefix,"
291
+ " but the model does not support `infilling`."
292
+ )
293
+
294
+ # Temporarily set infilling processor
295
+ self.set_infilling_processor(False, suffix_first=suffix_first, add_special_tokens=False)
296
+
297
+ # Remove text_pair and pair from kwargs if present to avoid conflict
298
+ kwargs.pop("text_pair", None)
299
+ kwargs.pop("pair", None)
300
+
301
+ # Tokenize with infilling format
302
+ # The processor will handle the special token arrangement
303
+ # Use pair=suffix (not text_pair) since base class tokenize expects 'pair' parameter
304
+ result = super().tokenize(" " + text, pair=suffix, **kwargs)
305
+
306
+ # Reset processor
307
+ self.set_infilling_processor(True)
308
+
309
+ return result
310
+
311
+ def _encode_plus(self, text, text_pair=None, suffix=None, suffix_first=False, add_special_tokens=True, **kwargs):
312
+ is_infilling = False
313
+
314
+ if suffix is not None:
315
+ text_pair = suffix
316
+ is_infilling = True
317
+ elif "suffix" in kwargs:
318
+ text_pair = kwargs.pop("suffix")
319
+ is_infilling = True
320
+
321
+ if isinstance(text, str) and self.fill_token is not None and self.fill_token in text and text_pair is None:
322
+ text, text_pair = text.split(self.fill_token)
323
+ is_infilling = True
324
+
325
+ if not is_infilling:
326
+ return super()._encode_plus(text, text_pair=text_pair, add_special_tokens=add_special_tokens, **kwargs)
327
+
328
+ if (
329
+ text_pair is None
330
+ or (isinstance(text_pair, str) and len(text_pair) < 1)
331
+ or (isinstance(text_pair, list) and len(text_pair) == 0)
332
+ ):
333
+ return super()._encode_plus(text, text_pair=text_pair, add_special_tokens=add_special_tokens, **kwargs)
334
+
335
+ if None in (self.prefix_id, self.middle_id, self.suffix_id):
336
+ raise ValueError(
337
+ "The input includes a `prefix` and a `suffix` used for the infilling task,"
338
+ " the `prefix_id, middle_id, suffix_id` must all be initialized. Current"
339
+ f" values : {self.prefix_id, self.middle_id, self.suffix_id}"
340
+ )
341
+
342
+ self.set_infilling_processor(False, suffix_first=suffix_first, add_special_tokens=add_special_tokens)
343
+ kwargs.pop("text_pair", None)
344
+
345
+ if isinstance(text, str):
346
+ text = " " + text
347
+ elif isinstance(text, list):
348
+ text = [" " + t if isinstance(t, str) else t for t in text]
349
+
350
+ result = super()._encode_plus(text, text_pair=text_pair, add_special_tokens=True, **kwargs)
351
+ self.set_infilling_processor(True)
352
+ return result
353
+
354
+
355
+ __all__ = ["CodeLlamaTokenizer", "CodeLlamaTokenizerFast"]
356
+
357
+ # Backward alias
358
+ CodeLlamaTokenizerFast = CodeLlamaTokenizer
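The two infilling layouts that `set_infilling_processor` builds with `TemplateProcessing` can be illustrated in plain Python. This is a sketch under simplifying assumptions: `infill_layout` and `split_on_fill` are hypothetical helpers, they operate on strings rather than token IDs, and they do not reproduce the normalizer or the BOS/EOS handling of the tokenizer.

```python
# Default infilling markers, taken from the __init__ signature above.
PRE, MID, SUF = "▁<PRE>", "▁<MID>", "▁<SUF>"


def infill_layout(prefix_tokens, suffix_tokens, suffix_first=False):
    """Hypothetical helper: order prefix/suffix pieces around the infilling markers."""
    if suffix_first:
        # Layout: <PRE> <SUF>{suf} <MID> {pre}
        return [PRE, SUF, *suffix_tokens, MID, *prefix_tokens]
    # Layout: <PRE> {pre} <SUF>{suf} <MID>
    return [PRE, *prefix_tokens, SUF, *suffix_tokens, MID]


def split_on_fill(text, fill_token="<FILL_ME>"):
    """Split a prompt on a single fill marker, mirroring `text.split(self.fill_token)`."""
    prefix, suffix = text.split(fill_token)
    return prefix, suffix


prefix, suffix = split_on_fill("def add(a, b):<FILL_ME>return result")
print(infill_layout([prefix], [suffix]))
print(infill_layout([prefix], [suffix], suffix_first=True))
```

Because the split result is unpacked into exactly two names, a prompt containing the fill marker more than once raises `ValueError` in this sketch; callers that need several gaps would have to split explicitly.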
LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/transformers/models/mbart/__init__.py ADDED
@@ -0,0 +1,28 @@
1
+ # Copyright 2024 The HuggingFace Team. All rights reserved.
2
+ #
3
+ # Licensed under the Apache License, Version 2.0 (the "License");
4
+ # you may not use this file except in compliance with the License.
5
+ # You may obtain a copy of the License at
6
+ #
7
+ # http://www.apache.org/licenses/LICENSE-2.0
8
+ #
9
+ # Unless required by applicable law or agreed to in writing, software
10
+ # distributed under the License is distributed on an "AS IS" BASIS,
11
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12
+ # See the License for the specific language governing permissions and
13
+ # limitations under the License.
14
+ from typing import TYPE_CHECKING
15
+
16
+ from ...utils import _LazyModule
17
+ from ...utils.import_utils import define_import_structure
18
+
19
+
20
+ if TYPE_CHECKING:
21
+ from .configuration_mbart import *
22
+ from .modeling_mbart import *
23
+ from .tokenization_mbart import *
24
+ else:
25
+ import sys
26
+
27
+ _file = globals()["__file__"]
28
+ sys.modules[__name__] = _LazyModule(__name__, _file, define_import_structure(_file), module_spec=__spec__)
LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/transformers/models/mbart/tokenization_mbart.py ADDED
@@ -0,0 +1,208 @@
1
+ # Copyright 2020 The Facebook AI Research Team Authors and The HuggingFace Inc. team.
2
+ #
3
+ # Licensed under the Apache License, Version 2.0 (the "License");
4
+ # you may not use this file except in compliance with the License.
5
+ # You may obtain a copy of the License at
6
+ #
7
+ # http://www.apache.org/licenses/LICENSE-2.0
8
+ #
9
+ # Unless required by applicable law or agreed to in writing, software
10
+ # distributed under the License is distributed on an "AS IS" BASIS,
11
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12
+ # See the License for the specific language governing permissions and
13
+ # limitations under the License.
14
+
15
+
16
+ from tokenizers import Tokenizer, decoders, pre_tokenizers, processors
17
+ from tokenizers.models import Unigram
18
+
19
+ from ...tokenization_python import AddedToken
20
+ from ...tokenization_utils_tokenizers import TokenizersBackend
21
+ from ...utils import logging
22
+
23
+
24
+ logger = logging.get_logger(__name__)
25
+
26
+
27
+ VOCAB_FILES_NAMES = {"vocab_file": "sentencepiece.bpe.model", "tokenizer_file": "tokenizer.json"}
28
+
29
+
30
+ FAIRSEQ_LANGUAGE_CODES = ["ar_AR", "cs_CZ", "de_DE", "en_XX", "es_XX", "et_EE", "fi_FI", "fr_XX", "gu_IN", "hi_IN", "it_IT", "ja_XX", "kk_KZ", "ko_KR", "lt_LT", "lv_LV", "my_MM", "ne_NP", "nl_XX", "ro_RO", "ru_RU", "si_LK", "tr_TR", "vi_VN", "zh_CN"] # fmt: skip
31
+
32
+
33
+ class MBartTokenizer(TokenizersBackend):
34
+ """
35
+ Construct an MBART tokenizer (backed by HuggingFace's *tokenizers* library). Based on
36
+ [Unigram](https://huggingface.co/docs/tokenizers/python/latest/components.html?highlight=unigram#models).
37
+
38
+ This tokenizer inherits from [`TokenizersBackend`] which contains most of the main methods. Users should
39
+ refer to this superclass for more information regarding those methods.
40
+
41
+ The tokenization method is `<tokens> <eos> <language code>` for source language documents, and `<language code>
42
+ <tokens> <eos>` for target language documents.
43
+
44
+ Examples:
45
+
46
+ ```python
47
+ >>> from transformers import MBartTokenizer
48
+
49
+ >>> tokenizer = MBartTokenizer.from_pretrained(
50
+ ... "facebook/mbart-large-en-ro", src_lang="en_XX", tgt_lang="ro_RO"
51
+ ... )
52
+ >>> example_english_phrase = " UN Chief Says There Is No Military Solution in Syria"
53
+ >>> expected_translation_romanian = "Şeful ONU declară că nu există o soluţie militară în Siria"
54
+ >>> inputs = tokenizer(example_english_phrase, text_target=expected_translation_romanian, return_tensors="pt")
55
+ ```"""
56
+
57
+ vocab_files_names = VOCAB_FILES_NAMES
58
+ model_input_names = ["input_ids", "attention_mask"]
59
+ model = Unigram
60
+
61
+ prefix_tokens: list[int] = []
62
+ suffix_tokens: list[int] = []
63
+
64
+ def __init__(
65
+ self,
66
+ vocab: str | dict | list | None = None,
67
+ bos_token="<s>",
68
+ eos_token="</s>",
69
+ sep_token="</s>",
70
+ cls_token="<s>",
71
+ unk_token="<unk>",
72
+ pad_token="<pad>",
73
+ mask_token="<mask>",
74
+ src_lang=None,
75
+ tgt_lang=None,
76
+ additional_special_tokens=None,
77
+ **kwargs,
78
+ ):
79
+ mask_token = AddedToken(mask_token, lstrip=True, rstrip=False) if isinstance(mask_token, str) else mask_token
80
+
81
+ _additional_special_tokens = FAIRSEQ_LANGUAGE_CODES.copy()
82
+ if additional_special_tokens is not None:
83
+ _additional_special_tokens.extend(
84
+ [t for t in additional_special_tokens if t not in _additional_special_tokens]
85
+ )
86
+
87
+ if vocab is None:
88
+ vocab = [
89
+ (str(bos_token), 0.0),
90
+ (str(pad_token), 0.0),
91
+ (str(eos_token), 0.0),
92
+ (str(unk_token), 0.0),
93
+ ]
94
+ vocab += [("▁", -2.0)]
95
+ for lang_code in FAIRSEQ_LANGUAGE_CODES:
96
+ vocab.append((lang_code, 0.0))
97
+ vocab.append((str(mask_token), 0.0))
98
+
99
+ self._vocab = vocab
100
+ self._tokenizer = Tokenizer(Unigram(self._vocab, unk_id=3, byte_fallback=False))
101
+
102
+ self._tokenizer.normalizer = None
103
+
104
+ self._tokenizer.pre_tokenizer = pre_tokenizers.Sequence(
105
+ [
106
+ pre_tokenizers.WhitespaceSplit(),
107
+ pre_tokenizers.Metaspace(replacement="▁", prepend_scheme="always", split=True),
108
+ ]
109
+ )
110
+
111
+ self._tokenizer.decoder = decoders.Metaspace(replacement="▁", prepend_scheme="always", split=True)
112
+
113
+ super().__init__(
114
+ bos_token=bos_token,
115
+ eos_token=eos_token,
116
+ sep_token=sep_token,
117
+ cls_token=cls_token,
118
+ unk_token=unk_token,
119
+ pad_token=pad_token,
120
+ mask_token=mask_token,
121
+ src_lang=src_lang,
122
+ tgt_lang=tgt_lang,
123
+ additional_special_tokens=_additional_special_tokens,
124
+ **kwargs,
125
+ )
126
+
127
+ self.lang_code_to_id = {
128
+ lang_code: self.convert_tokens_to_ids(lang_code) for lang_code in FAIRSEQ_LANGUAGE_CODES
129
+ }
130
+ self.fairseq_offset = 1
131
+
132
+ # Build fairseq token mappings for backward compatibility
133
+ self.fairseq_tokens_to_ids = {
134
+ "<s>": 0,
135
+ "<pad>": 1,
136
+ "</s>": 2,
137
+ "<unk>": 3,
138
+ }
139
+ self.fairseq_tokens_to_ids.update(self.lang_code_to_id)
140
+ self.fairseq_tokens_to_ids["<mask>"] = self.convert_tokens_to_ids(str(mask_token))
141
+ self.fairseq_ids_to_tokens = {v: k for k, v in self.fairseq_tokens_to_ids.items()}
142
+
143
+ self._src_lang = src_lang if src_lang is not None else "en_XX"
144
+ self.cur_lang_code = self.convert_tokens_to_ids(self._src_lang)
145
+ self.tgt_lang = tgt_lang
146
+ self.set_src_lang_special_tokens(self._src_lang)
147
+
148
+ @property
149
+ def src_lang(self) -> str:
150
+ return self._src_lang
151
+
152
+ @src_lang.setter
153
+ def src_lang(self, new_src_lang: str) -> None:
154
+ self._src_lang = new_src_lang
155
+ self.set_src_lang_special_tokens(self._src_lang)
156
+
157
+ def _build_translation_inputs(
158
+ self, raw_inputs, return_tensors: str, src_lang: str | None, tgt_lang: str | None, **extra_kwargs
159
+ ):
160
+ """Used by the translation pipeline to prepare inputs for the generate function."""
161
+ if src_lang is None or tgt_lang is None:
162
+ raise ValueError("Translation requires a `src_lang` and a `tgt_lang` for this model")
163
+ self.src_lang = src_lang
164
+ inputs = self(raw_inputs, add_special_tokens=True, return_tensors=return_tensors, **extra_kwargs)
165
+ tgt_lang_id = self.convert_tokens_to_ids(tgt_lang)
166
+ inputs["forced_bos_token_id"] = tgt_lang_id
167
+ return inputs
168
+
169
+ def _switch_to_input_mode(self):
170
+ return self.set_src_lang_special_tokens(self.src_lang)
171
+
172
+ def _switch_to_target_mode(self):
173
+ if self.tgt_lang is None:
174
+ self.tgt_lang = self._src_lang
175
+ return self.set_tgt_lang_special_tokens(self.tgt_lang)
176
+
177
+ def set_src_lang_special_tokens(self, src_lang) -> None:
178
+ """Reset the special tokens to the source lang setting. No prefix and suffix=[eos, src_lang_code]."""
179
+ self.cur_lang_code = self.convert_tokens_to_ids(src_lang)
180
+ self.prefix_tokens = []
181
+ self.suffix_tokens = [self.eos_token_id, self.cur_lang_code]
182
+
183
+ prefix_tokens_str = self.convert_ids_to_tokens(self.prefix_tokens)
184
+ suffix_tokens_str = self.convert_ids_to_tokens(self.suffix_tokens)
185
+
186
+ self._tokenizer.post_processor = processors.TemplateProcessing(
187
+ single=prefix_tokens_str + ["$A"] + suffix_tokens_str,
188
+ pair=prefix_tokens_str + ["$A", "$B"] + suffix_tokens_str,
189
+ special_tokens=list(zip(prefix_tokens_str + suffix_tokens_str, self.prefix_tokens + self.suffix_tokens)),
190
+ )
191
+
192
+ def set_tgt_lang_special_tokens(self, lang: str) -> None:
193
+ """Reset the special tokens to the target language setting. No prefix and suffix=[eos, tgt_lang_code]."""
194
+ self.cur_lang_code = self.convert_tokens_to_ids(lang)
195
+ self.prefix_tokens = []
196
+ self.suffix_tokens = [self.eos_token_id, self.cur_lang_code]
197
+
198
+ prefix_tokens_str = self.convert_ids_to_tokens(self.prefix_tokens)
199
+ suffix_tokens_str = self.convert_ids_to_tokens(self.suffix_tokens)
200
+
201
+ self._tokenizer.post_processor = processors.TemplateProcessing(
202
+ single=prefix_tokens_str + ["$A"] + suffix_tokens_str,
203
+ pair=prefix_tokens_str + ["$A", "$B"] + suffix_tokens_str,
204
+ special_tokens=list(zip(prefix_tokens_str + suffix_tokens_str, self.prefix_tokens + self.suffix_tokens)),
205
+ )
206
+
207
+
208
+ __all__ = ["MBartTokenizer"]
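Both `set_src_lang_special_tokens` and `set_tgt_lang_special_tokens` above install the same template: no prefix, and a suffix of `[eos, lang_code]`. The sketch below shows that arrangement on plain token strings; `with_lang_suffix` is a hypothetical helper, not part of the tokenizer, and it ignores token IDs entirely.

```python
EOS = "</s>"  # default eos_token from the __init__ signature above


def with_lang_suffix(tokens, lang_code, pair=None):
    """Hypothetical helper: append [eos, lang_code] after the sequence (and optional pair)."""
    body = list(tokens)
    if pair is not None:
        # Mirrors the pair template: $A $B followed by the suffix.
        body += list(pair)
    return body + [EOS, lang_code]


print(with_lang_suffix(["▁Hello", "▁world"], "en_XX"))
print(with_lang_suffix(["▁Salut"], "ro_RO", pair=["▁lume"]))
```

The language code therefore always lands last, after the end-of-sequence token, for both single sequences and pairs.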
LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/typer-0.25.1.dist-info/INSTALLER ADDED
@@ -0,0 +1 @@
1
+ uv
LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/typer-0.25.1.dist-info/METADATA ADDED
@@ -0,0 +1,408 @@
1
+ Metadata-Version: 2.4
2
+ Name: typer
3
+ Version: 0.25.1
4
+ Summary: Typer, build great CLIs. Easy to code. Based on Python type hints.
5
+ Author-Email: =?utf-8?q?Sebasti=C3=A1n_Ram=C3=ADrez?= <tiangolo@gmail.com>
6
+ License-Expression: MIT
7
+ License-File: LICENSE
8
+ Classifier: Intended Audience :: Information Technology
9
+ Classifier: Intended Audience :: System Administrators
10
+ Classifier: Operating System :: OS Independent
11
+ Classifier: Programming Language :: Python :: 3
12
+ Classifier: Programming Language :: Python
13
+ Classifier: Topic :: Software Development :: Libraries :: Application Frameworks
14
+ Classifier: Topic :: Software Development :: Libraries :: Python Modules
15
+ Classifier: Topic :: Software Development :: Libraries
16
+ Classifier: Topic :: Software Development
17
+ Classifier: Typing :: Typed
18
+ Classifier: Development Status :: 4 - Beta
19
+ Classifier: Intended Audience :: Developers
20
+ Classifier: Programming Language :: Python :: 3 :: Only
21
+ Classifier: Programming Language :: Python :: 3.10
22
+ Classifier: Programming Language :: Python :: 3.11
23
+ Classifier: Programming Language :: Python :: 3.12
24
+ Classifier: Programming Language :: Python :: 3.13
25
+ Classifier: Programming Language :: Python :: 3.14
26
+ Project-URL: Homepage, https://github.com/fastapi/typer
27
+ Project-URL: Documentation, https://typer.tiangolo.com
28
+ Project-URL: Repository, https://github.com/fastapi/typer
29
+ Project-URL: Issues, https://github.com/fastapi/typer/issues
30
+ Project-URL: Changelog, https://typer.tiangolo.com/release-notes/
31
+ Requires-Python: >=3.10
32
+ Requires-Dist: click>=8.2.1
33
+ Requires-Dist: shellingham>=1.3.0
34
+ Requires-Dist: rich>=13.8.0
35
+ Requires-Dist: annotated-doc>=0.0.2
36
+ Description-Content-Type: text/markdown
37
+
38
+ <p align="center">
39
+ <a href="https://typer.tiangolo.com"><img src="https://typer.tiangolo.com/img/logo-margin/logo-margin-vector.svg#only-light" alt="Typer"></a>
40
+
41
+ </p>
42
+ <p align="center">
43
+ <em>Typer, build great CLIs. Easy to code. Based on Python type hints.</em>
44
+ </p>
45
+ <p align="center">
46
+ <a href="https://github.com/fastapi/typer/actions?query=workflow%3ATest+event%3Apush+branch%3Amaster" target="_blank">
47
+ <img src="https://github.com/fastapi/typer/actions/workflows/test.yml/badge.svg?event=push&branch=master" alt="Test">
48
+ </a>
49
+ <a href="https://github.com/fastapi/typer/actions?query=workflow%3APublish" target="_blank">
50
+ <img src="https://github.com/fastapi/typer/workflows/Publish/badge.svg" alt="Publish">
51
+ </a>
52
+ <a href="https://coverage-badge.samuelcolvin.workers.dev/redirect/fastapi/typer" target="_blank">
53
+ <img src="https://coverage-badge.samuelcolvin.workers.dev/fastapi/typer.svg" alt="Coverage">
54
+ <a href="https://pypi.org/project/typer" target="_blank">
55
+ <img src="https://img.shields.io/pypi/v/typer?color=%2334D058&label=pypi%20package" alt="Package version">
56
+ </a>
57
+ </p>
58
+
59
+ ---
60
+
61
+ **Documentation**: <a href="https://typer.tiangolo.com" target="_blank">https://typer.tiangolo.com</a>
62
+
63
+ **Source Code**: <a href="https://github.com/fastapi/typer" target="_blank">https://github.com/fastapi/typer</a>
64
+
65
+ ---
66
+
67
+ Typer is a library for building <abbr title="command line interface, programs executed from a terminal">CLI</abbr> applications that users will **love using** and developers will **love creating**. Based on Python type hints.
68
+
69
+ It's also a command line tool to run scripts, automatically converting them to CLI applications.
70
+
71
+ The key features are:
72
+
73
+ * **Intuitive to write**: Great editor support. <abbr title="also known as auto-complete, autocompletion, IntelliSense">Completion</abbr> everywhere. Less time debugging. Designed to be easy to use and learn. Less time reading docs.
74
+ * **Easy to use**: It's easy to use for the final users. Automatic help, and automatic completion for all shells.
75
+ * **Short**: Minimize code duplication. Multiple features from each parameter declaration. Fewer bugs.
76
+ * **Start simple**: The simplest example adds only 2 lines of code to your app: **1 import, 1 function call**.
77
+ * **Grow large**: Grow in complexity as much as you want, create arbitrarily complex trees of commands and groups of subcommands, with options and arguments.
78
+ * **Run scripts**: Typer includes a `typer` command/program that you can use to run scripts, automatically converting them to CLIs, even if they don't use Typer internally.
79
+
80
+ ## FastAPI of CLIs
81
+
82
+ **Typer** is <a href="https://fastapi.tiangolo.com" class="external-link" target="_blank">FastAPI</a>'s little sibling; it's the FastAPI of CLIs.
83
+
84
+ ## Installation
85
+
86
+ Create and activate a <a href="https://typer.tiangolo.com/virtual-environments/" class="external-link" target="_blank">virtual environment</a> and then install **Typer**:
87
+
88
+ <div class="termy">
89
+
90
+ ```console
91
+ $ pip install typer
92
+ ---> 100%
93
+ Successfully installed typer rich shellingham
94
+ ```
95
+
96
+ </div>
97
+
98
+ ## Example
99
+
100
+ ### The absolute minimum
101
+
102
+ * Create a file `main.py` with:
103
+
104
+ ```Python
105
+ def main(name: str):
106
+ print(f"Hello {name}")
107
+ ```
108
+
109
+ This script doesn't even use Typer internally. But you can use the `typer` command to run it as a CLI application.
110
+
111
+ ### Run it
112
+
113
+ Run your application with the `typer` command:
114
+
115
+ <div class="termy">
116
+
117
+ ```console
118
+ // Run your application
119
+ $ typer main.py run
120
+
121
+ // You get a nice error, you are missing NAME
122
+ Usage: typer [PATH_OR_MODULE] run [OPTIONS] NAME
123
+ Try 'typer [PATH_OR_MODULE] run --help' for help.
124
+ ╭─ Error ───────────────────────────────────────────╮
125
+ │ Missing argument 'NAME'. │
126
+ ╰───────────────────────────────────────────────────╯
127
+
128
+
129
+ // You get a --help for free
130
+ $ typer main.py run --help
131
+
132
+ Usage: typer [PATH_OR_MODULE] run [OPTIONS] NAME
133
+
134
+ Run the provided Typer app.
135
+
136
+ ╭─ Arguments ───────────────────────────────────────╮
137
+ │ * name TEXT [default: None] [required] │
138
+ ╰───────────────────────────────────────────────────╯
139
+ ╭─ Options ─────────────────────────────────────────╮
140
+ │ --help Show this message and exit. │
141
+ ╰───────────────────────────────────────────────────╯
142
+
143
+ // Now pass the NAME argument
144
+ $ typer main.py run Camila
145
+
146
+ Hello Camila
147
+
148
+ // It works! 🎉
149
+ ```
150
+
151
+ </div>
152
+
153
+ This is the simplest use case, not even using Typer internally, but it can already be quite useful for simple scripts.
154
+
155
+ **Note**: auto-completion works when you create a Python package and run it with `--install-completion` or when you use the `typer` command.
156
+
157
+ ## Use Typer in your code
158
+
159
+ Now let's start using Typer in your own code. Update `main.py` with:
160
+
161
+ ```Python
162
+ import typer
163
+
164
+
165
+ def main(name: str):
166
+     print(f"Hello {name}")
167
+
168
+
169
+ if __name__ == "__main__":
170
+     typer.run(main)
171
+ ```
172
+
173
+ Now you could run it with Python directly:
174
+
175
+ <div class="termy">
176
+
177
+ ```console
178
+ // Run your application
179
+ $ python main.py
180
+
181
+ // You get a nice error, you are missing NAME
182
+ Usage: main.py [OPTIONS] NAME
183
+ Try 'main.py --help' for help.
184
+ ╭─ Error ───────────────────────────────────────────╮
185
+ │ Missing argument 'NAME'. │
186
+ ╰───────────────────────────────────────────────────╯
187
+
188
+
189
+ // You get a --help for free
190
+ $ python main.py --help
191
+
192
+ Usage: main.py [OPTIONS] NAME
193
+
194
+ ╭─ Arguments ───────────────────────────────────────╮
195
+ │ * name TEXT [default: None] [required] │
196
+ ╰───────────────────────────────────────────────────╯
197
+ ╭─ Options ─────────────────────────────────────────╮
198
+ │ --help Show this message and exit. │
199
+ ╰───────────────────────────────────────────────────╯
200
+
201
+ // Now pass the NAME argument
202
+ $ python main.py Camila
203
+
204
+ Hello Camila
205
+
206
+ // It works! 🎉
207
+ ```
208
+
209
+ </div>
210
+
211
+ **Note**: you can also call this same script with the `typer` command, but you don't need to.
212
+
213
+ ## Example upgrade
214
+
215
+ This was the simplest example possible.
216
+
217
+ Now let's see one that's a bit more complex.
218
+
219
+ ### An example with two subcommands
220
+
221
+ Modify the file `main.py`.
222
+
223
+ Create a `typer.Typer()` app, and create two subcommands with their parameters.
224
+
225
+ ```Python hl_lines="3 6 11 20"
226
+ import typer
227
+
228
+ app = typer.Typer()
229
+
230
+
231
+ @app.command()
232
+ def hello(name: str):
233
+     print(f"Hello {name}")
234
+
235
+
236
+ @app.command()
237
+ def goodbye(name: str, formal: bool = False):
238
+     if formal:
239
+         print(f"Goodbye Ms. {name}. Have a good day.")
240
+     else:
241
+         print(f"Bye {name}!")
242
+
243
+
244
+ if __name__ == "__main__":
245
+     app()
246
+ ```
247
+
248
+ And that will:
249
+
250
+ * Explicitly create a `typer.Typer` app.
251
+ * The previous `typer.run` actually creates one implicitly for you.
252
+ * Add two subcommands with `@app.command()`.
253
+ * Execute the `app()` itself, as if it was a function (instead of `typer.run`).
254
+
255
+ ### Run the upgraded example
256
+
257
+ Check the new help:
258
+
259
+ <div class="termy">
260
+
261
+ ```console
262
+ $ python main.py --help
263
+
264
+ Usage: main.py [OPTIONS] COMMAND [ARGS]...
265
+
266
+ ╭─ Options ─────────────────────────────────────────╮
267
+ │ --install-completion Install completion │
268
+ │ for the current │
269
+ │ shell. │
270
+ │ --show-completion Show completion for │
271
+ │ the current shell, │
272
+ │ to copy it or │
273
+ │ customize the │
274
+ │ installation. │
275
+ │ --help Show this message │
276
+ │ and exit. │
277
+ ╰───────────────────────────────────────────────────╯
278
+ ╭─ Commands ────────────────────────────────────────╮
279
+ │ goodbye │
280
+ │ hello │
281
+ ╰───────────────────────────────────────────────────╯
282
+
283
+ // When you create a package you get ✨ auto-completion ✨ for free, installed with --install-completion
284
+
285
+ // You have 2 subcommands (the 2 functions): goodbye and hello
286
+ ```
287
+
288
+ </div>
289
+
290
+ Now check the help for the `hello` command:
291
+
292
+ <div class="termy">
293
+
294
+ ```console
295
+ $ python main.py hello --help
296
+
297
+ Usage: main.py hello [OPTIONS] NAME
298
+
299
+ ╭─ Arguments ───────────────────────────────────────╮
300
+ │ * name TEXT [default: None] [required] │
301
+ ╰───────────────────────────────────────────────────╯
302
+ ╭─ Options ─────────────────────────────────────────╮
303
+ │ --help Show this message and exit. │
304
+ ╰───────────────────────────────────────────────────╯
305
+ ```
306
+
307
+ </div>
308
+
309
+ And now check the help for the `goodbye` command:
310
+
311
+ <div class="termy">
312
+
313
+ ```console
314
+ $ python main.py goodbye --help
315
+
316
+ Usage: main.py goodbye [OPTIONS] NAME
317
+
318
+ ╭─ Arguments ───────────────────────────────────────╮
319
+ │ * name TEXT [default: None] [required] │
320
+ ╰───────────────────────────────────────────────────╯
321
+ ╭─ Options ─────────────────────────────────────────╮
322
+ │ --formal --no-formal [default: no-formal] │
323
+ │ --help Show this message │
324
+ │ and exit. │
325
+ ╰───────────────────────────────────────────────────╯
326
+
327
+ // Automatic --formal and --no-formal for the bool option 🎉
328
+ ```
329
+
330
+ </div>
331
+
332
+ Now you can try out the new command line application:
333
+
334
+ <div class="termy">
335
+
336
+ ```console
337
+ // Use it with the hello command
338
+
339
+ $ python main.py hello Camila
340
+
341
+ Hello Camila
342
+
343
+ // And with the goodbye command
344
+
345
+ $ python main.py goodbye Camila
346
+
347
+ Bye Camila!
348
+
349
+ // And with --formal
350
+
351
+ $ python main.py goodbye --formal Camila
352
+
353
+ Goodbye Ms. Camila. Have a good day.
354
+ ```
355
+
356
+ </div>
357
+
358
+ **Note**: If your app only has one command, by default the command name is **omitted** in usage: `python main.py Camila`. However, when there are multiple commands, you must **explicitly include the command name**: `python main.py hello Camila`. See [One or Multiple Commands](https://typer.tiangolo.com/tutorial/commands/one-or-multiple/) for more details.
359
+
360
+ ### Recap
361
+
362
+ In summary, you declare **once** the types of parameters (*CLI arguments* and *CLI options*) as function parameters.
363
+
364
+ You do that with standard modern Python types.
365
+
366
+ You don't have to learn a new syntax, the methods or classes of a specific library, etc.
367
+
368
+ Just standard **Python**.
369
+
370
+ For example, for an `int`:
371
+
372
+ ```Python
373
+ total: int
374
+ ```
375
+
376
+ or for a `bool` flag:
377
+
378
+ ```Python
379
+ force: bool
380
+ ```
381
+
382
+ And similarly for **files**, **paths**, **enums** (choices), etc. And there are tools to create **groups of subcommands**, add metadata, extra **validation**, etc.
383
+
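As a rough sketch of what the recap describes, the file below declares an `int`, a path, an enum of choices, and a `bool` flag using only standard Python types. The names `Color`, `describe`, `total`, `config`, `color`, and `force` are made up for this illustration and do not come from Typer. Like the first example, the file does not import Typer, so it is assumed to be run through the `typer` command (for instance `typer main.py run 3 settings.toml --color green --force`).

```python
from enum import Enum
from pathlib import Path


class Color(str, Enum):
    # Hypothetical set of choices; an Enum is how a fixed list of values is declared.
    red = "red"
    green = "green"


def describe(total: int, config: Path, color: Color, force: bool) -> str:
    # Plain helper (illustrative only) that builds the message, so it is easy to check.
    action = "Forcing" if force else "Checking"
    return f"{action} {total} item(s) in {config.name} with color {color.value}"


def main(total: int, config: Path, color: Color = Color.red, force: bool = False):
    # Parameters without a default are expected to become CLI arguments,
    # parameters with a default are expected to become CLI options.
    print(describe(total, config, color, force))
```

The types are declared once, in the function signature, and the same function can still be called directly from other Python code.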
384
+ **You get**: great editor support, including **completion** and **type checks** everywhere.
385
+
386
+ **Your users get**: automatic **`--help`**, **auto-completion** in their terminal (Bash, Zsh, Fish, PowerShell) when they install your package or when using the `typer` command.
387
+
388
+ For a more complete example including more features, see the <a href="https://typer.tiangolo.com/tutorial/">Tutorial - User Guide</a>.
389
+
390
+ ## Dependencies
391
+
392
+ **Typer** stands on the shoulders of giants. It has three required dependencies:
393
+
394
+ * <a href="https://click.palletsprojects.com/" class="external-link" target="_blank">Click</a>: a popular tool for building CLIs in Python. Typer is based on it.
395
+ * <a href="https://rich.readthedocs.io/en/stable/index.html" class="external-link" target="_blank"><code>rich</code></a>: to show nicely formatted errors automatically.
396
+ * <a href="https://github.com/sarugaku/shellingham" class="external-link" target="_blank"><code>shellingham</code></a>: to automatically detect the current shell when installing completion.
397
+
398
+ ### `typer-slim`
399
+
400
+ There used to be a slimmed-down version of Typer called `typer-slim`, which didn't include the dependencies `rich` and `shellingham`, nor the `typer` command.
401
+
402
+ However, since version 0.22.0, we have stopped supporting this, and `typer-slim` now simply installs (all of) Typer.
403
+
404
+ If you want to disable Rich globally, you can set the environment variable `TYPER_USE_RICH` to `False` or `0`.
405
+
406
+ ## License
407
+
408
+ This project is licensed under the terms of the MIT license.
LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/typer-0.25.1.dist-info/RECORD ADDED
@@ -0,0 +1,26 @@
1
+ ../../../bin/typer,sha256=qXV4e3jpdEWwXNLcaK9G1c18LP7_vMiI-T6i3Gp7Dio,390
2
+ typer-0.25.1.dist-info/INSTALLER,sha256=5hhM4Q4mYTT9z6QB6PGpUAW81PGNFrYrdXMj4oM_6ak,2
3
+ typer-0.25.1.dist-info/METADATA,sha256=MEVT4brybxa876W1Ldfa2GnnULWWff13S4lmdI9dNv8,15852
4
+ typer-0.25.1.dist-info/RECORD,,
5
+ typer-0.25.1.dist-info/REQUESTED,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
6
+ typer-0.25.1.dist-info/WHEEL,sha256=Z36eTX6lG3PITRleSd5hAZHCcz52yg3c0JQVxKBbLW0,90
7
+ typer-0.25.1.dist-info/entry_points.txt,sha256=YO13ByiqWeuas9V0JADLUARZFUe_cwU_7wmTNvxBYQ8,57
8
+ typer-0.25.1.dist-info/licenses/LICENSE,sha256=WJks68-N-25AxOIRLtEhJsJDZm3KORKj14t-ysSFnUk,1086
9
+ typer/.agents/skills/typer/SKILL.md,sha256=50o03_KXs_ZhTefLbKpNzEsqLe8cPXXQY7CBe4GjzVM,6172
10
+ typer/__init__.py,sha256=Dlk8QTcly2VA0UDh792Rk_XG8N9HZi9yucy_RQRJcCY,1596
11
+ typer/__main__.py,sha256=bYt9eEaoRQWdejEHFD8REx9jxVEdZptECFsV7F49Ink,30
12
+ typer/_completion_classes.py,sha256=R9v4D8pJ_-n8fLOuyxrRSu7sP5lpXIy5fsLUW8zwsDU,7039
13
+ typer/_completion_shared.py,sha256=-uhCUIMc2S1ywdB-fBSSccH70mIBEsVTxHomcmy-klE,9129
14
+ typer/_types.py,sha256=0lcBDLcsxqr1sxTsqObj_u0Dfa37lWJYUY4PNkX4QlA,974
15
+ typer/_typing.py,sha256=QOw5o-B2L--C3ly2DQH6aUwag6x5brV5FhVaBZ5gzMg,1727
16
+ typer/cli.py,sha256=eg4wafz-7dB-PZqQcRunDbSucc3gSGRYsc834S69d80,10211
17
+ typer/colors.py,sha256=e42j8uB520hLpX5C_0fiR3OOoIFMbhO3ADZvv6hlAV8,430
18
+ typer/completion.py,sha256=FRTR9hP_IPdJp-4GXPOq0btXo5SvgAtLVfS3ZkAMpgQ,4793
19
+ typer/core.py,sha256=N74AEwGx0PbYN6vzimWxqJbd-O5ugJ5_FXjBhRngXQo,27809
20
+ typer/main.py,sha256=XUBRkapHMd5dQNIZhQBHO0AE-RPTV3m5Xs8AW20CovY,69010
21
+ typer/models.py,sha256=OwPG3MAXiUD5ih3p8eNVciXUsL07UIJfNWy3JiNpDfg,19843
22
+ typer/params.py,sha256=AovViRtl-VvUIXnmKKpnxoWK9_gHUbyQgXxxv3h_7lI,59713
23
+ typer/py.typed,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
24
+ typer/rich_utils.py,sha256=lqxrRzQLtnwVdNiDaNcAGPzvSxRYpWAtPHwPEBXLczU,25473
25
+ typer/testing.py,sha256=-ovLNjUNNEFCJoau-41iTJIobsjPbqyTrRq7-8ac4z4,871
26
+ typer/utils.py,sha256=wnJ1DWXBFMnxLHaMN_HDYntxLRby0K-rux63aokHInI,7599
LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/typer-0.25.1.dist-info/REQUESTED ADDED
File without changes
LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/typer-0.25.1.dist-info/WHEEL ADDED
@@ -0,0 +1,4 @@
1
+ Wheel-Version: 1.0
2
+ Generator: pdm-backend (2.4.8)
3
+ Root-Is-Purelib: true
4
+ Tag: py3-none-any
LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/typer-0.25.1.dist-info/entry_points.txt ADDED
@@ -0,0 +1,5 @@
1
+ [console_scripts]
2
+ typer = typer.cli:main
3
+
4
+ [gui_scripts]
5
+
LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/typing_extensions-4.15.0.dist-info/METADATA ADDED
@@ -0,0 +1,72 @@
1
+ Metadata-Version: 2.4
2
+ Name: typing_extensions
3
+ Version: 4.15.0
4
+ Summary: Backported and Experimental Type Hints for Python 3.9+
5
+ Keywords: annotations,backport,checker,checking,function,hinting,hints,type,typechecking,typehinting,typehints,typing
6
+ Author-email: "Guido van Rossum, Jukka Lehtosalo, Łukasz Langa, Michael Lee" <levkivskyi@gmail.com>
7
+ Requires-Python: >=3.9
8
+ Description-Content-Type: text/markdown
9
+ License-Expression: PSF-2.0
10
+ Classifier: Development Status :: 5 - Production/Stable
11
+ Classifier: Environment :: Console
12
+ Classifier: Intended Audience :: Developers
13
+ Classifier: Operating System :: OS Independent
14
+ Classifier: Programming Language :: Python :: 3
15
+ Classifier: Programming Language :: Python :: 3 :: Only
16
+ Classifier: Programming Language :: Python :: 3.9
17
+ Classifier: Programming Language :: Python :: 3.10
18
+ Classifier: Programming Language :: Python :: 3.11
19
+ Classifier: Programming Language :: Python :: 3.12
20
+ Classifier: Programming Language :: Python :: 3.13
21
+ Classifier: Programming Language :: Python :: 3.14
22
+ Classifier: Topic :: Software Development
23
+ License-File: LICENSE
24
+ Project-URL: Bug Tracker, https://github.com/python/typing_extensions/issues
25
+ Project-URL: Changes, https://github.com/python/typing_extensions/blob/main/CHANGELOG.md
26
+ Project-URL: Documentation, https://typing-extensions.readthedocs.io/
27
+ Project-URL: Home, https://github.com/python/typing_extensions
28
+ Project-URL: Q & A, https://github.com/python/typing/discussions
29
+ Project-URL: Repository, https://github.com/python/typing_extensions
30
+
31
+ # Typing Extensions
32
+
33
+ [![Chat at https://gitter.im/python/typing](https://badges.gitter.im/python/typing.svg)](https://gitter.im/python/typing)
34
+
35
+ [Documentation](https://typing-extensions.readthedocs.io/en/latest/#) –
36
+ [PyPI](https://pypi.org/project/typing-extensions/)
37
+
38
+ ## Overview
39
+
40
+ The `typing_extensions` module serves two related purposes:
41
+
42
+ - Enable use of new type system features on older Python versions. For example,
43
+ `typing.TypeGuard` is new in Python 3.10, but `typing_extensions` allows
44
+ users on previous Python versions to use it too.
45
+ - Enable experimentation with new type system PEPs before they are accepted and
46
+ added to the `typing` module.
47
+
48
+ `typing_extensions` is treated specially by static type checkers such as
49
+ mypy and pyright. Objects defined in `typing_extensions` are treated the same
50
+ way as equivalent forms in `typing`.
51
+
52
+ `typing_extensions` uses
53
+ [Semantic Versioning](https://semver.org/). The
54
+ major version will be incremented only for backwards-incompatible changes.
55
+ Therefore, it's safe to depend
56
+ on `typing_extensions` like this: `typing_extensions ~=x.y`,
57
+ where `x.y` is the first version that includes all features you need.
58
+ [This](https://packaging.python.org/en/latest/specifications/version-specifiers/#compatible-release)
59
+ is equivalent to `typing_extensions >=x.y, <(x+1)`. Do not depend on `~= x.y.z`
60
+ unless you really know what you're doing; that defeats the purpose of
61
+ semantic versioning.
62
+
63
+ ## Included items
64
+
65
+ See [the documentation](https://typing-extensions.readthedocs.io/en/latest/#) for a
66
+ complete listing of module contents.
67
+
68
+ ## Contributing
69
+
70
+ See [CONTRIBUTING.md](https://github.com/python/typing_extensions/blob/main/CONTRIBUTING.md)
71
+ for how to contribute to `typing_extensions`.
72
+
LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/typing_extensions-4.15.0.dist-info/RECORD ADDED
@@ -0,0 +1,7 @@
1
+ typing_extensions-4.15.0.dist-info/INSTALLER,sha256=5hhM4Q4mYTT9z6QB6PGpUAW81PGNFrYrdXMj4oM_6ak,2
2
+ typing_extensions-4.15.0.dist-info/METADATA,sha256=wTg3j-jxiTSsmd4GBTXFPsbBOu7WXpTDJkHafuMZKnI,3259
3
+ typing_extensions-4.15.0.dist-info/RECORD,,
4
+ typing_extensions-4.15.0.dist-info/REQUESTED,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
5
+ typing_extensions-4.15.0.dist-info/WHEEL,sha256=G2gURzTEtmeR8nrdXUJfNiB3VYVxigPQ-bEQujpNiNs,82
6
+ typing_extensions-4.15.0.dist-info/licenses/LICENSE,sha256=Oy-B_iHRgcSZxZolbI4ZaEVdZonSaaqFNzv7avQdo78,13936
7
+ typing_extensions.py,sha256=Qz0R0XDTok0usGXrwb_oSM6n49fOaFZ6tSvqLUwvftg,160429
LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/typing_extensions-4.15.0.dist-info/REQUESTED ADDED
File without changes
LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/typing_extensions-4.15.0.dist-info/WHEEL ADDED
@@ -0,0 +1,4 @@
1
+ Wheel-Version: 1.0
2
+ Generator: flit 3.12.0
3
+ Root-Is-Purelib: true
4
+ Tag: py3-none-any
LTA_openwebtext_dualt/mini_owt_logdirichlet/.venv_qwen35_uv/lib/python3.12/site-packages/typing_extensions-4.15.0.dist-info/licenses/LICENSE ADDED
@@ -0,0 +1,279 @@
1
+ A. HISTORY OF THE SOFTWARE
2
+ ==========================
3
+
4
+ Python was created in the early 1990s by Guido van Rossum at Stichting
5
+ Mathematisch Centrum (CWI, see https://www.cwi.nl) in the Netherlands
6
+ as a successor of a language called ABC. Guido remains Python's
7
+ principal author, although it includes many contributions from others.
8
+
9
+ In 1995, Guido continued his work on Python at the Corporation for
10
+ National Research Initiatives (CNRI, see https://www.cnri.reston.va.us)
11
+ in Reston, Virginia where he released several versions of the
12
+ software.
13
+
14
+ In May 2000, Guido and the Python core development team moved to
15
+ BeOpen.com to form the BeOpen PythonLabs team. In October of the same
16
+ year, the PythonLabs team moved to Digital Creations, which became
17
+ Zope Corporation. In 2001, the Python Software Foundation (PSF, see
18
+ https://www.python.org/psf/) was formed, a non-profit organization
19
+ created specifically to own Python-related Intellectual Property.
20
+ Zope Corporation was a sponsoring member of the PSF.
21
+
22
+ All Python releases are Open Source (see https://opensource.org for
23
+ the Open Source Definition). Historically, most, but not all, Python
24
+ releases have also been GPL-compatible; the table below summarizes
25
+ the various releases.
26
+
27
+     Release         Derived     Year        Owner       GPL-
28
+                     from                                compatible? (1)
29
+
30
+     0.9.0 thru 1.2              1991-1995   CWI         yes
31
+     1.3 thru 1.5.2  1.2         1995-1999   CNRI        yes
32
+     1.6             1.5.2       2000        CNRI        no
33
+     2.0             1.6         2000        BeOpen.com  no
34
+     1.6.1           1.6         2001        CNRI        yes (2)
35
+     2.1             2.0+1.6.1   2001        PSF         no
36
+     2.0.1           2.0+1.6.1   2001        PSF         yes
37
+     2.1.1           2.1+2.0.1   2001        PSF         yes
38
+     2.1.2           2.1.1       2002        PSF         yes
39
+     2.1.3           2.1.2       2002        PSF         yes
40
+     2.2 and above   2.1.1       2001-now    PSF         yes
41
+
42
+ Footnotes:
43
+
44
+ (1) GPL-compatible doesn't mean that we're distributing Python under
45
+ the GPL. All Python licenses, unlike the GPL, let you distribute
46
+ a modified version without making your changes open source. The
47
+ GPL-compatible licenses make it possible to combine Python with
48
+ other software that is released under the GPL; the others don't.
49
+
50
+ (2) According to Richard Stallman, 1.6.1 is not GPL-compatible,
51
+ because its license has a choice of law clause. According to
52
+ CNRI, however, Stallman's lawyer has told CNRI's lawyer that 1.6.1
53
+ is "not incompatible" with the GPL.
54
+
55
+ Thanks to the many outside volunteers who have worked under Guido's
56
+ direction to make these releases possible.
57
+
58
+
59
+ B. TERMS AND CONDITIONS FOR ACCESSING OR OTHERWISE USING PYTHON
60
+ ===============================================================
61
+
62
+ Python software and documentation are licensed under the
63
+ Python Software Foundation License Version 2.
64
+
65
+ Starting with Python 3.8.6, examples, recipes, and other code in
66
+ the documentation are dual licensed under the PSF License Version 2
67
+ and the Zero-Clause BSD license.
68
+
69
+ Some software incorporated into Python is under different licenses.
70
+ The licenses are listed with code falling under that license.
71
+
72
+
73
+ PYTHON SOFTWARE FOUNDATION LICENSE VERSION 2
74
+ --------------------------------------------
75
+
76
+ 1. This LICENSE AGREEMENT is between the Python Software Foundation
77
+ ("PSF"), and the Individual or Organization ("Licensee") accessing and
78
+ otherwise using this software ("Python") in source or binary form and
79
+ its associated documentation.
80
+
81
+ 2. Subject to the terms and conditions of this License Agreement, PSF hereby
82
+ grants Licensee a nonexclusive, royalty-free, world-wide license to reproduce,
83
+ analyze, test, perform and/or display publicly, prepare derivative works,
84
+ distribute, and otherwise use Python alone or in any derivative version,
85
+ provided, however, that PSF's License Agreement and PSF's notice of copyright,
86
+ i.e., "Copyright (c) 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010,
87
+ 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023 Python Software Foundation;
88
+ All Rights Reserved" are retained in Python alone or in any derivative version
89
+ prepared by Licensee.
90
+
91
+ 3. In the event Licensee prepares a derivative work that is based on
92
+ or incorporates Python or any part thereof, and wants to make
93
+ the derivative work available to others as provided herein, then
94
+ Licensee hereby agrees to include in any such work a brief summary of
95
+ the changes made to Python.
96
+
97
+ 4. PSF is making Python available to Licensee on an "AS IS"
98
+ basis. PSF MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR
99
+ IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, PSF MAKES NO AND
100
+ DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS
101
+ FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON WILL NOT
102
+ INFRINGE ANY THIRD PARTY RIGHTS.
103
+
104
+ 5. PSF SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON
105
+ FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS
106
+ A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON,
107
+ OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.
108
+
109
+ 6. This License Agreement will automatically terminate upon a material
110
+ breach of its terms and conditions.
111
+
112
+ 7. Nothing in this License Agreement shall be deemed to create any
113
+ relationship of agency, partnership, or joint venture between PSF and
114
+ Licensee. This License Agreement does not grant permission to use PSF
115
+ trademarks or trade name in a trademark sense to endorse or promote
116
+ products or services of Licensee, or any third party.
117
+
118
+ 8. By copying, installing or otherwise using Python, Licensee
119
+ agrees to be bound by the terms and conditions of this License
120
+ Agreement.
121
+
122
+
123
+ BEOPEN.COM LICENSE AGREEMENT FOR PYTHON 2.0
124
+ -------------------------------------------
125
+
126
+ BEOPEN PYTHON OPEN SOURCE LICENSE AGREEMENT VERSION 1
127
+
128
+ 1. This LICENSE AGREEMENT is between BeOpen.com ("BeOpen"), having an
129
+ office at 160 Saratoga Avenue, Santa Clara, CA 95051, and the
130
+ Individual or Organization ("Licensee") accessing and otherwise using
131
+ this software in source or binary form and its associated
132
+ documentation ("the Software").
133
+
134
+ 2. Subject to the terms and conditions of this BeOpen Python License
135
+ Agreement, BeOpen hereby grants Licensee a non-exclusive,
136
+ royalty-free, world-wide license to reproduce, analyze, test, perform
137
+ and/or display publicly, prepare derivative works, distribute, and
138
+ otherwise use the Software alone or in any derivative version,
139
+ provided, however, that the BeOpen Python License is retained in the
140
+ Software, alone or in any derivative version prepared by Licensee.
141
+
142
+ 3. BeOpen is making the Software available to Licensee on an "AS IS"
143
+ basis. BEOPEN MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR
144
+ IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, BEOPEN MAKES NO AND
145
+ DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS
146
+ FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE SOFTWARE WILL NOT
147
+ INFRINGE ANY THIRD PARTY RIGHTS.
148
+
149
+ 4. BEOPEN SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF THE
150
+ SOFTWARE FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS
151
+ AS A RESULT OF USING, MODIFYING OR DISTRIBUTING THE SOFTWARE, OR ANY
152
+ DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.
153
+
154
+ 5. This License Agreement will automatically terminate upon a material
155
+ breach of its terms and conditions.
156
+
157
+ 6. This License Agreement shall be governed by and interpreted in all
158
+ respects by the law of the State of California, excluding conflict of
159
+ law provisions. Nothing in this License Agreement shall be deemed to
160
+ create any relationship of agency, partnership, or joint venture
161
+ between BeOpen and Licensee. This License Agreement does not grant
162
+ permission to use BeOpen trademarks or trade names in a trademark
163
+ sense to endorse or promote products or services of Licensee, or any
164
+ third party. As an exception, the "BeOpen Python" logos available at
165
+ http://www.pythonlabs.com/logos.html may be used according to the
166
+ permissions granted on that web page.
167
+
168
+ 7. By copying, installing or otherwise using the software, Licensee
169
+ agrees to be bound by the terms and conditions of this License
170
+ Agreement.
171
+
172
+
173
+ CNRI LICENSE AGREEMENT FOR PYTHON 1.6.1
174
+ ---------------------------------------
175
+
176
+ 1. This LICENSE AGREEMENT is between the Corporation for National
177
+ Research Initiatives, having an office at 1895 Preston White Drive,
178
+ Reston, VA 20191 ("CNRI"), and the Individual or Organization
179
+ ("Licensee") accessing and otherwise using Python 1.6.1 software in
180
+ source or binary form and its associated documentation.
181
+
182
+ 2. Subject to the terms and conditions of this License Agreement, CNRI
183
+ hereby grants Licensee a nonexclusive, royalty-free, world-wide
184
+ license to reproduce, analyze, test, perform and/or display publicly,
185
+ prepare derivative works, distribute, and otherwise use Python 1.6.1
186
+ alone or in any derivative version, provided, however, that CNRI's
187
+ License Agreement and CNRI's notice of copyright, i.e., "Copyright (c)
188
+ 1995-2001 Corporation for National Research Initiatives; All Rights
189
+ Reserved" are retained in Python 1.6.1 alone or in any derivative
190
+ version prepared by Licensee. Alternately, in lieu of CNRI's License
191
+ Agreement, Licensee may substitute the following text (omitting the
192
+ quotes): "Python 1.6.1 is made available subject to the terms and
193
+ conditions in CNRI's License Agreement. This Agreement together with
194
+ Python 1.6.1 may be located on the internet using the following
195
+ unique, persistent identifier (known as a handle): 1895.22/1013. This
196
+ Agreement may also be obtained from a proxy server on the internet
197
+ using the following URL: http://hdl.handle.net/1895.22/1013".
198
+
199
+ 3. In the event Licensee prepares a derivative work that is based on
200
+ or incorporates Python 1.6.1 or any part thereof, and wants to make
201
+ the derivative work available to others as provided herein, then
202
+ Licensee hereby agrees to include in any such work a brief summary of
203
+ the changes made to Python 1.6.1.
204
+
205
+ 4. CNRI is making Python 1.6.1 available to Licensee on an "AS IS"
206
+ basis. CNRI MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR
207
+ IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, CNRI MAKES NO AND
208
+ DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS
209
+ FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON 1.6.1 WILL NOT
210
+ INFRINGE ANY THIRD PARTY RIGHTS.
211
+
212
+ 5. CNRI SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON
213
+ 1.6.1 FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS
214
+ A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON 1.6.1,
215
+ OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.
216
+
217
+ 6. This License Agreement will automatically terminate upon a material
218
+ breach of its terms and conditions.
219
+
220
+ 7. This License Agreement shall be governed by the federal
221
+ intellectual property law of the United States, including without
222
+ limitation the federal copyright law, and, to the extent such
223
+ U.S. federal law does not apply, by the law of the Commonwealth of
224
+ Virginia, excluding Virginia's conflict of law provisions.
225
+ Notwithstanding the foregoing, with regard to derivative works based
226
+ on Python 1.6.1 that incorporate non-separable material that was
227
+ previously distributed under the GNU General Public License (GPL), the
228
+ law of the Commonwealth of Virginia shall govern this License
229
+ Agreement only as to issues arising under or with respect to
230
+ Paragraphs 4, 5, and 7 of this License Agreement. Nothing in this
231
+ License Agreement shall be deemed to create any relationship of
232
+ agency, partnership, or joint venture between CNRI and Licensee. This
233
+ License Agreement does not grant permission to use CNRI trademarks or
234
+ trade name in a trademark sense to endorse or promote products or
235
+ services of Licensee, or any third party.
236
+
237
+ 8. By clicking on the "ACCEPT" button where indicated, or by copying,
238
+ installing or otherwise using Python 1.6.1, Licensee agrees to be
239
+ bound by the terms and conditions of this License Agreement.
240
+
241
+ ACCEPT
242
+
243
+
244
+ CWI LICENSE AGREEMENT FOR PYTHON 0.9.0 THROUGH 1.2
245
+ --------------------------------------------------
246
+
247
+ Copyright (c) 1991 - 1995, Stichting Mathematisch Centrum Amsterdam,
248
+ The Netherlands. All rights reserved.
249
+
250
+ Permission to use, copy, modify, and distribute this software and its
251
+ documentation for any purpose and without fee is hereby granted,
252
+ provided that the above copyright notice appear in all copies and that
253
+ both that copyright notice and this permission notice appear in
254
+ supporting documentation, and that the name of Stichting Mathematisch
255
+ Centrum or CWI not be used in advertising or publicity pertaining to
256
+ distribution of the software without specific, written prior
257
+ permission.
258
+
259
+ STICHTING MATHEMATISCH CENTRUM DISCLAIMS ALL WARRANTIES WITH REGARD TO
260
+ THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
261
+ FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH CENTRUM BE LIABLE
262
+ FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
263
+ WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
264
+ ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT
265
+ OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
266
+
267
+ ZERO-CLAUSE BSD LICENSE FOR CODE IN THE PYTHON DOCUMENTATION
268
+ ----------------------------------------------------------------------
269
+
270
+ Permission to use, copy, modify, and/or distribute this software for any
271
+ purpose with or without fee is hereby granted.
272
+
273
+ THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH
274
+ REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY
275
+ AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT,
276
+ INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM
277
+ LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR
278
+ OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
279
+ PERFORMANCE OF THIS SOFTWARE.