adelevett committed on
Commit 046e3b8 · verified · 1 Parent(s): 32d365a

Upload 76 files

Upload from localhost

This view is limited to 50 files because it contains too many changes.
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ spec/files/hardmode.pdf filter=lfs diff=lfs merge=lfs -text
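Each line in this hunk pairs a path pattern with a list of attributes. As a rough illustration of how such a line breaks down, here is a minimal Python sketch. It assumes whitespace-separated tokens only and ignores quoted patterns, `!attr` and macro attributes; the function names are hypothetical and are not part of Git or of this repository.

```python
def parse_gitattributes_line(line):
    """Split one .gitattributes line into (pattern, attributes).

    Simplified: handles `name`, `-name` and `name=value` tokens only.
    Returns None for blank lines and comments.
    """
    parts = line.split()
    if not parts or parts[0].startswith("#"):
        return None
    pattern, attrs = parts[0], {}
    for token in parts[1:]:
        if token.startswith("-"):
            attrs[token[1:]] = False       # "-text" unsets the attribute
        elif "=" in token:
            key, value = token.split("=", 1)
            attrs[key] = value             # "filter=lfs" assigns a value
        else:
            attrs[token] = True            # a bare name sets the attribute
    return pattern, attrs


def uses_lfs_filter(line):
    """True if the line assigns filter=lfs to its pattern."""
    parsed = parse_gitattributes_line(line)
    return parsed is not None and parsed[1].get("filter") == "lfs"


added = "spec/files/hardmode.pdf filter=lfs diff=lfs merge=lfs -text"
print(parse_gitattributes_line(added))
print(uses_lfs_filter(added))
```

Read this way, the added line routes the single file `spec/files/hardmode.pdf` through the `lfs` filter and unsets `text` for it, the same treatment the surrounding patterns give to `*.zip`, `*.zst` and `*tfevents*`.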
.gitignore ADDED
@@ -0,0 +1,14 @@
+ dist
+ __pycache__
+ *.egg-info
+ *.aux
+ *.dvi
+ *.fdb_latexmk
+ *.fls
+ *.log
+ *.out
+
+ # User files
+ *.pdf
+ *.toc
+ recipe_debug.toml
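To get a feel for which paths these patterns would catch, the following Python sketch approximates them with `fnmatch`. It is an approximation under stated assumptions, not Git's actual matcher: it only covers slash-free patterns like the ones above (which may match a file or directory name at any depth) and ignores negation, anchoring and trailing-slash rules. The example paths are made up for illustration.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns copied from the .gitignore added in this commit.
PATTERNS = [
    "dist", "__pycache__", "*.egg-info",
    "*.aux", "*.dvi", "*.fdb_latexmk", "*.fls", "*.log", "*.out",
    "*.pdf", "*.toc", "recipe_debug.toml",
]


def is_ignored(path, patterns=PATTERNS):
    """Approximate check: a slash-free pattern matches any path component."""
    parts = PurePosixPath(path).parts
    return any(fnmatchcase(part, pat) for part in parts for pat in patterns)


# Hypothetical paths, chosen only to exercise the patterns.
for candidate in ["dist/pkg.whl", "docs/paper.aux", "src/module.py"]:
    print(candidate, is_ignored(candidate))
```

For authoritative answers inside a real checkout, `git check-ignore -v <path>` reports the rule that matches a given path.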
CHANGELOG.md ADDED
@@ -0,0 +1,120 @@
+ Change log
+ ==========
+
+ pdf.tocgen 1.3.4
+ ----------------
+
+ Released November 25, 2023
+
+ - Add error messages for `--page` and invalid file
+ - Fix KeyError when extracting ToC from some PDFs with pdftocio
+
+ pdf.tocgen 1.3.3
+ ----------------
+
+ Released April 21, 2023
+
+ - Fix outdated dependencies
+ - Add vpos output for pdftocio
+ - Type stability enhancements
+
+ pdf.tocgen 1.3.2
+ ----------------
+
+ Released April 20, 2023
+
+ - Fix outdated build system
+
+ pdf.tocgen 1.3.1
+ ----------------
+
+ Released April 20, 2023
+
+ - Fix file encoding problems on Windows
+
+ pdf.tocgen 1.3.0
+ ----------------
+
+ Released November 10, 2021
+
+ - Fix deprecation warning from PyMuPDF
+
+ pdf.tocgen 1.2.3
+ ----------------
+
+ Released January 7, 2021
+
+ - Compatibility with PyMuPDF 1.18.6
+
+ pdf.tocgen 1.2.2
+ ----------------
+
+ Released October 11, 2020
+
+ - Compatibility with Python 3.9
+
+ pdf.tocgen 1.2.1
+ ----------------
+
+ Released August 7, 2020
+
+ - Fix a typo in the help message of `pdftocgen`.
+
+ pdf.tocgen 1.2.0
+ ----------------
+
+ Released August 7, 2020
+
+ - Swap out argparse in favor of getopt, which is much simpler and more
+ flexible.
+ - Now we could use `pdfxmeta doc.pdf` to dump an entire document, without the
+ empty pattern `""`.
+
+ pdf.tocgen 1.1.3
+ ----------------
+
+ Released August 4, 2020
+
+ - Usefully complain when tocparser can't parse an entry
+
+ pdf.tocgen 1.1.2
+ ----------------
+
+ Released August 3, 2020
+
+ - Add `--print` flag for `pdftocio` to force printing ToC.
+ - Add spec for cli commands.
+
+ pdf.tocgen 1.1.1
+ ----------------
+
+ Released July 31, 2020
+
+ - Add a `--auto` option for `pdfxmeta` to output a valid heading filter directly.
+
+ pdf.tocgen 1.1.0
+ ----------------
+
+ Released July 31, 2020
+
+ - Add a new option for a heading filter to be "greedy", which makes it extract
+ all the text in a block when at least one match occurs. This is extremely
+ useful for math-heavy documents.
+ - fixes the sorting problem with two column layout.
+
+ pdf.tocgen 1.0.1
+ ----------------
+
+ Released July 29, 2020
+
+ - Update documentations
+ - Fix some linter warnings
+ - Fix unicode problem in tests
+ - Some prep work for the next major release
+
+ pdf.tocgen 1.0.0
+ ----------------
+
+ Released July 28, 2020
+
+ - The first stable version
LICENSE ADDED
@@ -0,0 +1,674 @@
+ GNU GENERAL PUBLIC LICENSE
+ Version 3, 29 June 2007
+
+ Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+ Preamble
+
+ The GNU General Public License is a free, copyleft license for
+ software and other kinds of works.
+
+ The licenses for most software and other practical works are designed
+ to take away your freedom to share and change the works. By contrast,
+ the GNU General Public License is intended to guarantee your freedom to
+ share and change all versions of a program--to make sure it remains free
+ software for all its users. We, the Free Software Foundation, use the
+ GNU General Public License for most of our software; it applies also to
+ any other work released this way by its authors. You can apply it to
+ your programs, too.
+
+ When we speak of free software, we are referring to freedom, not
+ price. Our General Public Licenses are designed to make sure that you
+ have the freedom to distribute copies of free software (and charge for
+ them if you wish), that you receive source code or can get it if you
+ want it, that you can change the software or use pieces of it in new
+ free programs, and that you know you can do these things.
+
+ To protect your rights, we need to prevent others from denying you
+ these rights or asking you to surrender the rights. Therefore, you have
+ certain responsibilities if you distribute copies of the software, or if
+ you modify it: responsibilities to respect the freedom of others.
+
+ For example, if you distribute copies of such a program, whether
+ gratis or for a fee, you must pass on to the recipients the same
+ freedoms that you received. You must make sure that they, too, receive
+ or can get the source code. And you must show them these terms so they
+ know their rights.
+
+ Developers that use the GNU GPL protect your rights with two steps:
+ (1) assert copyright on the software, and (2) offer you this License
+ giving you legal permission to copy, distribute and/or modify it.
+
+ For the developers' and authors' protection, the GPL clearly explains
+ that there is no warranty for this free software. For both users' and
+ authors' sake, the GPL requires that modified versions be marked as
+ changed, so that their problems will not be attributed erroneously to
+ authors of previous versions.
+
+ Some devices are designed to deny users access to install or run
+ modified versions of the software inside them, although the manufacturer
+ can do so. This is fundamentally incompatible with the aim of
+ protecting users' freedom to change the software. The systematic
+ pattern of such abuse occurs in the area of products for individuals to
+ use, which is precisely where it is most unacceptable. Therefore, we
+ have designed this version of the GPL to prohibit the practice for those
+ products. If such problems arise substantially in other domains, we
+ stand ready to extend this provision to those domains in future versions
+ of the GPL, as needed to protect the freedom of users.
+
+ Finally, every program is threatened constantly by software patents.
+ States should not allow patents to restrict development and use of
+ software on general-purpose computers, but in those that do, we wish to
+ avoid the special danger that patents applied to a free program could
+ make it effectively proprietary. To prevent this, the GPL assures that
+ patents cannot be used to render the program non-free.
+
+ The precise terms and conditions for copying, distribution and
+ modification follow.
+
+ TERMS AND CONDITIONS
+
+ 0. Definitions.
+
+ "This License" refers to version 3 of the GNU General Public License.
+
+ "Copyright" also means copyright-like laws that apply to other kinds of
+ works, such as semiconductor masks.
+
+ "The Program" refers to any copyrightable work licensed under this
+ License. Each licensee is addressed as "you". "Licensees" and
+ "recipients" may be individuals or organizations.
+
+ To "modify" a work means to copy from or adapt all or part of the work
+ in a fashion requiring copyright permission, other than the making of an
+ exact copy. The resulting work is called a "modified version" of the
+ earlier work or a work "based on" the earlier work.
+
+ A "covered work" means either the unmodified Program or a work based
+ on the Program.
+
+ To "propagate" a work means to do anything with it that, without
+ permission, would make you directly or secondarily liable for
+ infringement under applicable copyright law, except executing it on a
+ computer or modifying a private copy. Propagation includes copying,
+ distribution (with or without modification), making available to the
+ public, and in some countries other activities as well.
+
+ To "convey" a work means any kind of propagation that enables other
+ parties to make or receive copies. Mere interaction with a user through
+ a computer network, with no transfer of a copy, is not conveying.
+
+ An interactive user interface displays "Appropriate Legal Notices"
+ to the extent that it includes a convenient and prominently visible
+ feature that (1) displays an appropriate copyright notice, and (2)
+ tells the user that there is no warranty for the work (except to the
+ extent that warranties are provided), that licensees may convey the
+ work under this License, and how to view a copy of this License. If
+ the interface presents a list of user commands or options, such as a
+ menu, a prominent item in the list meets this criterion.
+
+ 1. Source Code.
+
+ The "source code" for a work means the preferred form of the work
+ for making modifications to it. "Object code" means any non-source
+ form of a work.
+
+ A "Standard Interface" means an interface that either is an official
+ standard defined by a recognized standards body, or, in the case of
+ interfaces specified for a particular programming language, one that
+ is widely used among developers working in that language.
+
+ The "System Libraries" of an executable work include anything, other
+ than the work as a whole, that (a) is included in the normal form of
+ packaging a Major Component, but which is not part of that Major
+ Component, and (b) serves only to enable use of the work with that
+ Major Component, or to implement a Standard Interface for which an
+ implementation is available to the public in source code form. A
+ "Major Component", in this context, means a major essential component
+ (kernel, window system, and so on) of the specific operating system
+ (if any) on which the executable work runs, or a compiler used to
+ produce the work, or an object code interpreter used to run it.
+
+ The "Corresponding Source" for a work in object code form means all
+ the source code needed to generate, install, and (for an executable
+ work) run the object code and to modify the work, including scripts to
+ control those activities. However, it does not include the work's
+ System Libraries, or general-purpose tools or generally available free
+ programs which are used unmodified in performing those activities but
+ which are not part of the work. For example, Corresponding Source
+ includes interface definition files associated with source files for
+ the work, and the source code for shared libraries and dynamically
+ linked subprograms that the work is specifically designed to require,
+ such as by intimate data communication or control flow between those
+ subprograms and other parts of the work.
+
+ The Corresponding Source need not include anything that users
+ can regenerate automatically from other parts of the Corresponding
+ Source.
+
+ The Corresponding Source for a work in source code form is that
+ same work.
+
+ 2. Basic Permissions.
+
+ All rights granted under this License are granted for the term of
+ copyright on the Program, and are irrevocable provided the stated
+ conditions are met. This License explicitly affirms your unlimited
+ permission to run the unmodified Program. The output from running a
+ covered work is covered by this License only if the output, given its
+ content, constitutes a covered work. This License acknowledges your
+ rights of fair use or other equivalent, as provided by copyright law.
+
+ You may make, run and propagate covered works that you do not
+ convey, without conditions so long as your license otherwise remains
+ in force. You may convey covered works to others for the sole purpose
+ of having them make modifications exclusively for you, or provide you
+ with facilities for running those works, provided that you comply with
+ the terms of this License in conveying all material for which you do
+ not control copyright. Those thus making or running the covered works
+ for you must do so exclusively on your behalf, under your direction
+ and control, on terms that prohibit them from making any copies of
+ your copyrighted material outside their relationship with you.
+
+ Conveying under any other circumstances is permitted solely under
+ the conditions stated below. Sublicensing is not allowed; section 10
+ makes it unnecessary.
+
+ 3. Protecting Users' Legal Rights From Anti-Circumvention Law.
+
+ No covered work shall be deemed part of an effective technological
+ measure under any applicable law fulfilling obligations under article
+ 11 of the WIPO copyright treaty adopted on 20 December 1996, or
+ similar laws prohibiting or restricting circumvention of such
+ measures.
+
+ When you convey a covered work, you waive any legal power to forbid
+ circumvention of technological measures to the extent such circumvention
+ is effected by exercising rights under this License with respect to
+ the covered work, and you disclaim any intention to limit operation or
+ modification of the work as a means of enforcing, against the work's
+ users, your or third parties' legal rights to forbid circumvention of
+ technological measures.
+
+ 4. Conveying Verbatim Copies.
+
+ You may convey verbatim copies of the Program's source code as you
+ receive it, in any medium, provided that you conspicuously and
+ appropriately publish on each copy an appropriate copyright notice;
+ keep intact all notices stating that this License and any
+ non-permissive terms added in accord with section 7 apply to the code;
+ keep intact all notices of the absence of any warranty; and give all
+ recipients a copy of this License along with the Program.
+
+ You may charge any price or no price for each copy that you convey,
+ and you may offer support or warranty protection for a fee.
+
+ 5. Conveying Modified Source Versions.
+
+ You may convey a work based on the Program, or the modifications to
+ produce it from the Program, in the form of source code under the
+ terms of section 4, provided that you also meet all of these conditions:
+
+ a) The work must carry prominent notices stating that you modified
+ it, and giving a relevant date.
+
+ b) The work must carry prominent notices stating that it is
+ released under this License and any conditions added under section
+ 7. This requirement modifies the requirement in section 4 to
+ "keep intact all notices".
+
+ c) You must license the entire work, as a whole, under this
+ License to anyone who comes into possession of a copy. This
+ License will therefore apply, along with any applicable section 7
+ additional terms, to the whole of the work, and all its parts,
+ regardless of how they are packaged. This License gives no
+ permission to license the work in any other way, but it does not
+ invalidate such permission if you have separately received it.
+
+ d) If the work has interactive user interfaces, each must display
+ Appropriate Legal Notices; however, if the Program has interactive
+ interfaces that do not display Appropriate Legal Notices, your
+ work need not make them do so.
+
+ A compilation of a covered work with other separate and independent
+ works, which are not by their nature extensions of the covered work,
+ and which are not combined with it such as to form a larger program,
+ in or on a volume of a storage or distribution medium, is called an
+ "aggregate" if the compilation and its resulting copyright are not
+ used to limit the access or legal rights of the compilation's users
+ beyond what the individual works permit. Inclusion of a covered work
+ in an aggregate does not cause this License to apply to the other
+ parts of the aggregate.
+
+ 6. Conveying Non-Source Forms.
+
+ You may convey a covered work in object code form under the terms
+ of sections 4 and 5, provided that you also convey the
+ machine-readable Corresponding Source under the terms of this License,
+ in one of these ways:
+
+ a) Convey the object code in, or embodied in, a physical product
+ (including a physical distribution medium), accompanied by the
+ Corresponding Source fixed on a durable physical medium
+ customarily used for software interchange.
+
+ b) Convey the object code in, or embodied in, a physical product
+ (including a physical distribution medium), accompanied by a
+ written offer, valid for at least three years and valid for as
+ long as you offer spare parts or customer support for that product
+ model, to give anyone who possesses the object code either (1) a
+ copy of the Corresponding Source for all the software in the
+ product that is covered by this License, on a durable physical
+ medium customarily used for software interchange, for a price no
+ more than your reasonable cost of physically performing this
+ conveying of source, or (2) access to copy the
+ Corresponding Source from a network server at no charge.
+
+ c) Convey individual copies of the object code with a copy of the
+ written offer to provide the Corresponding Source. This
+ alternative is allowed only occasionally and noncommercially, and
+ only if you received the object code with such an offer, in accord
+ with subsection 6b.
+
+ d) Convey the object code by offering access from a designated
+ place (gratis or for a charge), and offer equivalent access to the
+ Corresponding Source in the same way through the same place at no
+ further charge. You need not require recipients to copy the
+ Corresponding Source along with the object code. If the place to
+ copy the object code is a network server, the Corresponding Source
+ may be on a different server (operated by you or a third party)
+ that supports equivalent copying facilities, provided you maintain
+ clear directions next to the object code saying where to find the
+ Corresponding Source. Regardless of what server hosts the
+ Corresponding Source, you remain obligated to ensure that it is
+ available for as long as needed to satisfy these requirements.
+
+ e) Convey the object code using peer-to-peer transmission, provided
+ you inform other peers where the object code and Corresponding
+ Source of the work are being offered to the general public at no
+ charge under subsection 6d.
+
+ A separable portion of the object code, whose source code is excluded
+ from the Corresponding Source as a System Library, need not be
+ included in conveying the object code work.
+
+ A "User Product" is either (1) a "consumer product", which means any
+ tangible personal property which is normally used for personal, family,
+ or household purposes, or (2) anything designed or sold for incorporation
+ into a dwelling. In determining whether a product is a consumer product,
+ doubtful cases shall be resolved in favor of coverage. For a particular
+ product received by a particular user, "normally used" refers to a
+ typical or common use of that class of product, regardless of the status
+ of the particular user or of the way in which the particular user
+ actually uses, or expects or is expected to use, the product. A product
+ is a consumer product regardless of whether the product has substantial
+ commercial, industrial or non-consumer uses, unless such uses represent
+ the only significant mode of use of the product.
+
+ "Installation Information" for a User Product means any methods,
+ procedures, authorization keys, or other information required to install
+ and execute modified versions of a covered work in that User Product from
+ a modified version of its Corresponding Source. The information must
+ suffice to ensure that the continued functioning of the modified object
+ code is in no case prevented or interfered with solely because
+ modification has been made.
+
+ If you convey an object code work under this section in, or with, or
+ specifically for use in, a User Product, and the conveying occurs as
+ part of a transaction in which the right of possession and use of the
+ User Product is transferred to the recipient in perpetuity or for a
+ fixed term (regardless of how the transaction is characterized), the
+ Corresponding Source conveyed under this section must be accompanied
+ by the Installation Information. But this requirement does not apply
+ if neither you nor any third party retains the ability to install
+ modified object code on the User Product (for example, the work has
+ been installed in ROM).
+
+ The requirement to provide Installation Information does not include a
+ requirement to continue to provide support service, warranty, or updates
+ for a work that has been modified or installed by the recipient, or for
+ the User Product in which it has been modified or installed. Access to a
+ network may be denied when the modification itself materially and
+ adversely affects the operation of the network or violates the rules and
+ protocols for communication across the network.
+
+ Corresponding Source conveyed, and Installation Information provided,
+ in accord with this section must be in a format that is publicly
+ documented (and with an implementation available to the public in
+ source code form), and must require no special password or key for
+ unpacking, reading or copying.
+
+ 7. Additional Terms.
+
+ "Additional permissions" are terms that supplement the terms of this
+ License by making exceptions from one or more of its conditions.
+ Additional permissions that are applicable to the entire Program shall
+ be treated as though they were included in this License, to the extent
+ that they are valid under applicable law. If additional permissions
+ apply only to part of the Program, that part may be used separately
+ under those permissions, but the entire Program remains governed by
+ this License without regard to the additional permissions.
+
+ When you convey a copy of a covered work, you may at your option
+ remove any additional permissions from that copy, or from any part of
+ it. (Additional permissions may be written to require their own
+ removal in certain cases when you modify the work.) You may place
+ additional permissions on material, added by you to a covered work,
+ for which you have or can give appropriate copyright permission.
+
+ Notwithstanding any other provision of this License, for material you
+ add to a covered work, you may (if authorized by the copyright holders of
+ that material) supplement the terms of this License with terms:
+
+ a) Disclaiming warranty or limiting liability differently from the
+ terms of sections 15 and 16 of this License; or
+
+ b) Requiring preservation of specified reasonable legal notices or
+ author attributions in that material or in the Appropriate Legal
+ Notices displayed by works containing it; or
+
+ c) Prohibiting misrepresentation of the origin of that material, or
+ requiring that modified versions of such material be marked in
+ reasonable ways as different from the original version; or
+
+ d) Limiting the use for publicity purposes of names of licensors or
+ authors of the material; or
+
+ e) Declining to grant rights under trademark law for use of some
+ trade names, trademarks, or service marks; or
+
+ f) Requiring indemnification of licensors and authors of that
+ material by anyone who conveys the material (or modified versions of
+ it) with contractual assumptions of liability to the recipient, for
+ any liability that these contractual assumptions directly impose on
+ those licensors and authors.
+
+ All other non-permissive additional terms are considered "further
+ restrictions" within the meaning of section 10. If the Program as you
+ received it, or any part of it, contains a notice stating that it is
+ governed by this License along with a term that is a further
+ restriction, you may remove that term. If a license document contains
+ a further restriction but permits relicensing or conveying under this
+ License, you may add to a covered work material governed by the terms
+ of that license document, provided that the further restriction does
+ not survive such relicensing or conveying.
+
+ If you add terms to a covered work in accord with this section, you
+ must place, in the relevant source files, a statement of the
+ additional terms that apply to those files, or a notice indicating
+ where to find the applicable terms.
+
+ Additional terms, permissive or non-permissive, may be stated in the
+ form of a separately written license, or stated as exceptions;
+ the above requirements apply either way.
+
+ 8. Termination.
+
+ You may not propagate or modify a covered work except as expressly
+ provided under this License. Any attempt otherwise to propagate or
+ modify it is void, and will automatically terminate your rights under
+ this License (including any patent licenses granted under the third
+ paragraph of section 11).
+
+ However, if you cease all violation of this License, then your
+ license from a particular copyright holder is reinstated (a)
+ provisionally, unless and until the copyright holder explicitly and
+ finally terminates your license, and (b) permanently, if the copyright
+ holder fails to notify you of the violation by some reasonable means
+ prior to 60 days after the cessation.
+
+ Moreover, your license from a particular copyright holder is
+ reinstated permanently if the copyright holder notifies you of the
+ violation by some reasonable means, this is the first time you have
+ received notice of violation of this License (for any work) from that
+ copyright holder, and you cure the violation prior to 30 days after
+ your receipt of the notice.
+
+ Termination of your rights under this section does not terminate the
+ licenses of parties who have received copies or rights from you under
+ this License. If your rights have been terminated and not permanently
+ reinstated, you do not qualify to receive new licenses for the same
+ material under section 10.
+
+ 9. Acceptance Not Required for Having Copies.
+
+ You are not required to accept this License in order to receive or
+ run a copy of the Program. Ancillary propagation of a covered work
+ occurring solely as a consequence of using peer-to-peer transmission
+ to receive a copy likewise does not require acceptance. However,
+ nothing other than this License grants you permission to propagate or
+ modify any covered work. These actions infringe copyright if you do
+ not accept this License. Therefore, by modifying or propagating a
+ covered work, you indicate your acceptance of this License to do so.
+
+ 10. Automatic Licensing of Downstream Recipients.
+
+ Each time you convey a covered work, the recipient automatically
+ receives a license from the original licensors, to run, modify and
+ propagate that work, subject to this License. You are not responsible
+ for enforcing compliance by third parties with this License.
+
+ An "entity transaction" is a transaction transferring control of an
+ organization, or substantially all assets of one, or subdividing an
+ organization, or merging organizations. If propagation of a covered
+ work results from an entity transaction, each party to that
+ transaction who receives a copy of the work also receives whatever
+ licenses to the work the party's predecessor in interest had or could
+ give under the previous paragraph, plus a right to possession of the
+ Corresponding Source of the work from the predecessor in interest, if
+ the predecessor has it or can get it with reasonable efforts.
+
+ You may not impose any further restrictions on the exercise of the
+ rights granted or affirmed under this License. For example, you may
+ not impose a license fee, royalty, or other charge for exercise of
+ rights granted under this License, and you may not initiate litigation
+ (including a cross-claim or counterclaim in a lawsuit) alleging that
+ any patent claim is infringed by making, using, selling, offering for
+ sale, or importing the Program or any portion of it.
+
+ 11. Patents.
+
+ A "contributor" is a copyright holder who authorizes use under this
+ License of the Program or a work on which the Program is based. The
+ work thus licensed is called the contributor's "contributor version".
+
+ A contributor's "essential patent claims" are all patent claims
+ owned or controlled by the contributor, whether already acquired or
+ hereafter acquired, that would be infringed by some manner, permitted
+ by this License, of making, using, or selling its contributor version,
+ but do not include claims that would be infringed only as a
+ consequence of further modification of the contributor version. For
+ purposes of this definition, "control" includes the right to grant
+ patent sublicenses in a manner consistent with the requirements of
+ this License.
+
+ Each contributor grants you a non-exclusive, worldwide, royalty-free
+ patent license under the contributor's essential patent claims, to
+ make, use, sell, offer for sale, import and otherwise run, modify and
+ propagate the contents of its contributor version.
+
+ In the following three paragraphs, a "patent license" is any express
+ agreement or commitment, however denominated, not to enforce a patent
+ (such as an express permission to practice a patent or covenant not to
+ sue for patent infringement). To "grant" such a patent license to a
+ party means to make such an agreement or commitment not to enforce a
+ patent against the party.
+
+ If you convey a covered work, knowingly relying on a patent license,
+ and the Corresponding Source of the work is not available for anyone
+ to copy, free of charge and under the terms of this License, through a
+ publicly available network server or other readily accessible means,
+ then you must either (1) cause the Corresponding Source to be so
+ available, or (2) arrange to deprive yourself of the benefit of the
+ patent license for this particular work, or (3) arrange, in a manner
+ consistent with the requirements of this License, to extend the patent
+ license to downstream recipients. "Knowingly relying" means you have
+ actual knowledge that, but for the patent license, your conveying the
509
+ covered work in a country, or your recipient's use of the covered work
510
+ in a country, would infringe one or more identifiable patents in that
511
+ country that you have reason to believe are valid.
512
+
513
+ If, pursuant to or in connection with a single transaction or
514
+ arrangement, you convey, or propagate by procuring conveyance of, a
515
+ covered work, and grant a patent license to some of the parties
516
+ receiving the covered work authorizing them to use, propagate, modify
517
+ or convey a specific copy of the covered work, then the patent license
518
+ you grant is automatically extended to all recipients of the covered
519
+ work and works based on it.
520
+
521
+ A patent license is "discriminatory" if it does not include within
522
+ the scope of its coverage, prohibits the exercise of, or is
523
+ conditioned on the non-exercise of one or more of the rights that are
524
+ specifically granted under this License. You may not convey a covered
525
+ work if you are a party to an arrangement with a third party that is
526
+ in the business of distributing software, under which you make payment
527
+ to the third party based on the extent of your activity of conveying
528
+ the work, and under which the third party grants, to any of the
529
+ parties who would receive the covered work from you, a discriminatory
530
+ patent license (a) in connection with copies of the covered work
531
+ conveyed by you (or copies made from those copies), or (b) primarily
532
+ for and in connection with specific products or compilations that
533
+ contain the covered work, unless you entered into that arrangement,
534
+ or that patent license was granted, prior to 28 March 2007.
535
+
536
+ Nothing in this License shall be construed as excluding or limiting
537
+ any implied license or other defenses to infringement that may
538
+ otherwise be available to you under applicable patent law.
539
+
540
+ 12. No Surrender of Others' Freedom.
541
+
542
+ If conditions are imposed on you (whether by court order, agreement or
543
+ otherwise) that contradict the conditions of this License, they do not
544
+ excuse you from the conditions of this License. If you cannot convey a
545
+ covered work so as to satisfy simultaneously your obligations under this
546
+ License and any other pertinent obligations, then as a consequence you may
547
+ not convey it at all. For example, if you agree to terms that obligate you
548
+ to collect a royalty for further conveying from those to whom you convey
549
+ the Program, the only way you could satisfy both those terms and this
550
+ License would be to refrain entirely from conveying the Program.
551
+
552
+ 13. Use with the GNU Affero General Public License.
553
+
554
+ Notwithstanding any other provision of this License, you have
555
+ permission to link or combine any covered work with a work licensed
556
+ under version 3 of the GNU Affero General Public License into a single
557
+ combined work, and to convey the resulting work. The terms of this
558
+ License will continue to apply to the part which is the covered work,
559
+ but the special requirements of the GNU Affero General Public License,
560
+ section 13, concerning interaction through a network will apply to the
561
+ combination as such.
562
+
563
+ 14. Revised Versions of this License.
564
+
565
+ The Free Software Foundation may publish revised and/or new versions of
566
+ the GNU General Public License from time to time. Such new versions will
567
+ be similar in spirit to the present version, but may differ in detail to
568
+ address new problems or concerns.
569
+
570
+ Each version is given a distinguishing version number. If the
571
+ Program specifies that a certain numbered version of the GNU General
572
+ Public License "or any later version" applies to it, you have the
573
+ option of following the terms and conditions either of that numbered
574
+ version or of any later version published by the Free Software
575
+ Foundation. If the Program does not specify a version number of the
576
+ GNU General Public License, you may choose any version ever published
577
+ by the Free Software Foundation.
578
+
579
+ If the Program specifies that a proxy can decide which future
580
+ versions of the GNU General Public License can be used, that proxy's
581
+ public statement of acceptance of a version permanently authorizes you
582
+ to choose that version for the Program.
583
+
584
+ Later license versions may give you additional or different
585
+ permissions. However, no additional obligations are imposed on any
586
+ author or copyright holder as a result of your choosing to follow a
587
+ later version.
588
+
589
+ 15. Disclaimer of Warranty.
590
+
591
+ THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
592
+ APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
593
+ HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
594
+ OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
595
+ THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
596
+ PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
597
+ IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
598
+ ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
599
+
600
+ 16. Limitation of Liability.
601
+
602
+ IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
603
+ WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
604
+ THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
605
+ GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
606
+ USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
607
+ DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
608
+ PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
609
+ EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
610
+ SUCH DAMAGES.
611
+
612
+ 17. Interpretation of Sections 15 and 16.
613
+
614
+ If the disclaimer of warranty and limitation of liability provided
615
+ above cannot be given local legal effect according to their terms,
616
+ reviewing courts shall apply local law that most closely approximates
617
+ an absolute waiver of all civil liability in connection with the
618
+ Program, unless a warranty or assumption of liability accompanies a
619
+ copy of the Program in return for a fee.
620
+
621
+ END OF TERMS AND CONDITIONS
622
+
623
+ How to Apply These Terms to Your New Programs
624
+
625
+ If you develop a new program, and you want it to be of the greatest
626
+ possible use to the public, the best way to achieve this is to make it
627
+ free software which everyone can redistribute and change under these terms.
628
+
629
+ To do so, attach the following notices to the program. It is safest
630
+ to attach them to the start of each source file to most effectively
631
+ state the exclusion of warranty; and each file should have at least
632
+ the "copyright" line and a pointer to where the full notice is found.
633
+
634
+ <one line to give the program's name and a brief idea of what it does.>
635
+ Copyright (C) <year> <name of author>
636
+
637
+ This program is free software: you can redistribute it and/or modify
638
+ it under the terms of the GNU General Public License as published by
639
+ the Free Software Foundation, either version 3 of the License, or
640
+ (at your option) any later version.
641
+
642
+ This program is distributed in the hope that it will be useful,
643
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
644
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
645
+ GNU General Public License for more details.
646
+
647
+ You should have received a copy of the GNU General Public License
648
+ along with this program. If not, see <https://www.gnu.org/licenses/>.
649
+
650
+ Also add information on how to contact you by electronic and paper mail.
651
+
652
+ If the program does terminal interaction, make it output a short
653
+ notice like this when it starts in an interactive mode:
654
+
655
+ <program> Copyright (C) <year> <name of author>
656
+ This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
657
+ This is free software, and you are welcome to redistribute it
658
+ under certain conditions; type `show c' for details.
659
+
660
+ The hypothetical commands `show w' and `show c' should show the appropriate
661
+ parts of the General Public License. Of course, your program's commands
662
+ might be different; for a GUI interface, you would use an "about box".
663
+
664
+ You should also get your employer (if you work as a programmer) or school,
665
+ if any, to sign a "copyright disclaimer" for the program, if necessary.
666
+ For more information on this, and how to apply and follow the GNU GPL, see
667
+ <https://www.gnu.org/licenses/>.
668
+
669
+ The GNU General Public License does not permit incorporating your program
670
+ into proprietary programs. If your program is a subroutine library, you
671
+ may consider it more useful to permit linking proprietary applications with
672
+ the library. If this is what you want to do, use the GNU Lesser General
673
+ Public License instead of this License. But first, please read
674
+ <https://www.gnu.org/licenses/why-not-lgpl.html>.
LICENSE_AGPL ADDED
@@ -0,0 +1,661 @@
1
+ GNU AFFERO GENERAL PUBLIC LICENSE
2
+ Version 3, 19 November 2007
3
+
4
+ Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
5
+ Everyone is permitted to copy and distribute verbatim copies
6
+ of this license document, but changing it is not allowed.
7
+
8
+ Preamble
9
+
10
+ The GNU Affero General Public License is a free, copyleft license for
11
+ software and other kinds of works, specifically designed to ensure
12
+ cooperation with the community in the case of network server software.
13
+
14
+ The licenses for most software and other practical works are designed
15
+ to take away your freedom to share and change the works. By contrast,
16
+ our General Public Licenses are intended to guarantee your freedom to
17
+ share and change all versions of a program--to make sure it remains free
18
+ software for all its users.
19
+
20
+ When we speak of free software, we are referring to freedom, not
21
+ price. Our General Public Licenses are designed to make sure that you
22
+ have the freedom to distribute copies of free software (and charge for
23
+ them if you wish), that you receive source code or can get it if you
24
+ want it, that you can change the software or use pieces of it in new
25
+ free programs, and that you know you can do these things.
26
+
27
+ Developers that use our General Public Licenses protect your rights
28
+ with two steps: (1) assert copyright on the software, and (2) offer
29
+ you this License which gives you legal permission to copy, distribute
30
+ and/or modify the software.
31
+
32
+ A secondary benefit of defending all users' freedom is that
33
+ improvements made in alternate versions of the program, if they
34
+ receive widespread use, become available for other developers to
35
+ incorporate. Many developers of free software are heartened and
36
+ encouraged by the resulting cooperation. However, in the case of
37
+ software used on network servers, this result may fail to come about.
38
+ The GNU General Public License permits making a modified version and
39
+ letting the public access it on a server without ever releasing its
40
+ source code to the public.
41
+
42
+ The GNU Affero General Public License is designed specifically to
43
+ ensure that, in such cases, the modified source code becomes available
44
+ to the community. It requires the operator of a network server to
45
+ provide the source code of the modified version running there to the
46
+ users of that server. Therefore, public use of a modified version, on
47
+ a publicly accessible server, gives the public access to the source
48
+ code of the modified version.
49
+
50
+ An older license, called the Affero General Public License and
51
+ published by Affero, was designed to accomplish similar goals. This is
52
+ a different license, not a version of the Affero GPL, but Affero has
53
+ released a new version of the Affero GPL which permits relicensing under
54
+ this license.
55
+
56
+ The precise terms and conditions for copying, distribution and
57
+ modification follow.
58
+
59
+ TERMS AND CONDITIONS
60
+
61
+ 0. Definitions.
62
+
63
+ "This License" refers to version 3 of the GNU Affero General Public License.
64
+
65
+ "Copyright" also means copyright-like laws that apply to other kinds of
66
+ works, such as semiconductor masks.
67
+
68
+ "The Program" refers to any copyrightable work licensed under this
69
+ License. Each licensee is addressed as "you". "Licensees" and
70
+ "recipients" may be individuals or organizations.
71
+
72
+ To "modify" a work means to copy from or adapt all or part of the work
73
+ in a fashion requiring copyright permission, other than the making of an
74
+ exact copy. The resulting work is called a "modified version" of the
75
+ earlier work or a work "based on" the earlier work.
76
+
77
+ A "covered work" means either the unmodified Program or a work based
78
+ on the Program.
79
+
80
+ To "propagate" a work means to do anything with it that, without
81
+ permission, would make you directly or secondarily liable for
82
+ infringement under applicable copyright law, except executing it on a
83
+ computer or modifying a private copy. Propagation includes copying,
84
+ distribution (with or without modification), making available to the
85
+ public, and in some countries other activities as well.
86
+
87
+ To "convey" a work means any kind of propagation that enables other
88
+ parties to make or receive copies. Mere interaction with a user through
89
+ a computer network, with no transfer of a copy, is not conveying.
90
+
91
+ An interactive user interface displays "Appropriate Legal Notices"
92
+ to the extent that it includes a convenient and prominently visible
93
+ feature that (1) displays an appropriate copyright notice, and (2)
94
+ tells the user that there is no warranty for the work (except to the
95
+ extent that warranties are provided), that licensees may convey the
96
+ work under this License, and how to view a copy of this License. If
97
+ the interface presents a list of user commands or options, such as a
98
+ menu, a prominent item in the list meets this criterion.
99
+
100
+ 1. Source Code.
101
+
102
+ The "source code" for a work means the preferred form of the work
103
+ for making modifications to it. "Object code" means any non-source
104
+ form of a work.
105
+
106
+ A "Standard Interface" means an interface that either is an official
107
+ standard defined by a recognized standards body, or, in the case of
108
+ interfaces specified for a particular programming language, one that
109
+ is widely used among developers working in that language.
110
+
111
+ The "System Libraries" of an executable work include anything, other
112
+ than the work as a whole, that (a) is included in the normal form of
113
+ packaging a Major Component, but which is not part of that Major
114
+ Component, and (b) serves only to enable use of the work with that
115
+ Major Component, or to implement a Standard Interface for which an
116
+ implementation is available to the public in source code form. A
117
+ "Major Component", in this context, means a major essential component
118
+ (kernel, window system, and so on) of the specific operating system
119
+ (if any) on which the executable work runs, or a compiler used to
120
+ produce the work, or an object code interpreter used to run it.
121
+
122
+ The "Corresponding Source" for a work in object code form means all
123
+ the source code needed to generate, install, and (for an executable
124
+ work) run the object code and to modify the work, including scripts to
125
+ control those activities. However, it does not include the work's
126
+ System Libraries, or general-purpose tools or generally available free
127
+ programs which are used unmodified in performing those activities but
128
+ which are not part of the work. For example, Corresponding Source
129
+ includes interface definition files associated with source files for
130
+ the work, and the source code for shared libraries and dynamically
131
+ linked subprograms that the work is specifically designed to require,
132
+ such as by intimate data communication or control flow between those
133
+ subprograms and other parts of the work.
134
+
135
+ The Corresponding Source need not include anything that users
136
+ can regenerate automatically from other parts of the Corresponding
137
+ Source.
138
+
139
+ The Corresponding Source for a work in source code form is that
140
+ same work.
141
+
142
+ 2. Basic Permissions.
143
+
144
+ All rights granted under this License are granted for the term of
145
+ copyright on the Program, and are irrevocable provided the stated
146
+ conditions are met. This License explicitly affirms your unlimited
147
+ permission to run the unmodified Program. The output from running a
148
+ covered work is covered by this License only if the output, given its
149
+ content, constitutes a covered work. This License acknowledges your
150
+ rights of fair use or other equivalent, as provided by copyright law.
151
+
152
+ You may make, run and propagate covered works that you do not
153
+ convey, without conditions so long as your license otherwise remains
154
+ in force. You may convey covered works to others for the sole purpose
155
+ of having them make modifications exclusively for you, or provide you
156
+ with facilities for running those works, provided that you comply with
157
+ the terms of this License in conveying all material for which you do
158
+ not control copyright. Those thus making or running the covered works
159
+ for you must do so exclusively on your behalf, under your direction
160
+ and control, on terms that prohibit them from making any copies of
161
+ your copyrighted material outside their relationship with you.
162
+
163
+ Conveying under any other circumstances is permitted solely under
164
+ the conditions stated below. Sublicensing is not allowed; section 10
165
+ makes it unnecessary.
166
+
167
+ 3. Protecting Users' Legal Rights From Anti-Circumvention Law.
168
+
169
+ No covered work shall be deemed part of an effective technological
170
+ measure under any applicable law fulfilling obligations under article
171
+ 11 of the WIPO copyright treaty adopted on 20 December 1996, or
172
+ similar laws prohibiting or restricting circumvention of such
173
+ measures.
174
+
175
+ When you convey a covered work, you waive any legal power to forbid
176
+ circumvention of technological measures to the extent such circumvention
177
+ is effected by exercising rights under this License with respect to
178
+ the covered work, and you disclaim any intention to limit operation or
179
+ modification of the work as a means of enforcing, against the work's
180
+ users, your or third parties' legal rights to forbid circumvention of
181
+ technological measures.
182
+
183
+ 4. Conveying Verbatim Copies.
184
+
185
+ You may convey verbatim copies of the Program's source code as you
186
+ receive it, in any medium, provided that you conspicuously and
187
+ appropriately publish on each copy an appropriate copyright notice;
188
+ keep intact all notices stating that this License and any
189
+ non-permissive terms added in accord with section 7 apply to the code;
190
+ keep intact all notices of the absence of any warranty; and give all
191
+ recipients a copy of this License along with the Program.
192
+
193
+ You may charge any price or no price for each copy that you convey,
194
+ and you may offer support or warranty protection for a fee.
195
+
196
+ 5. Conveying Modified Source Versions.
197
+
198
+ You may convey a work based on the Program, or the modifications to
199
+ produce it from the Program, in the form of source code under the
200
+ terms of section 4, provided that you also meet all of these conditions:
201
+
202
+ a) The work must carry prominent notices stating that you modified
203
+ it, and giving a relevant date.
204
+
205
+ b) The work must carry prominent notices stating that it is
206
+ released under this License and any conditions added under section
207
+ 7. This requirement modifies the requirement in section 4 to
208
+ "keep intact all notices".
209
+
210
+ c) You must license the entire work, as a whole, under this
211
+ License to anyone who comes into possession of a copy. This
212
+ License will therefore apply, along with any applicable section 7
213
+ additional terms, to the whole of the work, and all its parts,
214
+ regardless of how they are packaged. This License gives no
215
+ permission to license the work in any other way, but it does not
216
+ invalidate such permission if you have separately received it.
217
+
218
+ d) If the work has interactive user interfaces, each must display
219
+ Appropriate Legal Notices; however, if the Program has interactive
220
+ interfaces that do not display Appropriate Legal Notices, your
221
+ work need not make them do so.
222
+
223
+ A compilation of a covered work with other separate and independent
224
+ works, which are not by their nature extensions of the covered work,
225
+ and which are not combined with it such as to form a larger program,
226
+ in or on a volume of a storage or distribution medium, is called an
227
+ "aggregate" if the compilation and its resulting copyright are not
228
+ used to limit the access or legal rights of the compilation's users
229
+ beyond what the individual works permit. Inclusion of a covered work
230
+ in an aggregate does not cause this License to apply to the other
231
+ parts of the aggregate.
232
+
233
+ 6. Conveying Non-Source Forms.
234
+
235
+ You may convey a covered work in object code form under the terms
236
+ of sections 4 and 5, provided that you also convey the
237
+ machine-readable Corresponding Source under the terms of this License,
238
+ in one of these ways:
239
+
240
+ a) Convey the object code in, or embodied in, a physical product
241
+ (including a physical distribution medium), accompanied by the
242
+ Corresponding Source fixed on a durable physical medium
243
+ customarily used for software interchange.
244
+
245
+ b) Convey the object code in, or embodied in, a physical product
246
+ (including a physical distribution medium), accompanied by a
247
+ written offer, valid for at least three years and valid for as
248
+ long as you offer spare parts or customer support for that product
249
+ model, to give anyone who possesses the object code either (1) a
250
+ copy of the Corresponding Source for all the software in the
251
+ product that is covered by this License, on a durable physical
252
+ medium customarily used for software interchange, for a price no
253
+ more than your reasonable cost of physically performing this
254
+ conveying of source, or (2) access to copy the
255
+ Corresponding Source from a network server at no charge.
256
+
257
+ c) Convey individual copies of the object code with a copy of the
258
+ written offer to provide the Corresponding Source. This
259
+ alternative is allowed only occasionally and noncommercially, and
260
+ only if you received the object code with such an offer, in accord
261
+ with subsection 6b.
262
+
263
+ d) Convey the object code by offering access from a designated
264
+ place (gratis or for a charge), and offer equivalent access to the
265
+ Corresponding Source in the same way through the same place at no
266
+ further charge. You need not require recipients to copy the
267
+ Corresponding Source along with the object code. If the place to
268
+ copy the object code is a network server, the Corresponding Source
269
+ may be on a different server (operated by you or a third party)
270
+ that supports equivalent copying facilities, provided you maintain
271
+ clear directions next to the object code saying where to find the
272
+ Corresponding Source. Regardless of what server hosts the
273
+ Corresponding Source, you remain obligated to ensure that it is
274
+ available for as long as needed to satisfy these requirements.
275
+
276
+ e) Convey the object code using peer-to-peer transmission, provided
277
+ you inform other peers where the object code and Corresponding
278
+ Source of the work are being offered to the general public at no
279
+ charge under subsection 6d.
280
+
281
+ A separable portion of the object code, whose source code is excluded
282
+ from the Corresponding Source as a System Library, need not be
283
+ included in conveying the object code work.
284
+
285
+ A "User Product" is either (1) a "consumer product", which means any
286
+ tangible personal property which is normally used for personal, family,
287
+ or household purposes, or (2) anything designed or sold for incorporation
288
+ into a dwelling. In determining whether a product is a consumer product,
289
+ doubtful cases shall be resolved in favor of coverage. For a particular
290
+ product received by a particular user, "normally used" refers to a
291
+ typical or common use of that class of product, regardless of the status
292
+ of the particular user or of the way in which the particular user
293
+ actually uses, or expects or is expected to use, the product. A product
294
+ is a consumer product regardless of whether the product has substantial
295
+ commercial, industrial or non-consumer uses, unless such uses represent
296
+ the only significant mode of use of the product.
297
+
298
+ "Installation Information" for a User Product means any methods,
299
+ procedures, authorization keys, or other information required to install
300
+ and execute modified versions of a covered work in that User Product from
301
+ a modified version of its Corresponding Source. The information must
302
+ suffice to ensure that the continued functioning of the modified object
303
+ code is in no case prevented or interfered with solely because
304
+ modification has been made.
305
+
306
+ If you convey an object code work under this section in, or with, or
307
+ specifically for use in, a User Product, and the conveying occurs as
308
+ part of a transaction in which the right of possession and use of the
309
+ User Product is transferred to the recipient in perpetuity or for a
310
+ fixed term (regardless of how the transaction is characterized), the
311
+ Corresponding Source conveyed under this section must be accompanied
312
+ by the Installation Information. But this requirement does not apply
313
+ if neither you nor any third party retains the ability to install
314
+ modified object code on the User Product (for example, the work has
315
+ been installed in ROM).
316
+
317
+ The requirement to provide Installation Information does not include a
318
+ requirement to continue to provide support service, warranty, or updates
319
+ for a work that has been modified or installed by the recipient, or for
320
+ the User Product in which it has been modified or installed. Access to a
321
+ network may be denied when the modification itself materially and
322
+ adversely affects the operation of the network or violates the rules and
323
+ protocols for communication across the network.
324
+
325
+ Corresponding Source conveyed, and Installation Information provided,
326
+ in accord with this section must be in a format that is publicly
327
+ documented (and with an implementation available to the public in
328
+ source code form), and must require no special password or key for
329
+ unpacking, reading or copying.
330
+
331
+ 7. Additional Terms.
332
+
333
+ "Additional permissions" are terms that supplement the terms of this
334
+ License by making exceptions from one or more of its conditions.
335
+ Additional permissions that are applicable to the entire Program shall
336
+ be treated as though they were included in this License, to the extent
337
+ that they are valid under applicable law. If additional permissions
338
+ apply only to part of the Program, that part may be used separately
339
+ under those permissions, but the entire Program remains governed by
340
+ this License without regard to the additional permissions.
341
+
342
+ When you convey a copy of a covered work, you may at your option
343
+ remove any additional permissions from that copy, or from any part of
344
+ it. (Additional permissions may be written to require their own
345
+ removal in certain cases when you modify the work.) You may place
346
+ additional permissions on material, added by you to a covered work,
347
+ for which you have or can give appropriate copyright permission.
348
+
349
+ Notwithstanding any other provision of this License, for material you
350
+ add to a covered work, you may (if authorized by the copyright holders of
351
+ that material) supplement the terms of this License with terms:
352
+
353
+ a) Disclaiming warranty or limiting liability differently from the
354
+ terms of sections 15 and 16 of this License; or
355
+
356
+ b) Requiring preservation of specified reasonable legal notices or
357
+ author attributions in that material or in the Appropriate Legal
358
+ Notices displayed by works containing it; or
359
+
360
+ c) Prohibiting misrepresentation of the origin of that material, or
361
+ requiring that modified versions of such material be marked in
362
+ reasonable ways as different from the original version; or
363
+
364
+ d) Limiting the use for publicity purposes of names of licensors or
365
+ authors of the material; or
366
+
367
+ e) Declining to grant rights under trademark law for use of some
368
+ trade names, trademarks, or service marks; or
369
+
370
+ f) Requiring indemnification of licensors and authors of that
371
+ material by anyone who conveys the material (or modified versions of
372
+ it) with contractual assumptions of liability to the recipient, for
373
+ any liability that these contractual assumptions directly impose on
374
+ those licensors and authors.
375
+
376
+ All other non-permissive additional terms are considered "further
377
+ restrictions" within the meaning of section 10. If the Program as you
378
+ received it, or any part of it, contains a notice stating that it is
379
+ governed by this License along with a term that is a further
380
+ restriction, you may remove that term. If a license document contains
381
+ a further restriction but permits relicensing or conveying under this
382
+ License, you may add to a covered work material governed by the terms
383
+ of that license document, provided that the further restriction does
384
+ not survive such relicensing or conveying.
385
+
386
+ If you add terms to a covered work in accord with this section, you
387
+ must place, in the relevant source files, a statement of the
388
+ additional terms that apply to those files, or a notice indicating
389
+ where to find the applicable terms.
390
+
391
+ Additional terms, permissive or non-permissive, may be stated in the
392
+ form of a separately written license, or stated as exceptions;
393
+ the above requirements apply either way.
394
+
395
+ 8. Termination.
396
+
397
+ You may not propagate or modify a covered work except as expressly
398
+ provided under this License. Any attempt otherwise to propagate or
399
+ modify it is void, and will automatically terminate your rights under
400
+ this License (including any patent licenses granted under the third
401
+ paragraph of section 11).
402
+
403
+ However, if you cease all violation of this License, then your
404
+ license from a particular copyright holder is reinstated (a)
405
+ provisionally, unless and until the copyright holder explicitly and
406
+ finally terminates your license, and (b) permanently, if the copyright
407
+ holder fails to notify you of the violation by some reasonable means
408
+ prior to 60 days after the cessation.
409
+
410
+ Moreover, your license from a particular copyright holder is
411
+ reinstated permanently if the copyright holder notifies you of the
412
+ violation by some reasonable means, this is the first time you have
413
+ received notice of violation of this License (for any work) from that
414
+ copyright holder, and you cure the violation prior to 30 days after
415
+ your receipt of the notice.
416
+
417
+ Termination of your rights under this section does not terminate the
418
+ licenses of parties who have received copies or rights from you under
419
+ this License. If your rights have been terminated and not permanently
420
+ reinstated, you do not qualify to receive new licenses for the same
421
+ material under section 10.
422
+
423
+ 9. Acceptance Not Required for Having Copies.
424
+
425
+ You are not required to accept this License in order to receive or
426
+ run a copy of the Program. Ancillary propagation of a covered work
427
+ occurring solely as a consequence of using peer-to-peer transmission
428
+ to receive a copy likewise does not require acceptance. However,
429
+ nothing other than this License grants you permission to propagate or
430
+ modify any covered work. These actions infringe copyright if you do
431
+ not accept this License. Therefore, by modifying or propagating a
432
+ covered work, you indicate your acceptance of this License to do so.
433
+
434
+ 10. Automatic Licensing of Downstream Recipients.
435
+
436
+ Each time you convey a covered work, the recipient automatically
437
+ receives a license from the original licensors, to run, modify and
438
+ propagate that work, subject to this License. You are not responsible
439
+ for enforcing compliance by third parties with this License.
440
+
441
+ An "entity transaction" is a transaction transferring control of an
442
+ organization, or substantially all assets of one, or subdividing an
443
+ organization, or merging organizations. If propagation of a covered
444
+ work results from an entity transaction, each party to that
445
+ transaction who receives a copy of the work also receives whatever
446
+ licenses to the work the party's predecessor in interest had or could
447
+ give under the previous paragraph, plus a right to possession of the
448
+ Corresponding Source of the work from the predecessor in interest, if
449
+ the predecessor has it or can get it with reasonable efforts.
450
+
451
+ You may not impose any further restrictions on the exercise of the
452
+ rights granted or affirmed under this License. For example, you may
453
+ not impose a license fee, royalty, or other charge for exercise of
454
+ rights granted under this License, and you may not initiate litigation
455
+ (including a cross-claim or counterclaim in a lawsuit) alleging that
456
+ any patent claim is infringed by making, using, selling, offering for
457
+ sale, or importing the Program or any portion of it.
458
+
459
+ 11. Patents.
460
+
461
+ A "contributor" is a copyright holder who authorizes use under this
462
+ License of the Program or a work on which the Program is based. The
463
+ work thus licensed is called the contributor's "contributor version".
464
+
465
+ A contributor's "essential patent claims" are all patent claims
466
+ owned or controlled by the contributor, whether already acquired or
467
+ hereafter acquired, that would be infringed by some manner, permitted
468
+ by this License, of making, using, or selling its contributor version,
469
+ but do not include claims that would be infringed only as a
470
+ consequence of further modification of the contributor version. For
471
+ purposes of this definition, "control" includes the right to grant
472
+ patent sublicenses in a manner consistent with the requirements of
473
+ this License.
474
+
475
+ Each contributor grants you a non-exclusive, worldwide, royalty-free
476
+ patent license under the contributor's essential patent claims, to
477
+ make, use, sell, offer for sale, import and otherwise run, modify and
478
+ propagate the contents of its contributor version.
479
+
480
+ In the following three paragraphs, a "patent license" is any express
481
+ agreement or commitment, however denominated, not to enforce a patent
482
+ (such as an express permission to practice a patent or covenant not to
483
+ sue for patent infringement). To "grant" such a patent license to a
484
+ party means to make such an agreement or commitment not to enforce a
485
+ patent against the party.
486
+
487
+ If you convey a covered work, knowingly relying on a patent license,
488
+ and the Corresponding Source of the work is not available for anyone
489
+ to copy, free of charge and under the terms of this License, through a
490
+ publicly available network server or other readily accessible means,
491
+ then you must either (1) cause the Corresponding Source to be so
492
+ available, or (2) arrange to deprive yourself of the benefit of the
493
+ patent license for this particular work, or (3) arrange, in a manner
494
+ consistent with the requirements of this License, to extend the patent
495
+ license to downstream recipients. "Knowingly relying" means you have
496
+ actual knowledge that, but for the patent license, your conveying the
497
+ covered work in a country, or your recipient's use of the covered work
498
+ in a country, would infringe one or more identifiable patents in that
499
+ country that you have reason to believe are valid.
500
+
501
+ If, pursuant to or in connection with a single transaction or
502
+ arrangement, you convey, or propagate by procuring conveyance of, a
503
+ covered work, and grant a patent license to some of the parties
504
+ receiving the covered work authorizing them to use, propagate, modify
505
+ or convey a specific copy of the covered work, then the patent license
506
+ you grant is automatically extended to all recipients of the covered
507
+ work and works based on it.
508
+
509
+ A patent license is "discriminatory" if it does not include within
510
+ the scope of its coverage, prohibits the exercise of, or is
511
+ conditioned on the non-exercise of one or more of the rights that are
512
+ specifically granted under this License. You may not convey a covered
513
+ work if you are a party to an arrangement with a third party that is
514
+ in the business of distributing software, under which you make payment
515
+ to the third party based on the extent of your activity of conveying
516
+ the work, and under which the third party grants, to any of the
517
+ parties who would receive the covered work from you, a discriminatory
518
+ patent license (a) in connection with copies of the covered work
519
+ conveyed by you (or copies made from those copies), or (b) primarily
520
+ for and in connection with specific products or compilations that
521
+ contain the covered work, unless you entered into that arrangement,
522
+ or that patent license was granted, prior to 28 March 2007.
523
+
524
+ Nothing in this License shall be construed as excluding or limiting
525
+ any implied license or other defenses to infringement that may
526
+ otherwise be available to you under applicable patent law.
527
+
528
+ 12. No Surrender of Others' Freedom.
529
+
530
+ If conditions are imposed on you (whether by court order, agreement or
531
+ otherwise) that contradict the conditions of this License, they do not
532
+ excuse you from the conditions of this License. If you cannot convey a
533
+ covered work so as to satisfy simultaneously your obligations under this
534
+ License and any other pertinent obligations, then as a consequence you may
535
+ not convey it at all. For example, if you agree to terms that obligate you
536
+ to collect a royalty for further conveying from those to whom you convey
537
+ the Program, the only way you could satisfy both those terms and this
538
+ License would be to refrain entirely from conveying the Program.
539
+
540
+ 13. Remote Network Interaction; Use with the GNU General Public License.
541
+
542
+ Notwithstanding any other provision of this License, if you modify the
543
+ Program, your modified version must prominently offer all users
544
+ interacting with it remotely through a computer network (if your version
545
+ supports such interaction) an opportunity to receive the Corresponding
546
+ Source of your version by providing access to the Corresponding Source
547
+ from a network server at no charge, through some standard or customary
548
+ means of facilitating copying of software. This Corresponding Source
549
+ shall include the Corresponding Source for any work covered by version 3
550
+ of the GNU General Public License that is incorporated pursuant to the
551
+ following paragraph.
552
+
553
+ Notwithstanding any other provision of this License, you have
554
+ permission to link or combine any covered work with a work licensed
555
+ under version 3 of the GNU General Public License into a single
556
+ combined work, and to convey the resulting work. The terms of this
557
+ License will continue to apply to the part which is the covered work,
558
+ but the work with which it is combined will remain governed by version
559
+ 3 of the GNU General Public License.
560
+
561
+ 14. Revised Versions of this License.
562
+
563
+ The Free Software Foundation may publish revised and/or new versions of
564
+ the GNU Affero General Public License from time to time. Such new versions
565
+ will be similar in spirit to the present version, but may differ in detail to
566
+ address new problems or concerns.
567
+
568
+ Each version is given a distinguishing version number. If the
569
+ Program specifies that a certain numbered version of the GNU Affero General
570
+ Public License "or any later version" applies to it, you have the
571
+ option of following the terms and conditions either of that numbered
572
+ version or of any later version published by the Free Software
573
+ Foundation. If the Program does not specify a version number of the
574
+ GNU Affero General Public License, you may choose any version ever published
575
+ by the Free Software Foundation.
576
+
577
+ If the Program specifies that a proxy can decide which future
578
+ versions of the GNU Affero General Public License can be used, that proxy's
579
+ public statement of acceptance of a version permanently authorizes you
580
+ to choose that version for the Program.
581
+
582
+ Later license versions may give you additional or different
583
+ permissions. However, no additional obligations are imposed on any
584
+ author or copyright holder as a result of your choosing to follow a
585
+ later version.
586
+
587
+ 15. Disclaimer of Warranty.
588
+
589
+ THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
590
+ APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
591
+ HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
592
+ OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
593
+ THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
594
+ PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
595
+ IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
596
+ ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
597
+
598
+ 16. Limitation of Liability.
599
+
600
+ IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
601
+ WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
602
+ THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
603
+ GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
604
+ USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
605
+ DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
606
+ PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
607
+ EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
608
+ SUCH DAMAGES.
609
+
610
+ 17. Interpretation of Sections 15 and 16.
611
+
612
+ If the disclaimer of warranty and limitation of liability provided
613
+ above cannot be given local legal effect according to their terms,
614
+ reviewing courts shall apply local law that most closely approximates
615
+ an absolute waiver of all civil liability in connection with the
616
+ Program, unless a warranty or assumption of liability accompanies a
617
+ copy of the Program in return for a fee.
618
+
619
+ END OF TERMS AND CONDITIONS
620
+
621
+ How to Apply These Terms to Your New Programs
622
+
623
+ If you develop a new program, and you want it to be of the greatest
624
+ possible use to the public, the best way to achieve this is to make it
625
+ free software which everyone can redistribute and change under these terms.
626
+
627
+ To do so, attach the following notices to the program. It is safest
628
+ to attach them to the start of each source file to most effectively
629
+ state the exclusion of warranty; and each file should have at least
630
+ the "copyright" line and a pointer to where the full notice is found.
631
+
632
+ <one line to give the program's name and a brief idea of what it does.>
633
+ Copyright (C) <year> <name of author>
634
+
635
+ This program is free software: you can redistribute it and/or modify
636
+ it under the terms of the GNU Affero General Public License as published by
637
+ the Free Software Foundation, either version 3 of the License, or
638
+ (at your option) any later version.
639
+
640
+ This program is distributed in the hope that it will be useful,
641
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
642
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
643
+ GNU Affero General Public License for more details.
644
+
645
+ You should have received a copy of the GNU Affero General Public License
646
+ along with this program. If not, see <https://www.gnu.org/licenses/>.
647
+
648
+ Also add information on how to contact you by electronic and paper mail.
649
+
650
+ If your software can interact with users remotely through a computer
651
+ network, you should also make sure that it provides a way for users to
652
+ get its source. For example, if your program is a web application, its
653
+ interface could display a "Source" link that leads users to an archive
654
+ of the code. There are many ways you could offer source, and different
655
+ solutions will be better for different programs; see section 13 for the
656
+ specific requirements.
657
+
658
+ You should also get your employer (if you work as a programmer) or school,
659
+ if any, to sign a "copyright disclaimer" for the program, if necessary.
660
+ For more information on this, and how to apply and follow the GNU AGPL, see
661
+ <https://www.gnu.org/licenses/>.
Makefile ADDED
@@ -0,0 +1,26 @@
+ # As a workaround to [1], we will use a makefile instead
+ # [1]: https://github.com/python-poetry/poetry/issues/241
+
+ .PHONY: install dev test lint xmeta-demo tocgen-demo publish
+
+ test: # run tests
+ 	@poetry run mamba --format=documentation ./spec
+ 	@poetry run ./spec/cli_spec.sh
+
+ lint: # run lint
+ 	@poetry run pylint ./spec ./pdfxmeta ./pdftocgen ./fitzutils ./pdftocio
+
+ xmeta-demo: # a demo of pdfxmeta
+ 	@poetry run pdfxmeta ./spec/files/level2.pdf "Section"
+
+ tocgen-demo: # a demo of tocgen
+ 	@poetry run pdftocgen ./spec/files/level2.pdf < ./recipes/default_latex.toml
+
+ install: # set up non-dev dependencies
+ 	poetry install --no-dev
+
+ dev: # set up dev dependencies
+ 	poetry install
+
+ publish: test # publish package to pypi
+ 	poetry publish --build
QUICK_START.md ADDED
@@ -0,0 +1,62 @@
+ # PDF ToC Generation Quick Start
+
+ ### Optional: Run as App
+ ```bash
+ streamlit run app.py
+ ```
+ This will open a local web page where you can upload a PDF, analyze fonts, and generate bookmarks with one click.
+
+ ### Find Header Candidates
+ If you don't know the font size/name of your chapters, this lists the top 25 largest text elements.
+ ```bash
+ python utils/list_longest_fonts.py <input.pdf>
+ ```
+ *Output: Font Name, Size, Physical Page, Logical Page Label.*
+
+ ### Find Header by Context
+ If you know a specific string (e.g., the first sentence of a chapter) but can't find the header itself, this finds the element *immediately preceding* that string.
+ ```bash
+ python utils/find_preceding.py <input.pdf> "known text string"
+ ```
+
+ ### Debug Text Artifacts
+ If your bookmarks contain unexpected characters (e.g., `??`), use this to see the raw byte codes (look for soft hyphens `\xad`, non-breaking spaces `\xa0`, etc.).
+ ```bash
+ python utils/inspect_bytes.py <input.pdf> "Problematic String"
+ ```
+
+ ---
+
+ ## Recipe Generation (pdfxmeta)
+ Once you have identified the visual style of your headers (e.g., "Caslon 54pt"), you can inspect specific text or automatically generate recipe entries using `pdfxmeta`.
+
+ ### Inspect Font Details
+ To get the exact font name and size of a specific string (e.g., "Chapter 1"):
+ ```bash
+ pdfxmeta input.pdf "Chapter 1"
+ ```
+ *Output will show `font.name`, `font.size`, etc.*
+
+ ### Auto-Generate Recipe Entry
+ To append a valid TOML filter directly to your recipe file (level 1 header):
+ ```bash
+ pdfxmeta -a 1 input.pdf "Chapter 1" >> recipe.toml
+ ```
+
+ ---
+
+ ## The Pipeline
+ Run the full extraction and generation pipeline.
+
+ ### Middleware: `modify_toc.py`
+ We use a custom Python script to:
+ 1. **Sanitize Text**: Removes soft hyphens (`\xad`) and cleans encodings.
+ 2. **Format Labels**: Renames bookmarks to `001_Title_pgX`.
+ 3. **Fix Encoding**: Forces UTF-8 handling to prevent pipe corruption.
+
+ ### The Command
+ On Windows, **Git Bash** is recommended to avoid PowerShell encoding issues when piping.
+
+ ```bash
+ pdftocgen -r recipe.toml input.pdf | python utils/modify_toc.py | pdftocio -o output.pdf input.pdf
+ ```
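The three middleware steps listed in QUICK_START.md (sanitize, relabel, force UTF-8) could be sketched roughly as follows. This is not the repository's `utils/modify_toc.py`, whose contents are not shown in this view. The names `sanitize`, `relabel`, and `main` are hypothetical, the ToC line shape (`"Title" page`, indented by level) is taken from the README example output, and replacing spaces with underscores in the `001_Title_pgX` label is an assumption.

```python
import re
import sys

# Assumed ToC line shape, based on the README example: optional indentation,
# a quoted title, a page number, and possibly trailing fields.
TOC_LINE = re.compile(r'^(?P<indent>\s*)"(?P<title>.*)"\s+(?P<page>\d+)(?P<rest>.*)$')


def sanitize(text: str) -> str:
    """Drop soft hyphens and turn non-breaking spaces into plain spaces."""
    return text.replace("\xad", "").replace("\xa0", " ").strip()


def relabel(lines):
    """Rewrite each ToC entry title as NNN_Title_pgX (hypothetical format details)."""
    out = []
    counter = 0
    for line in lines:
        line = line.rstrip("\n")
        m = TOC_LINE.match(line)
        if m is None:
            # Leave anything that does not look like a ToC entry untouched.
            out.append(line)
            continue
        counter += 1
        title = sanitize(m["title"]).replace(" ", "_")
        out.append(
            f'{m["indent"]}"{counter:03d}_{title}_pg{m["page"]}" {m["page"]}{m["rest"]}'
        )
    return out


def main():
    # Force UTF-8 on both ends of the pipe, whatever the console encoding is.
    sys.stdin.reconfigure(encoding="utf-8")
    sys.stdout.reconfigure(encoding="utf-8")
    for line in relabel(sys.stdin):
        print(line)
```

To use a script like this as a filter between `pdftocgen` and `pdftocio`, it would call `main()` at the bottom of the file. Keeping `relabel` as a pure function makes the renaming easy to check without a PDF.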
README ADDED
@@ -0,0 +1,214 @@
1
+ pdf.tocgen
2
+ ==========
3
+
4
+                          in.pdf
+                             |
+                             |
+      +----------------------+--------------------+
+      |                      |                    |
+      V                      V                    V
+ +----------+          +-----------+         +----------+
+ |          |  recipe  |           |   ToC   |          |
+ | pdfxmeta +--------->| pdftocgen +-------->| pdftocio +---> out.pdf
+ |          |          |           |         |          |
+ +----------+          +-----------+         +----------+
15
+
16
+ pdf.tocgen is a set of command-line tools for automatically
17
+ extracting and generating the table of contents (ToC) of a
18
+ PDF file. It uses the embedded font attributes and position
19
+ of headings to deduce the basic outline of a PDF file.
20
+
21
+ It works best for PDF files produced from a TeX document
22
+ using pdftex (and its friends pdflatex, pdfxetex, etc.), but
23
+ it's designed to work with any *software-generated* PDF
24
+ files (i.e. you shouldn't expect it to work with scanned
25
+ PDFs). Some examples include troff/groff, Adobe InDesign,
26
+ Microsoft Word, and probably more.
27
+
28
+ Please see the homepage [1] for a detailed introduction.
29
+
30
+ Installation
31
+ ------------
32
+
33
+ pdf.tocgen is written in Python 3. It is known to work with
34
+ Python 3.7 to 3.11 on Linux, Windows, and macOS (On BSDs,
35
+ you probably need to build PyMuPDF yourself). Use
36
+
37
+ $ pip install -U pdf.tocgen
38
+
39
+ to install the latest version systemwide. Alternatively, use
40
+ `pipx` or
41
+
42
+ $ pip install -U --user pdf.tocgen
43
+
44
+ to install it for the current user. I would recommend the
45
+ latter approach to avoid messing up the package manager on
46
+ your system.
47
+
48
+ If you are using an Arch-based Linux distro, the package is
49
+ also available on AUR [8]. It can be installed using any AUR
50
+ helper, for example yay:
51
+
52
+ $ yay -S pdf.tocgen
53
+
54
+ Workflow
55
+ --------
56
+
57
+ The design of pdf.tocgen is influenced by the Unix philosophy [2].
58
+ I intentionally separated pdf.tocgen into three separate programs.
59
+ They work together, but each of them is useful on its own.
60
+
61
+ 1. pdfxmeta: extract the metadata (font attributes, positions)
62
+ of headings to build a *recipe* file.
63
+ 2. pdftocgen: generate a table of contents from the recipe.
64
+ 3. pdftocio: import the table of contents to the PDF document.
65
+
66
+ You should read the example [3] on the homepage for a proper
67
+ introduction, but the basic workflow goes like this.
68
+
69
+ First, use pdfxmeta to search for the metadata of headings,
70
+ and generate *heading filters* using the automatic setting
71
+
72
+ $ pdfxmeta -p page -a 1 in.pdf "Section" >> recipe.toml
73
+ $ pdfxmeta -p page -a 2 in.pdf "Subsection" >> recipe.toml
74
+
75
+ Note that `page` needs to be replaced by the page number of
76
+ the search keyword.
77
+
78
+ The output `recipe.toml` file would contain several heading
79
+ filters, each of which specifies the attributes that a heading
80
+ at a particular level should have.
81
+
82
+ An example recipe file would look like this:
83
+
84
+ [[heading]]
85
+ level = 1
86
+ greedy = true
87
+ font.name = "Times-Bold"
88
+ font.size = 19.92530059814453
89
+
90
+ [[heading]]
91
+ level = 2
92
+ greedy = true
93
+ font.name = "Times-Bold"
94
+ font.size = 11.9552001953125
95
+
96
+ Then pass the recipe to `pdftocgen` to generate a table of
97
+ contents,
98
+
99
+ $ pdftocgen in.pdf < recipe.toml
100
+ "Preface" 5
101
+ "Bottom-up Design" 5
102
+ "Plan of the Book" 7
103
+ "Examples" 9
104
+ "Acknowledgements" 9
105
+ "Contents" 11
106
+ "The Extensible Language" 14
107
+ "1.1 Design by Evolution" 14
108
+ "1.2 Programming Bottom-Up" 16
109
+ "1.3 Extensible Software" 18
110
+ "1.4 Extending Lisp" 19
111
+ "1.5 Why Lisp (or When)" 21
112
+ "Functions" 22
113
+ "2.1 Functions as Data" 22
114
+ "2.2 Defining Functions" 23
115
+ "2.3 Functional Arguments" 26
116
+ "2.4 Functions as Properties" 28
117
+ "2.5 Scope" 29
118
+ "2.6 Closures" 30
119
+ "2.7 Local Functions" 34
120
+ "2.8 Tail-Recursion" 35
121
+ "2.9 Compilation" 37
122
+ "2.10 Functions from Lists" 40
123
+ "Functional Programming" 41
124
+ "3.1 Functional Design" 41
125
+ "3.2 Imperative Outside-In" 46
126
+ "3.3 Functional Interfaces" 48
127
+ "3.4 Interactive Programming" 50
128
+ [--snip--]
129
+
130
+ which can be directly imported to the PDF file using
131
+ `pdftocio`,
132
+
133
+ $ pdftocgen in.pdf < recipe.toml | pdftocio -o out.pdf in.pdf
134
+
135
+ Or if you want to edit the table of contents before
136
+ importing it,
137
+
138
+ $ pdftocgen in.pdf < recipe.toml > toc
139
+ $ vim toc # edit
140
+ $ pdftocio in.pdf < toc
141
+
142
+ Each of the three programs has some extra functionalities.
143
+ Use the -h option to see all the options you could pass in.
144
+
145
+ Development
146
+ -----------
147
+
148
+ If you want to modify the source code or contribute anything,
149
+ first install poetry [4], which is a dependency and package
150
+ manager for Python used by pdf.tocgen. Then run
151
+
152
+ $ poetry install
153
+
154
+ in the root directory of this repository to set up
155
+ development dependencies.
156
+
157
+ If you want to test the development version of pdf.tocgen,
158
+ use the `poetry run` command:
159
+
160
+ $ poetry run pdfxmeta in.pdf "pattern"
161
+
162
+ Alternatively, you could also use the
163
+
164
+ $ poetry shell
165
+
166
+ command to open up a virtual environment and run the
167
+ development version directly:
168
+
169
+ (pdf.tocgen) $ pdfxmeta in.pdf "pattern"
170
+
171
+ Before you send a patch or pull request, make sure the unit
172
+ test passes by running:
173
+
174
+ $ make test
175
+
176
+ GUI front end
177
+ -------------
178
+
179
+ If you are an Emacs user, you could install Daniel Nicolai's
180
+ toc-mode [9] package as a GUI front end for pdf.tocgen,
181
+ though it offers many more functionalities, such as
182
+ extracting (printed) table of contents from a PDF file. Note
183
+ that it uses pdf.tocgen under the hood, so you still need to
184
+ install pdf.tocgen before using toc-mode as a front end for
185
+ pdf.tocgen.
186
+
187
+ License
188
+ -------
189
+
190
+ pdf.tocgen itself is free software. The source code of
191
+ pdf.tocgen is licensed under the GNU GPLv3 license. However,
192
+ the recipes in the `recipes` directory are separately
193
+ licensed under the CC BY-NC-SA 4.0 License [7] to prevent
194
+ any commercial usage, and are thus not included in the
195
+ distribution.
196
+
197
+ pdf.tocgen is based on PyMuPDF [5], licensed under the GNU
198
+ GPLv3 license, which is again based on MuPDF [6], licensed
199
+ under the GNU AGPLv3 license. A copy of the AGPLv3 license
200
+ is included in the repository.
201
+
202
+ If you want to make any derivatives based on this project,
203
+ please follow the terms of the GNU GPLv3 license.
204
+
205
+
206
+ [1]: https://krasjet.com/voice/pdf.tocgen/
207
+ [2]: https://en.wikipedia.org/wiki/Unix_philosophy
208
+ [3]: https://krasjet.com/voice/pdf.tocgen/#a-worked-example
209
+ [4]: https://python-poetry.org/
210
+ [5]: https://github.com/pymupdf/PyMuPDF
211
+ [6]: https://mupdf.com/docs/index.html
212
+ [7]: https://creativecommons.org/licenses/by-nc-sa/4.0/
213
+ [8]: https://aur.archlinux.org/packages/pdf.tocgen/
214
+ [9]: https://github.com/dalanicolai/toc-mode
README.md CHANGED
@@ -1,20 +1,30 @@
1
- ---
2
- title: Pdf.tocgen.split
3
- emoji: 🚀
4
- colorFrom: red
5
- colorTo: red
6
- sdk: docker
7
- app_port: 8501
8
- tags:
9
- - streamlit
10
- pinned: false
11
- short_description: Split PDF by headings based on Krasjet pdf.tocgen
12
- license: gpl-2.0
13
- ---
14
-
15
- # Welcome to Streamlit!
16
-
17
- Edit `/src/streamlit_app.py` to customize this app to your heart's desire. :heart:
18
-
19
- If you have any questions, checkout our [documentation](https://docs.streamlit.io) and [community
20
- forums](https://discuss.streamlit.io).
1
+ ---
2
+ title: PDF TOC Generator Split
3
+ emoji: 📑
4
+ colorFrom: blue
5
+ colorTo: indigo
6
+ sdk: streamlit
7
+ sdk_version: 1.41.1
8
+ app_file: app.py
9
+ pinned: false
10
+ license: agpl-3.0
11
+ short_description: Generate PDF Table of Contents and Split Chapters
12
+ ---
13
+
14
+ # PDF Table of Contents Generator (Split Edition)
15
+
16
+ Based on [pdf.tocgen](https://github.com/Krasjet/pdf.tocgen).
17
+
18
+ ## Features
19
+ - **Analyze Fonts**: Automatically detect chapter headers by font size and style.
20
+ - **Search**: Find headers by text search (Case Sensitive option available).
21
+ - **Generate TOC**: Create a clickable PDF bookmark outline.
22
+ - **Split Chapters**: Export each chapter as a separate PDF in a ZIP file.
23
+ - **Front/Back Matter**: Automatically handle un-numbered front matter and user-defined back matter (Index, Glossary).
24
+
25
+ ## Usage
26
+ 1. Upload a PDF.
27
+ 2. Use "Scan & Generate" to find headers.
28
+ 3. Configure the "Back Matter" start page if needed.
29
+ 4. Run Pipeline.
30
+ 5. Download the Bookmarked PDF or the Zipped Chapter Splits.
app.py ADDED
@@ -0,0 +1,381 @@
1
+ import streamlit as st
2
+ import pandas as pd
3
+ import fitz # PyMuPDF
4
+ import os
5
+ import subprocess
6
+ import tempfile
7
+ import sys
8
+ import toml
9
+ import shutil
10
+ import zipfile
11
+ import io
12
+
13
+ # Ensure we can import from utils if needed
14
+ sys.path.append(os.path.dirname(__file__))
15
+ from utils import toc_processor
16
+ from pdfxmeta import pdfxmeta
17
+
18
+ st.set_page_config(page_title="PDF Bookmark Generator", layout="wide")
19
+
20
+ st.title("PDF Table of Contents Generator")
21
+
22
+ st.markdown("""
23
+ **Upload a PDF**, analyze its fonts to find headers, and generate a clean Table of Contents.
24
+ """)
25
+
26
+ uploaded_file = st.file_uploader("Choose a PDF file", type="pdf")
27
+
28
+ if uploaded_file is not None:
29
+ # Save the upload to disk, because the CLI tools (pdftocgen, pdftocio) read from a file path.
30
+ # NamedTemporaryFile(delete=False) keeps the file on disk after the `with` block closes it.
31
+ # Note: Streamlit reruns this script on every interaction, so a new copy is written each time.
32
+ # Pipeline outputs, by contrast, go into a TemporaryDirectory that is removed automatically.
33
+
34
+ with tempfile.NamedTemporaryFile(delete=False, suffix=".pdf") as tmp_pdf:
35
+ tmp_pdf.write(uploaded_file.getvalue())
36
+ input_pdf_path = tmp_pdf.name
37
+
38
+ # --- State Management & Reset ---
39
+ # Check if a new file is uploaded
40
+ file_id = f"{uploaded_file.name}_{uploaded_file.size}" # Name + size as a cheap proxy for file identity
41
+ if 'current_file_id' not in st.session_state:
42
+ st.session_state['current_file_id'] = None
43
+
44
+ if st.session_state['current_file_id'] != file_id:
45
+ # NEW FILE DETECTED: Reset Pipeline State
46
+ keys_to_reset = ['final_pdf_bytes', 'final_zip_bytes', 'final_zip_name', 'scan_results', 'search_matches', 'bm_matches', 'back_matter_page', 'font_name', 'font_size']
47
+ for k in keys_to_reset:
48
+ if k in st.session_state:
49
+ del st.session_state[k]
50
+ st.session_state['current_file_id'] = file_id
51
+ # st.toast(f"New file loaded: {uploaded_file.name}. State cleared.")
52
+
53
+ st.success(f"Loaded: {uploaded_file.name}")
54
+
55
+ # --- Data Source Selection ---
56
+ st.header("1. Source Selection")
57
+ source_mode = st.radio("Where should the bookmarks come from?",
58
+ ["Scan & Generate (Create New)", "Use Existing Bookmarks (Modify)"],
59
+ help="Choose 'Scan & Generate' to build new bookmarks from fonts. Choose 'Use Existing' to tidy up bookmarks already in the file.")
60
+
61
+ # --- Analysis Section (Only for Generate) ---
62
+ if source_mode == "Scan & Generate (Create New)":
63
+ st.header("2. Analyze Fonts")
64
+
65
+ if 'font_name' not in st.session_state:
66
+ st.session_state['font_name'] = ''
67
+ if 'font_size' not in st.session_state:
68
+ st.session_state['font_size'] = 18.0
69
+
70
+ tab1, tab2 = st.tabs(["Scan for Large Fonts", "Search by Text"])
71
+
72
+ with tab1:
73
+ if st.button("Find Header Candidates"):
74
+ with st.spinner("Scanning PDF for large fonts..."):
75
+ doc = fitz.open(input_pdf_path)
76
+ candidates = []
77
+ for page in doc[:50]:
78
+ text_page = page.get_text("dict")
79
+ for block in text_page["blocks"]:
80
+ for line in block.get("lines", []):
81
+ for span in line["spans"]:
82
+ text = span["text"].strip()
83
+ if len(text) > 3:
84
+ candidates.append({
85
+ "Text": text[:50],
86
+ "Font": span["font"],
87
+ "Size": round(span["size"], 2),
88
+ "Page": page.number + 1
89
+ })
90
+ doc.close()
91
+ if candidates:
92
+ df = pd.DataFrame(candidates)
93
+ summary = df.groupby(['Font', 'Size']).size().reset_index(name='Count')
94
+ summary = summary.sort_values(by=['Size', 'Count'], ascending=[False, False]).head(20)
95
+ st.session_state['scan_results'] = summary
96
+ else:
97
+ st.warning("No text found.")
98
+
99
+ if 'scan_results' in st.session_state:
100
+ st.write("### Top Large Fonts Found")
101
+ st.dataframe(st.session_state['scan_results'], use_container_width=True)
102
+
103
+ def update_from_scan():
104
+ val = st.session_state.scan_selector
105
+ if val:
106
+ f_name = val.split(" (")[0]
107
+ f_size = float(val.split("(")[1].replace("pt)", ""))
108
+ st.session_state['font_name'] = f_name
109
+ st.session_state['font_size'] = f_size
110
+
111
+ options = st.session_state['scan_results'].apply(lambda x: f"{x['Font']} ({x['Size']}pt)", axis=1)
112
+ st.selectbox("Select extraction font:", options, key='scan_selector', on_change=update_from_scan, index=None, placeholder="Choose a font...")
113
+
114
+ with tab2:
115
+ search_query = st.text_input("Enter text to find (e.g., 'Chapter 1')", "")
116
+
117
+ c1, c2 = st.columns([1, 3])
118
+ with c1:
119
+ do_search = st.button("Search Text")
120
+ with c2:
121
+ is_case_sensitive = st.checkbox("Case Sensitive", value=False)
122
+
123
+ if do_search:
124
+ with st.spinner(f"Searching for '{search_query}'..."):
125
+ # Use the robust pdfxmeta library
126
+ try:
127
+ doc = fitz.open(input_pdf_path)
128
+ # pdfxmeta expects a regex pattern, so we escape the query to be safe
129
+ import re
130
+ safe_pattern = re.escape(search_query)
131
+
132
+ # extract_meta returns a list of dicts (spans)
133
+ results = pdfxmeta.extract_meta(doc, safe_pattern, ign_case=(not is_case_sensitive))
134
+ doc.close()
135
+
136
+ matches = []
137
+ for res in results:
138
+ matches.append({
139
+ "Text": res.get("text", "").strip(),
140
+ "Font": res.get("font", ""),
141
+ "Size": round(res.get("size", 0), 2),
142
+ "Page": res.get("page_index", 0)
143
+ })
144
+ # Limit for display safety
145
+ if len(matches) > 50: break
146
+
147
+ if matches:
148
+ st.session_state['search_matches'] = pd.DataFrame(matches)
149
+ else:
150
+ st.warning("No matches found.")
151
+
152
+ except Exception as e:
153
+ st.error(f"Search failed: {e}")
154
+
155
+ if 'search_matches' in st.session_state:
156
+ st.write("### Found Matches")
157
+ st.dataframe(st.session_state['search_matches'], use_container_width=True)
158
+
159
+ def update_from_search():
160
+ val = st.session_state.search_selector
161
+ if val:
162
+ parts = val.split(" (")
163
+ f_name = parts[0]
164
+ f_size = float(parts[1].split("pt)")[0])
165
+ st.session_state['font_name'] = f_name
166
+ st.session_state['font_size'] = f_size
167
+
168
+ options = st.session_state['search_matches'].apply(lambda x: f"{x['Font']} ({x['Size']}pt) - Pg {x['Page']}", axis=1)
169
+ st.selectbox("Select font from match:", options, key='search_selector', on_change=update_from_search, index=None, placeholder="Choose a match...")
170
+
171
+ # --- Configuration (Only for Generate) ---
172
+ st.header("3. Configure Recipe")
173
+ col1, col2 = st.columns(2)
174
+ with col1:
175
+ font_name_input = st.text_input("Font Name", key='font_name')
176
+ with col2:
177
+ font_size_input = st.number_input("Font Size", key='font_size')
178
+
179
+ greedy = st.checkbox("Greedy Match (Merge multiline specs)", value=True)
180
+
181
+ # --- Back Matter Configuration ---
182
+ with st.expander("Back Matter Configuration (Optional)", expanded=False):
183
+ st.markdown("Identify where the **Back Matter** (Index, Glossary, etc.) starts to split it into a separate `999_Back_matter.pdf`.")
184
+
185
+ # Independent Search for Back Matter
186
+ bm_query = st.text_input("Find Back Matter start (e.g., 'Index')", key="bm_search_query")
187
+
188
+ c_bm1, c_bm2 = st.columns([1, 3])
189
+ with c_bm1:
190
+ do_bm_search = st.button("Search Back Matter")
191
+ with c_bm2:
192
+ bm_case_sensitive = st.checkbox("Case Sensitive", key="bm_sens", value=False)
193
+
194
+ if do_bm_search:
195
+ with st.spinner("Searching..."):
196
+ try:
197
+ doc = fitz.open(input_pdf_path)
198
+ import re
199
+ safe_pattern = re.escape(bm_query)
200
+ results = pdfxmeta.extract_meta(doc, safe_pattern, ign_case=(not bm_case_sensitive))
201
+ doc.close()
202
+
203
+ bm_matches = []
204
+ for res in results:
205
+ bm_matches.append({
206
+ "Text": res.get("text", "").strip(),
207
+ "Page": res.get("page_index", 0) # Display raw (already 1-based from pdfxmeta)
208
+ })
209
+ if len(bm_matches) > 50: break
210
+
211
+ if bm_matches:
212
+ st.session_state['bm_matches'] = pd.DataFrame(bm_matches)
213
+ else:
214
+ st.warning("No matches found.")
215
+ except Exception as e:
216
+ st.error(f"Search failed: {e}")
217
+
218
+ if 'bm_matches' in st.session_state:
219
+ st.dataframe(st.session_state['bm_matches'], use_container_width=True)
220
+
221
+ def update_bm_page():
222
+ val = st.session_state.bm_selector
223
+ if val:
224
+ # Value format: "Page X - Text..."
225
+ page_num = int(val.split(" -")[0].replace("Page ", ""))
226
+ st.session_state['back_matter_page'] = page_num
227
+
228
+ bm_options = st.session_state['bm_matches'].apply(lambda x: f"Page {x['Page']} - {x['Text'][:30]}...", axis=1)
229
+ st.selectbox("Select Start Page:", bm_options, key='bm_selector', on_change=update_bm_page, index=None, placeholder="Select start page...")
230
+
231
+ # Manual Override
232
+ # Update session state when this input changes
233
+ def update_manual_bm():
234
+ st.session_state['back_matter_page'] = st.session_state.back_matter_page_manual
235
+
236
+ st.number_input("Or manually set Start Page:", min_value=0, value=st.session_state.get('back_matter_page', 0), key='back_matter_page_manual', on_change=update_manual_bm)
237
+
238
+ else:
239
+ # Existing Mode
240
+ st.info("Using existing bookmarks. They will be cleaned, numbered, and used for splitting/downloading.")
241
+
242
+ # --- Generation ---
243
+ st.header("4. Process & Generate")
244
+
245
+ if st.button("Run Pipeline"):
246
+ # Validate inputs if generating
247
+ if source_mode == "Scan & Generate (Create New)" and not st.session_state.get('font_name'):
248
+ st.error("Please specify a font name for extraction.")
249
+ else:
250
+ with st.status("Running pipeline tasks...", expanded=True) as status:
251
+ # Use a temporary directory for all intermediate files
252
+ with tempfile.TemporaryDirectory() as temp_dir:
253
+ status.write(f"Created temp workspace: {temp_dir}")
254
+
255
+ # Paths
256
+ recipe_path = os.path.join(temp_dir, "recipe.toml")
257
+ raw_toc_path = os.path.join(temp_dir, "raw.toc") # pdftocgen output
258
+ clean_toc_path = os.path.join(temp_dir, "clean.toc") # modify_toc output
259
+ output_pdf_path = os.path.join(temp_dir, "final.pdf")
260
+
261
+ raw_toc_content = ""
262
+
263
+ if source_mode == "Scan & Generate (Create New)":
264
+ # 1. Create Recipe
265
+ recipe_data = {
266
+ "heading": [{
267
+ "level": 1,
268
+ "greedy": greedy,
269
+ "font": {
270
+ "name": st.session_state['font_name'],
271
+ "size": st.session_state['font_size'],
272
+ "size_tolerance": 0.1
273
+ }
274
+ }]
275
+ }
276
+ with open(recipe_path, "w") as f:
277
+ toml.dump(recipe_data, f)
278
+ status.write("✅ Recipe created")
279
+
280
+ # 2. Run pdftocgen -> raw.toc
281
+ status.write("Running pdftocgen (Scanning)...")
282
+ cmd1 = ["pdftocgen", "-r", recipe_path, input_pdf_path]
283
+ process = subprocess.run(cmd1, capture_output=True, text=True, encoding='utf-8')
284
+ if process.returncode != 0:
285
+ st.error(f"pdftocgen failed: {process.stderr}")
286
+ st.stop()
287
+ raw_toc_content = process.stdout
288
+ status.write("✅ Headers extracted")
289
+
290
+ else:
291
+ # Existing Bookmarks
292
+ status.write("Extracting existing bookmarks...")
293
+ # Run pdftocio in extract mode
294
+ cmd1 = ["pdftocio", input_pdf_path]
295
+ process = subprocess.run(cmd1, capture_output=True, text=True, encoding='utf-8')
296
+ if process.returncode != 0:
297
+ st.error(f"pdftocio failed: {process.stderr}")
298
+ st.stop()
299
+ raw_toc_content = process.stdout
300
+ if not raw_toc_content.strip():
301
+ st.warning("No existing bookmarks found!")
302
+ st.stop()
303
+ status.write("✅ Existing bookmarks imported")
304
+
305
+ # 3. Clean Content (Using centralized utility)
306
+ status.write("Cleaning and merging bookmarks...")
307
+ cleaned_toc_content = toc_processor.process_toc(raw_toc_content)
308
+
309
+ with open(clean_toc_path, "w", encoding='utf-8') as f:
310
+ f.write(cleaned_toc_content)
311
+ status.write("✅ Bookmarks formatted (Double-splits fixed)")
312
+
313
+ # 4. Write PDF
314
+ status.write("Writing to PDF...")
315
+ cmd3 = ["pdftocio", "-t", clean_toc_path, "-o", output_pdf_path, input_pdf_path]
316
+ process = subprocess.run(cmd3, capture_output=True, text=True, encoding='utf-8')
317
+ if process.returncode != 0:
318
+ st.error(f"pdftocio failed: {process.stderr}")
319
+ st.stop()
320
+ status.write("✅ PDF saved")
321
+
322
+ # 5. Read Result for Download
323
+ with open(output_pdf_path, "rb") as f:
324
+ st.session_state['final_pdf_bytes'] = f.read()
325
+
326
+ # 6. Split & Zip (The Feature)
327
+ # Use a temp file for the zip to avoid memory issues
328
+ with tempfile.NamedTemporaryFile(suffix=".zip", delete=False) as tmp_zip:
329
+ tmp_zip_path = tmp_zip.name
330
+
331
+ try:
332
+ # Pass back_matter_page if it exists and is valid
333
+ bm_page = st.session_state.get('back_matter_page', 0)
334
+ if bm_page == 0: bm_page = None
335
+
336
+ toc_processor.generate_chapter_splits(output_pdf_path, tmp_zip_path, back_matter_start_page=bm_page)
337
+
338
+ with open(tmp_zip_path, "rb") as f:
339
+ st.session_state['final_zip_bytes'] = f.read()
340
+
341
+ base_name = os.path.splitext(uploaded_file.name)[0]
342
+ st.session_state['final_zip_name'] = f"{base_name}_chapters.zip"
343
+
344
+ except Exception as e:
345
+ st.error(f"Error generating zip: {e}")
346
+ finally:
347
+ if os.path.exists(tmp_zip_path):
348
+ os.unlink(tmp_zip_path)
349
+
350
+ # --- Persistent Download Area ---
351
+ if 'final_pdf_bytes' in st.session_state:
352
+ st.success("Pipeline completed successfully!")
353
+ st.write("### Downloads")
354
+
355
+ c_dl1, c_dl2 = st.columns(2)
356
+ with c_dl1:
357
+ st.download_button(
358
+ label="Download Bookmarked PDF",
359
+ data=st.session_state['final_pdf_bytes'],
360
+ file_name="bookmarked_doc.pdf",
361
+ mime="application/pdf",
362
+ key="dl_pdf_btn"
363
+ )
364
+
365
+ with c_dl2:
366
+ if 'final_zip_bytes' in st.session_state:
367
+ st.download_button(
368
+ label=f"Download ZIP ({st.session_state['final_zip_name']})",
369
+ data=st.session_state['final_zip_bytes'],
370
+ file_name=st.session_state['final_zip_name'],
371
+ mime="application/zip",
372
+ key="dl_zip_btn"
373
+ )
374
+
375
+ st.markdown("---")
376
+ st.markdown("""
377
+ <div style="text-align: center; color: #666; font-size: 0.8em;">
378
+ Based on <a href="https://github.com/Krasjet/pdf.tocgen" target="_blank">pdf.tocgen</a> by krasjet. <br>
379
+ Enhanced with UI, Chapter Splitting, and Metadata Search. Licensed under AGPL-3.0.
380
+ </div>
381
+ """, unsafe_allow_html=True)
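The selectbox callbacks in `app.py` recover the font name and size by splitting the label on `" ("`, which would misread a font whose name itself contains `" ("`. A regex anchored on the trailing `(<size>pt)` is one way to make that step more tolerant. This is a sketch only: `parse_font_label` is a hypothetical helper, and the label shapes are taken from the two `options` lambdas above.

```python
import re
from typing import Optional, Tuple

# Matches "<font name> (<size>pt)", optionally followed by " - Pg <n>".
# The greedy font group means the LAST "(<size>pt)" in the label is used.
_LABEL_RE = re.compile(
    r"^(?P<font>.+) \((?P<size>\d+(?:\.\d+)?)pt\)(?: - Pg \d+)?$"
)


def parse_font_label(label: str) -> Optional[Tuple[str, float]]:
    """Return (font_name, font_size), or None if the label is malformed."""
    m = _LABEL_RE.match(label)
    if m is None:
        return None
    return m.group("font"), float(m.group("size"))
```

Returning `None` instead of raising lets a callback leave the session state untouched when a label does not have the expected shape.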
fitzutils/__init__.py ADDED
@@ -0,0 +1,17 @@
1
+ """A collection of utility functions to work with PyMuPDF"""
2
+
3
+ from .fitzutils import (
4
+ open_pdf,
5
+ ToCEntry,
6
+ dump_toc,
7
+ pprint_toc,
8
+ get_file_encoding
9
+ )
10
+
11
+ __all__ = [
12
+ 'open_pdf',
13
+ 'ToCEntry',
14
+ 'dump_toc',
15
+ 'pprint_toc',
16
+ 'get_file_encoding'
17
+ ]
fitzutils/fitzutils.py ADDED
@@ -0,0 +1,112 @@
1
+ from contextlib import contextmanager
2
+ from dataclasses import dataclass
3
+ from typing import Optional, ContextManager, List, Tuple
4
+ from fitz import Document
5
+
6
+ import sys
7
+ import fitz
8
+ import io
9
+ import csv
10
+ import chardet
11
+
12
+
13
+ @contextmanager
14
+ def open_pdf(path: str,
15
+ exit_on_error: bool = True
16
+ ) -> ContextManager[Optional[Document]]:
17
+ """A context manager for fitz Document
18
+
19
+ This context manager will take care of the error handling when creating a
20
+ fitz Document.
21
+
22
+ Arguments
23
+ path: the path of the pdf file
24
+ exit_on_error: if true, exit with error code 1 when error occurs
25
+ """
26
+ try:
27
+ doc = fitz.open(path)
28
+ except Exception as e:
29
+ if exit_on_error:
30
+ print(f"error: failed to open {path}", file=sys.stderr)
31
+ print(e, file=sys.stderr)
32
+ sys.exit(1)
33
+ else:
34
+ yield None
35
+ else:
36
+ try:
37
+ yield doc
38
+ finally:
39
+ doc.close()
40
+
41
+
42
+ @dataclass
43
+ class ToCEntry:
44
+ """A single entry in the table of contents"""
45
+ level: int
46
+ title: str
47
+ pagenum: int
48
+ # vpos == bbox.top, used for sorting
49
+ vpos: Optional[float] = None
50
+
51
+ @staticmethod
52
+ def key(e) -> Tuple[int, float]:
53
+ """Key used for sorting"""
54
+ return (e.pagenum, 0 if e.vpos is None else e.vpos)
55
+
56
+ def to_fitz_entry(self) -> list:
57
+ return ([self.level, self.title, self.pagenum] +
58
+ [self.vpos] * (self.vpos is not None))
59
+
60
+
61
+ def dump_toc(entries: List[ToCEntry], dump_vpos: bool = False) -> str:
62
+ """Dump table of contents as a CSV dialect
63
+
64
+ Indentation represents the level of each entry; apart from that, the
65
+ format follows ordinary CSV conventions.
66
+
67
+ Argument
68
+ entries: a list of ToC entries
69
+ dump_vpos: if true, the vertical position of a page is also dumped
70
+ Returns
71
+ a multiline string
72
+ """
73
+ with io.StringIO(newline='\n') as out:
74
+ writer = csv.writer(out, lineterminator='\n',
75
+ delimiter=' ', quoting=csv.QUOTE_NONNUMERIC)
76
+ for entry in entries:
77
+ out.write((entry.level - 1) * '    ')
78
+ writer.writerow(
79
+ [entry.title, entry.pagenum] +
80
+ ([entry.vpos] * (dump_vpos and entry.vpos is not None))
81
+ )
82
+ return out.getvalue()
83
+
84
+
85
+ def pprint_toc(entries: List[ToCEntry]) -> str:
86
+ """Pretty print table of contents
87
+
88
+ Argument
89
+ entries: a list of ToC entries
90
+ Returns
91
+ a multiline string
92
+ """
93
+ return '\n'.join([
94
+ f"{(entry.level - 1) * '    '}{entry.title} ··· {entry.pagenum}"
95
+ for entry in entries
96
+ ])
97
+
98
+
99
+ def get_file_encoding(path: str) -> str:
100
+ """Get encoding of file
101
+
102
+ Argument
103
+ path: file path
104
+ Returns
105
+ encoding string
106
+ """
107
+ try:
108
+ with open(path, "rb") as f:
109
+ enc = chardet.detect(f.read())['encoding'] or 'utf-8'
110
+ except Exception:
111
+ enc = 'utf-8'
112
+ return enc
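To make the ToC text format produced by `dump_toc` concrete, here is a standalone sketch using only the standard library. It mirrors the `csv.writer` settings shown above, but `dump_entries` and the plain `(level, title, page)` tuples are stand-ins for illustration, and the four-space indent per level is an assumption, since whitespace is collapsed in this listing.

```python
import csv
import io
from typing import List, Tuple


def dump_entries(entries: List[Tuple[int, str, int]]) -> str:
    """Serialize (level, title, page) tuples as indented, space-delimited CSV."""
    out = io.StringIO(newline='\n')
    # QUOTE_NONNUMERIC quotes the title string and leaves the page number bare.
    writer = csv.writer(out, lineterminator='\n',
                        delimiter=' ', quoting=csv.QUOTE_NONNUMERIC)
    for level, title, pagenum in entries:
        # Assumed indent: four spaces per nesting level below the first.
        out.write((level - 1) * '    ')
        writer.writerow([title, pagenum])
    return out.getvalue()


if __name__ == "__main__":
    print(dump_entries([(1, "Chapter 1", 5), (2, "Section 1.1", 6)]), end="")
```

Each output line is a quoted title followed by a page number, and nesting depth is carried entirely by the leading indentation.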
pdftocgen/__init__.py ADDED
@@ -0,0 +1,3 @@
1
+ """Generate table of contents for pdf based on a recipe file"""
2
+
3
+ __version__ = '1.3.4'
pdftocgen/__main__.py ADDED
@@ -0,0 +1,4 @@
1
+ from .app import main
2
+
3
+ if __name__ == '__main__':
4
+ main()
pdftocgen/filter.py ADDED
@@ -0,0 +1,161 @@
1
+ """Filter on span dictionaries
2
+
3
+ This module contains the internal representation of heading filters, which are
4
+ used to test if a span should be included in the ToC.
5
+ """
6
+
7
+ import re
8
+
9
+ from typing import Optional
10
+ from re import Pattern
11
+
12
+ DEF_TOLERANCE: float = 1e-5
13
+
14
+
15
+ def admits_float(expect: Optional[float],
16
+ actual: Optional[float],
17
+ tolerance: float) -> bool:
18
+ """Check if a float should be admitted by a filter"""
19
+ return (expect is None) or \
20
+ (actual is not None and abs(expect - actual) <= tolerance)
21
+
22
+
23
+ class FontFilter:
24
+ """Filter on font attributes"""
25
+ name: Pattern
26
+ size: Optional[float]
27
+ size_tolerance: float
28
+ color: Optional[int]
29
+ flags: int
30
+ # besides the usual true (1) and false (0), we have another state,
31
+ # unset (x), where the truth table would be
32
+ # a b diff?
33
+ # 0 0 0
34
+ # 0 1 1
35
+ # 1 0 1
36
+ # 1 1 0
37
+ # x 0 0
38
+ # x 1 0
39
+ # it's very inefficient to compare bit by bit, which would take 5 bitwise
40
+ # operations to compare, and then 4 to combine the results, we will use a
41
+ # trick to reduce it to 2 ops.
42
+ # step 1: use XOR to find different bits. if unset, set bit to 0, we will
43
+ # take care of false positives in the next step
44
+ # a b a^b
45
+ # 0 0 0
46
+ # 0 1 1
47
+ # 1 0 1
48
+ # 1 1 0
49
+ # step 2: use AND with a ignore mask, (0 for ignored) to eliminate false
50
+ # positives
51
+ # a b a&b
52
+ # 0 1 0 <- no diff
53
+ # 0 0 0 <- no diff
54
+ # 1 1 1 <- found difference
55
+ # 1 0 0 <- ignored
56
+ ign_mask: int
57
+
58
+ def __init__(self, font_dict: dict):
59
+ self.name = re.compile(font_dict.get('name', ""))
60
+ self.size = font_dict.get('size')
61
+ self.size_tolerance = font_dict.get('size_tolerance', DEF_TOLERANCE)
62
+ self.color = font_dict.get('color')
63
+ # some branchless trick, mainly to save space
64
+ # x * True = x
65
+ # x * False = 0
66
+ self.flags = (0b00001 * font_dict.get('superscript', False) |
67
+ 0b00010 * font_dict.get('italic', False) |
68
+ 0b00100 * font_dict.get('serif', False) |
69
+ 0b01000 * font_dict.get('monospace', False) |
70
+ 0b10000 * font_dict.get('bold', False))
71
+
72
+ self.ign_mask = (0b00001 * ('superscript' in font_dict) |
73
+ 0b00010 * ('italic' in font_dict) |
74
+ 0b00100 * ('serif' in font_dict) |
75
+ 0b01000 * ('monospace' in font_dict) |
76
+ 0b10000 * ('bold' in font_dict))
77
+
78
+ def admits(self, spn: dict) -> bool:
79
+ """Check if the font attributes admit the span
80
+
81
+ Argument
82
+ spn: the span dict to be checked
83
+ Returns
84
+ False if the span doesn't match current font attribute
85
+ """
86
+ if not self.name.search(spn.get('font', "")):
87
+ return False
88
+
89
+ if self.color is not None and self.color != spn.get('color'):
90
+ return False
91
+
92
+ if not admits_float(self.size, spn.get('size'), self.size_tolerance):
93
+ return False
94
+
95
+ flags = spn.get('flags', ~self.flags)
96
+ # see above for explanation
97
+ return not (flags ^ self.flags) & self.ign_mask
98
+
99
+
100
+ class BoundingBoxFilter:
101
+ """Filter on bounding boxes"""
102
+ left: Optional[float]
103
+ top: Optional[float]
104
+ right: Optional[float]
105
+ bottom: Optional[float]
106
+ tolerance: float
107
+
108
+ def __init__(self, bbox_dict: dict):
109
+ self.left = bbox_dict.get('left')
110
+ self.top = bbox_dict.get('top')
111
+ self.right = bbox_dict.get('right')
112
+ self.bottom = bbox_dict.get('bottom')
113
+ self.tolerance = bbox_dict.get('tolerance', DEF_TOLERANCE)
114
+
115
+ def admits(self, spn: dict) -> bool:
116
+ """Check if the bounding box admits the span
117
+
118
+ Argument
119
+ spn: the span dict to be checked
120
+ Returns
121
+ False if the span doesn't match current bounding box setting
122
+ """
123
+ bbox = spn.get('bbox', (None, None, None, None))
124
+ return (admits_float(self.left, bbox[0], self.tolerance) and
125
+ admits_float(self.top, bbox[1], self.tolerance) and
126
+ admits_float(self.right, bbox[2], self.tolerance) and
127
+ admits_float(self.bottom, bbox[3], self.tolerance))
128
+
129
+
130
+ class ToCFilter:
131
+ """Filter on span dictionary to pick out headings in the ToC"""
132
+ # The level of the title, strictly > 0
133
+ level: int
134
+ # When set, the filter is *greedy*: once one span in a block matches, the
135
+ # matching spans of that whole block are merged into a single heading
136
+ greedy: bool
137
+ font: FontFilter
138
+ bbox: BoundingBoxFilter
139
+
140
+ def __init__(self, fltr_dict: dict):
141
+ lvl = fltr_dict.get('level')
142
+
143
+ if lvl is None:
144
+ raise ValueError("filter's 'level' is not set")
145
+ if lvl < 1:
146
+ raise ValueError("filter's 'level' must be >= 1")
147
+
148
+ self.level = lvl
149
+ self.greedy = fltr_dict.get('greedy', False)
150
+ self.font = FontFilter(fltr_dict.get('font', {}))
151
+ self.bbox = BoundingBoxFilter(fltr_dict.get('bbox', {}))
152
+
153
+ def admits(self, spn: dict) -> bool:
154
+ """Check if the filter admits the span
155
+
156
+ Arguments
157
+ spn: the span dict to be checked
158
+ Returns
159
+ False if the span doesn't match the filter
160
+ """
161
+ return self.font.admits(spn) and self.bbox.admits(spn)
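The XOR-and-mask comparison described in the `FontFilter` comments can be tried in isolation. The sketch below reuses the bit layout from `__init__` above; `flags_admit` is a hypothetical free function that mirrors the last line of `FontFilter.admits`, not part of the module.

```python
# Bit layout copied from FontFilter.__init__ above.
SUPERSCRIPT = 0b00001
ITALIC = 0b00010
SERIF = 0b00100
MONOSPACE = 0b01000
BOLD = 0b10000


def flags_admit(expected: int, mask: int, actual: int) -> bool:
    """True if `actual` agrees with `expected` on every bit set in `mask`.

    Step 1: XOR marks the bits where the two values differ.
    Step 2: AND with the mask drops differences in attributes the recipe
            never mentioned (mask bit 0 means "unset, ignore").
    """
    return not ((actual ^ expected) & mask)


# A recipe that says bold = true, italic = false, and nothing else.
expected = BOLD
mask = BOLD | ITALIC

print(flags_admit(expected, mask, BOLD | SERIF))   # serif is ignored
print(flags_admit(expected, mask, BOLD | ITALIC))  # italic differs
```

The mask is what distinguishes "must be false" from "not specified": both leave the corresponding bit of `expected` at 0, but only the former sets the mask bit.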
pdftocgen/recipe.py ADDED
@@ -0,0 +1,188 @@
1
+ from dataclasses import dataclass
2
+ from typing import Optional, List, Dict, Iterator
3
+ from .filter import ToCFilter
4
+ from fitzutils import ToCEntry
5
+ from itertools import chain
6
+ from collections import defaultdict
7
+ from fitz import Document
8
+
9
+
10
+ class FoundGreedy(Exception):
11
+ """A hacky solution to do short-circuiting in Python.
12
+
13
+ The main reason to do this short-circuiting is to untangle the logic of
14
+ greedy filter with normal execution, which makes the typing and code much
15
+ cleaner, but it can also save some unnecessary comparisons.
16
+
17
+ Probably similar to call/cc in Scheme or longjmp in C
18
+ c.f. https://ds26gte.github.io/tyscheme/index-Z-H-15.html#node_sec_13.2
19
+ """
20
+ level: int
21
+
22
+ def __init__(self, level):
23
+ """
24
+ Argument
25
+ level: level of the greedy filter
26
+ """
27
+ super().__init__()
28
+ self.level = level
29
+
30
+
31
+ def blk_to_str(blk: dict) -> str:
32
+ """Extract all the text inside a block"""
33
+ return " ".join([
34
+ spn.get('text', "").strip()
35
+ for line in blk.get('lines', [])
36
+ for spn in line.get('spans', [])
37
+ ])
38
+
39
+
40
+ @dataclass
41
+ class Fragment:
42
+ """A fragment of the extracted heading"""
43
+ text: str
44
+ level: int
45
+
46
+
47
+ def concatFrag(frags: Iterator[Optional[Fragment]], sep: str = " ") -> Dict[int, str]:
48
+ """Concatenate fragments to strings
49
+
50
+ Returns
51
+ a dictionary (level -> title) that contains the title for each level.
52
+ """
53
+ # accumulate a list of strings for each level of heading
54
+ acc = defaultdict(list)
55
+ for frag in frags:
56
+ if frag is not None:
57
+ acc[frag.level].append(frag.text)
58
+
59
+ result = {}
60
+ for level, strs in acc.items():
61
+ result[level] = sep.join(strs)
62
+ return result
63
+
64
+
65
+ class Recipe:
66
+ """The internal representation of a recipe"""
67
+ filters: List[ToCFilter]
68
+
69
+ def __init__(self, recipe_dict: dict):
70
+ fltr_dicts = recipe_dict.get('heading', [])
71
+
72
+ if len(fltr_dicts) == 0:
73
+ raise ValueError("no filters found in recipe")
74
+ self.filters = [ToCFilter(fltr) for fltr in fltr_dicts]
75
+
76
+ def _extract_span(self, spn: dict) -> Optional[Fragment]:
77
+ """Extract text from span along with level
78
+
79
+ Argument
80
+ spn: a span dictionary
81
+ {
82
+ 'bbox': (float, float, float, float),
83
+ 'color': int,
84
+ 'flags': int,
85
+ 'font': str,
86
+ 'size': float,
87
+ 'text': str
88
+ }
89
+ Returns
90
+ a fragment of the heading or None if no match
91
+ """
92
+ for fltr in self.filters:
93
+ if fltr.admits(spn):
94
+ text = spn.get('text', "").strip()
95
+
96
+ if not text:
97
+ # don't match empty spaces
98
+ return None
99
+
100
+ if fltr.greedy:
101
+ # propagate all the way back to extract_block
102
+ raise FoundGreedy(fltr.level)
103
+
104
+ return Fragment(text, fltr.level)
105
+ return None
106
+
107
+ def _extract_line(self, line: dict) -> List[Optional[Fragment]]:
108
+ """Extract matching heading fragments in a line.
109
+
110
+ Argument
111
+ line: a line dictionary
112
+ {
113
+ 'bbox': (float, float, float, float),
114
+ 'wmode': int,
115
+ 'dir': (float, float),
116
+ 'spans': [dict]
117
+ }
118
+ Returns
119
+ a list of fragments concatenated from result in a line
120
+ """
121
+ return [self._extract_span(spn) for spn in line.get('spans', [])]
122
+
123
+ def extract_block(self, block: dict, page: int) -> List[ToCEntry]:
124
+ """Extract matching headings in a block.
125
+
126
+ Argument
127
+ block: a block dictionary
128
+ {
129
+ 'bbox': (float, float, float, float),
130
+ 'lines': [dict],
131
+ 'type': int
132
+ }
133
+ Returns
134
+ a list of toc entries, concatenated from the result of lines
135
+ """
136
+ if block.get('type') != 0:
137
+ # not a text block
138
+ return []
139
+
140
+ vpos = block.get('bbox', (0, 0))[1]
141
+
142
+ try:
143
+ frags = chain.from_iterable([
144
+ self._extract_line(ln) for ln in block.get('lines', [])
145
+ ])
146
+ titles = concatFrag(frags)
147
+
148
+ return [
149
+ ToCEntry(level, title, page, vpos)
150
+ for level, title in titles.items()
151
+ ]
152
+ except FoundGreedy as e:
153
+ # Smart Greedy: Only merge text that MATCHES the filter
154
+ # Find the filter that triggered this level
155
+ relevant_filter = next((f for f in self.filters if f.level == e.level), None)
156
+
157
+ parts = []
158
+ if relevant_filter:
159
+ for ln in block.get('lines', []):
160
+ for spn in ln.get('spans', []):
161
+ if relevant_filter.admits(spn):
162
+ parts.append(spn.get('text', "").strip())
163
+
164
+ merged_text = " ".join(parts)
165
+ if merged_text:
166
+ return [ToCEntry(e.level, merged_text, page, vpos)]
167
+ else:
168
+ return []
169
+
170
+
171
+ def extract_toc(doc: Document, recipe: Recipe) -> List[ToCEntry]:
172
+ """Extract toc entries from a document
173
+
174
+ Arguments
175
+ doc: a pdf document
176
+ recipe: recipe from user
177
+ Returns
178
+ a list of toc entries in the document
179
+ """
180
+ result = []
181
+
182
+ for page in doc.pages():
183
+ for blk in page.get_textpage().extractDICT().get('blocks', []):
184
+ result.extend(
185
+ recipe.extract_block(blk, page.number + 1)
186
+ )
187
+
188
+ return result
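The non-greedy path in `extract_block` relies on grouping fragments by heading level and joining each group into one title. A minimal standalone sketch of that grouping follows; `join_by_level` and the plain `(level, text)` tuples are stand-ins for `concatFrag` and `Fragment`, used here only so the example runs without the package.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple


def join_by_level(frags: Iterable[Optional[Tuple[int, str]]],
                  sep: str = " ") -> Dict[int, str]:
    """Group (level, text) fragments by level and join each group.

    None entries stand for spans that matched no filter and are skipped.
    """
    acc = defaultdict(list)
    for frag in frags:
        if frag is not None:
            level, text = frag
            acc[level].append(text)
    return {level: sep.join(parts) for level, parts in acc.items()}


if __name__ == "__main__":
    spans = [(1, "Chapter"), None, (1, "One"), (2, "Overview")]
    print(join_by_level(spans))
```

One consequence of this design is that a single block can produce at most one entry per level, so a heading split across several spans comes out as one title.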
pdftocgen/tocgen.py ADDED
@@ -0,0 +1,15 @@
1
+ from fitz import Document
2
+ from typing import List
3
+ from fitzutils import ToCEntry
4
+ from .recipe import Recipe, extract_toc
5
+
6
+ def gen_toc(doc: Document, recipe_dict: dict) -> List[ToCEntry]:
7
+ """Generate the table of contents for a document from recipe
8
+
9
+ Argument
10
+ doc: a pdf document
11
+ recipe_dict: the recipe dictionary used to generate the toc
12
+ Returns
13
+ a list of ToC entries
14
+ """
15
+ return extract_toc(doc, Recipe(recipe_dict))
pdftocio/__init__.py ADDED
@@ -0,0 +1,3 @@
1
+ """Manipulating the table of contents of a pdf"""
2
+
3
+ __version__ = '1.3.4'
pdftocio/__main__.py ADDED
@@ -0,0 +1,4 @@
1
+ from .app import main
2
+
3
+ if __name__ == '__main__':
4
+ main()
pdftocio/app.py ADDED
@@ -0,0 +1,184 @@
1
+ """The executable of pdftocio"""
2
+
3
+ import sys
4
+ import os.path
5
+ import pdftocio
6
+ import getopt
7
+ import io
8
+
9
+ from typing import Optional, TextIO
10
+ from getopt import GetoptError
11
+ from fitzutils import open_pdf, dump_toc, pprint_toc, get_file_encoding
12
+ from .tocparser import parse_toc
13
+ from .tocio import write_toc, read_toc
14
+
15
+ usage_s = """
16
+ usage: pdftocio [options] in.pdf < toc
17
+ pdftocio [options] in.pdf
18
+ """.strip()
19
+
20
+ help_s = r"""
21
+ usage: pdftocio [options] in.pdf < toc
22
+ pdftocio [options] in.pdf
23
+
24
+ Import/output the table of contents of a PDF file.
25
+
26
+ This command can operate in two ways: it can either be used
27
+ to extract the table of contents of a PDF, or import table
28
+ of contents to a PDF using the output of pdftocgen.
29
+
30
+ 1. To extract the table of contents of a PDF for
31
+ modification, only supply an input file:
32
+
33
+ $ pdftocio in.pdf
34
+
35
+ or if you want to print it in a readable format, use the
36
+ -H flag:
37
+
38
+ $ pdftocio -H in.pdf
39
+
40
+ 2. To import a table of contents to a PDF using the toc file
41
+ generated by pdftocgen, use input redirection,
42
+
43
+ $ pdftocio in.pdf < toc
44
+
45
+ pipes,
46
+
47
+ $ pdftocgen -r recipe.toml in.pdf | pdftocio in.pdf
48
+
49
+ or the -t flag
50
+
51
+ $ pdftocio -t toc in.pdf
52
+
53
+ to supply the toc file. If you want to specify an output
54
+ file name, use the -o option
55
+
56
+ $ pdftocio -t toc -o out.pdf in.pdf
57
+
58
+ arguments
59
+ in.pdf path to the input PDF document
60
+
61
+ options
62
+ -h, --help show help
63
+ -t, --toc=toc path to the table of contents generated by
64
+ pdftocgen. if this option is not given, the
65
+ default is stdin, but if no input is piped or
66
+ redirected to stdin, this program will instead
67
+ print the existing ToC of the PDF file
68
+ -v, --vpos if this flag is set, the vertical position of
69
+ each heading will be dumped to the output
70
+ -p, --print when flag is set, print the existing ToC in
71
+ the input PDF file. this flag is usually not
72
+ necessary, since it is the default behavior
73
+ when no input is given
74
+ -H, --human-readable print the toc in a readable format
75
+ -o, --out=file.pdf path to the output file. if this flag is not
76
+ specified, the default is {input}_out.pdf
77
+ -g, --debug enable debug mode
78
+ -V, --version show version number
79
+
80
+ [1]: https://krasjet.com/voice/pdf.tocgen/#step-1-build-a-recipe
81
+ """.strip()
82
+
83
+
84
+ def main():
85
+ # parse arguments
86
+ try:
87
+ opts, args = getopt.gnu_getopt(
88
+ sys.argv[1:],
89
+ "hvt:pHo:gV",
90
+ ["help", "vpos", "toc=", "print", "human-readable", "out=", "debug", "version"]
91
+ )
92
+ except GetoptError as e:
93
+ print(e, file=sys.stderr)
94
+ print(usage_s, file=sys.stderr)
95
+ sys.exit(2)
96
+
97
+ toc_file: TextIO = io.TextIOWrapper(sys.stdin.buffer, encoding='utf-8', errors='ignore')
98
+ print_toc: bool = False
99
+ readable: bool = False
100
+ out: Optional[str] = None
101
+ vpos: bool = False
102
+ debug: bool = False
103
+
104
+ for o, a in opts:
105
+ if o in ("-H", "--human-readable"):
106
+ readable = True
107
+ elif o in ("-p", "--print"):
108
+ print_toc = True
109
+ elif o in ("-v", "--vpos"):
110
+ vpos = True
111
+ elif o in ("-t", "--toc"):
112
+ try:
113
+ toc_file = open(a, "r", encoding=get_file_encoding(a))
114
+ except IOError as e:
115
+ print("error: can't open file for reading", file=sys.stderr)
116
+ print(e, file=sys.stderr)
117
+ sys.exit(1)
118
+ elif o in ("-o", "--out"):
119
+ out = a
120
+ elif o in ("-g", "--debug"):
121
+ debug = True
122
+ elif o in ("-V", "--version"):
123
+ print("pdftocio", pdftocio.__version__, file=sys.stderr)
124
+ sys.exit()
125
+ elif o in ("-h", "--help"):
126
+ print(help_s, file=sys.stderr)
127
+ sys.exit()
128
+
129
+ if len(args) < 1:
130
+ print("error: no input pdf is given", file=sys.stderr)
131
+ print(usage_s, file=sys.stderr)
132
+ sys.exit(1)
133
+
134
+ path_in: str = args[0]
135
+ # done parsing arguments
136
+
137
+ try:
138
+ with open_pdf(path_in) as doc:
139
+ if toc_file.isatty() or print_toc:
140
+ # no input from user, switch to output mode and extract the toc
141
+ # of pdf
142
+ toc = read_toc(doc)
143
+ if len(toc) == 0:
144
+ print("error: no table of contents found", file=sys.stderr)
145
+ sys.exit(1)
146
+
147
+ stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8', errors='ignore')
148
+
149
+ if readable:
150
+ print(pprint_toc(toc), file=stdout)
151
+ else:
152
+ print(dump_toc(toc, vpos), end="", file=stdout)
153
+ sys.exit(0)
154
+
155
+ # an input is given, so switch to input mode
156
+ toc = parse_toc(toc_file)
157
+ write_toc(doc, toc)
158
+
159
+ if out is None:
160
+ # add suffix to input name as output
161
+ pfx, ext = os.path.splitext(path_in)
162
+ out = f"{pfx}_out{ext}"
163
+ doc.save(out)
164
+ except ValueError as e:
165
+ if debug:
166
+ raise e
167
+ print("error:", e, file=sys.stderr)
168
+ sys.exit(1)
169
+ except IOError as e:
170
+ if debug:
171
+ raise e
172
+ print("error: unable to open file", file=sys.stderr)
173
+ print(e, file=sys.stderr)
174
+ sys.exit(1)
175
+ except IndexError as e:
176
+ if debug:
177
+ raise e
178
+ print("index error:", e, file=sys.stderr)
179
+ sys.exit(1)
180
+ except KeyboardInterrupt as e:
181
+ if debug:
182
+ raise e
183
+ print("error: interrupted", file=sys.stderr)
184
+ sys.exit(1)
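Review note: the default output name in `main()` comes from splitting the input path at its extension and appending `_out`. A minimal sketch of that rule in isolation follows; the helper name `resolve_out_path` is hypothetical and does not exist in the commit.

```python
import os.path
from typing import Optional


def resolve_out_path(path_in: str, out: Optional[str] = None) -> str:
    """Hypothetical helper mirroring the -o fallback in pdftocio's main()."""
    if out is not None:
        # an explicit -o/--out value always wins
        return out
    # split "dir/book.pdf" into ("dir/book", ".pdf") and insert the suffix
    pfx, ext = os.path.splitext(path_in)
    return f"{pfx}_out{ext}"


if __name__ == "__main__":
    print(resolve_out_path("book.pdf"))
    print(resolve_out_path("book.pdf", "final.pdf"))
```

Because `os.path.splitext` only splits at the last dot, an input without an extension simply gains the `_out` suffix.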
pdftocio/tocio.py ADDED
@@ -0,0 +1,20 @@
+ """Reading and writing table of contents from/to a pdf"""
+
+ from typing import List
+ from fitz import Document
+ from fitzutils import ToCEntry
+
+
+ def write_toc(doc: Document, toc: List[ToCEntry]):
+     """Write table of contents to a document"""
+     fitz_toc = list(map(lambda e: e.to_fitz_entry(), toc))
+     doc.set_toc(fitz_toc)
+
+
+ def read_toc(doc: Document) -> List[ToCEntry]:
+     """Read table of contents from a document"""
+     return [
+         ToCEntry(e[0], e[1], e[2], e[3]['to'].y) if (len(e) == 4 and 'to' in e[3]) else
+         ToCEntry(e[0], e[1], e[2])
+         for e in doc.get_toc(False)
+     ]
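Review note: `read_toc` only reads a vertical position when an entry has a fourth element and that element carries a `'to'` key. The sketch below isolates that check without PyMuPDF. The `Point` namedtuple is a stand-in for the library's point type and `vpos_of` is a hypothetical name; both are assumptions made for illustration.

```python
from collections import namedtuple
from typing import Optional

# Stand-in for PyMuPDF's point object; only the .y attribute matters here.
Point = namedtuple("Point", ["x", "y"])


def vpos_of(entry: list) -> Optional[float]:
    """Return the vertical position of a raw ToC entry, or None if absent."""
    if len(entry) == 4 and 'to' in entry[3]:
        return entry[3]['to'].y
    return None


if __name__ == "__main__":
    print(vpos_of([1, "Intro", 1, {"to": Point(72.0, 100.5)}]))
    print(vpos_of([1, "Intro", 1]))
```

Entries whose destination dictionary lacks `'to'` fall back to the three-field form, which is presumably the case the 1.3.4 changelog entry about a `KeyError` refers to.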
pdftocio/tocparser.py ADDED
@@ -0,0 +1,38 @@
+ """Parser for table of contents csv file"""
+
+ import csv
+ import sys
+
+ from typing import IO, List
+ from fitzutils import ToCEntry
+ from itertools import takewhile
+
+
+ def parse_entry(entry: List) -> ToCEntry:
+     """parse a row in csv to a toc entry"""
+
+     # a somewhat weird hack, csv reader would read spaces as an empty '', so we
+     # only need to count the number of '' before an entry to determine the
+     # heading level
+     indent = len(list(takewhile(lambda x: x == '', entry)))
+     try:
+         toc_entry = ToCEntry(
+             indent // 4 + 1,         # 4 spaces = 1 level
+             entry[indent],           # heading
+             int(entry[indent + 1]),  # pagenum
+             *entry[indent + 2:]      # vpos
+         )
+         return toc_entry
+     except IndexError as e:
+         print(f"Unable to parse toc entry {entry};",
+               f"Need at least {indent + 2} parts but only have {len(entry)}.",
+               "Make sure the page number is present.",
+               file=sys.stderr)
+         raise e
+
+
+ def parse_toc(file: IO) -> List[ToCEntry]:
+     """Parse a toc file to a list of toc entries"""
+     reader = csv.reader(file, lineterminator='\n',
+                         delimiter=' ', quoting=csv.QUOTE_NONNUMERIC)
+     return list(map(parse_entry, reader))
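Review note: the indentation hack in `parse_entry` is easier to see on a small sample. The sketch below is a simplification under stated assumptions: it uses the csv reader's default quoting rather than `QUOTE_NONNUMERIC`, so page numbers arrive as strings and are converted with `int`, and it returns plain tuples instead of `ToCEntry`. The function name `parse_row` and the sample text are hypothetical.

```python
import csv
import io
from itertools import takewhile
from typing import List, Tuple


def parse_row(row: List[str]) -> Tuple[int, str, int]:
    """Turn one space-delimited csv row into (level, heading, page)."""
    # each leading space is read as an empty field, so count them
    indent = len(list(takewhile(lambda x: x == '', row)))
    return indent // 4 + 1, row[indent], int(row[indent + 1])


sample = (
    '"Chapter 1" 1\n'
    '    "Section 1.1" 2\n'
    '        "Subsection 1.1.1" 3\n'
)

rows = csv.reader(io.StringIO(sample), delimiter=' ')
entries = [parse_row(r) for r in rows]

if __name__ == "__main__":
    for level, heading, page in entries:
        print(level, heading, page)
```

Quoting the heading is what keeps a title containing spaces in a single field, while the unquoted leading spaces become the empty fields that encode the level.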
pdfxmeta/__init__.py ADDED
@@ -0,0 +1,5 @@
+ """Extract metadata (fonts, bounding box) for a string in a pdf"""
+
+ __version__ = '1.3.4'
+
+ from .pdfxmeta import extract_meta, dump_meta, dump_toml
pdfxmeta/__main__.py ADDED
@@ -0,0 +1,4 @@
+ from .app import main
+
+ if __name__ == '__main__':
+     main()
pdfxmeta/app.py ADDED
@@ -0,0 +1,147 @@
+ """The executable of pdfxmeta"""
+
+ import getopt
+ import sys
+ import pdfxmeta
+ import io
+
+ from getopt import GetoptError
+ from typing import Optional, TextIO
+ from fitzutils import open_pdf
+ from textwrap import indent
+ from pdfxmeta import dump_meta, dump_toml, extract_meta
+
+
+ usage_s = """
+ usage: pdfxmeta [options] doc.pdf [pattern]
+ """.strip()
+
+ help_s = """
+ usage: pdfxmeta [options] doc.pdf [pattern]
+
+ Extract the metadata for pattern in doc.pdf.
+
+ To use this command, first open up the pdf file with your
+ favorite pdf reader and find the text you want to search
+ for. Then use
+
+     $ pdfxmeta -p 1 in.pdf "Subsection One"
+
+ to find the metadata, mainly the font attributes and
+ bounding box, of lines containing the pattern "Subsection
+ One" on page 1. Specifying a page number is optional but
+ highly recommended, since it greatly reduces the ambiguity
+ of matches and execution time.
+
+ The output of this command can be directly copy-pasted to
+ build a recipe file for pdftocgen. Alternatively, you could
+ also use the --auto or -a flag to output a valid heading
+ filter directly
+
+     $ pdfxmeta -p 1 -a 2 in.pdf "Subsection One" >> recipe.toml
+
+ where the argument of -a is the level of the heading filter,
+ which in this case is 2.
+
+ arguments
+   doc.pdf             path to the input PDF document
+   [pattern]           the pattern to search for (python regex). if not
+                       given, dump the entire document
+
+ options
+   -h, --help          show help
+   -p, --page=PAGE     specify the page to search for (1-based index)
+   -i, --ignore-case   when flag is set, search will be case-insensitive
+   -a, --auto=LEVEL    when flag is set, the output would be a valid
+                       heading filter of the specified heading level in
+                       default settings. it is directly usable by
+                       pdftocgen.
+   -o, --out=FILE      path to the output file. if this flag is not
+                       specified, the default is stdout
+   -V, --version       show version number
+ """.strip()
+
+
+ def print_result(meta: dict) -> str:
+     """pretty print results in a structured manner"""
+     return f"{meta.get('text', '')}:\n{indent(dump_meta(meta), ' ')}"
+
+
+ def main():
+     # parse arguments
+     try:
+         opts, args = getopt.gnu_getopt(
+             sys.argv[1:],
+             "hiVp:a:o:",
+             ["help", "ignore-case", "version", "page=", "auto=", "out="]
+         )
+     except GetoptError as e:
+         print(e, file=sys.stderr)
+         print(usage_s, file=sys.stderr)
+         sys.exit(2)
+
+     ignore_case: bool = False
+     page: Optional[int] = None
+     auto_level: Optional[int] = None
+     out: TextIO = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8', errors='ignore')
+
+     for o, a in opts:
+         if o in ("-i", "--ignore-case"):
+             ignore_case = True
+         elif o in ("-p", "--page"):
+             try:
+                 page = int(a)
+             except ValueError as e:
+                 print("error: invalid page number", file=sys.stderr)
+                 sys.exit(1)
+         elif o in ("-a", "--auto"):
+             try:
+                 auto_level = int(a)
+             except ValueError as e:
+                 print("error: invalid level", file=sys.stderr)
+                 sys.exit(1)
+         elif o in ("-o", "--out"):
+             try:
+                 out = open(a, "w", encoding='utf-8', errors='ignore')
+             except IOError as e:
+                 print("error: can't open file for writing", file=sys.stderr)
+                 print(e, file=sys.stderr)
+                 sys.exit(1)
+         elif o in ("-V", "--version"):
+             print("pdfxmeta", pdfxmeta.__version__, file=sys.stderr)
+             sys.exit()
+         elif o in ("-h", "--help"):
+             print(help_s, file=sys.stderr)
+             sys.exit()
+
+     argc = len(args)
+
+     if argc < 1:
+         print("error: no input pdf is given", file=sys.stderr)
+         print(usage_s, file=sys.stderr)
+         sys.exit(1)
+
+     path_in: str = args[0]
+     pattern: str = ""
+
+     if argc >= 2:
+         pattern = args[1]
+
+     # done parsing arguments
+
+     with open_pdf(path_in) as doc:
+         meta = extract_meta(doc, pattern, page, ignore_case)
+
+         # nothing found
+         if len(meta) == 0:
+             sys.exit(1)
+
+         # should we add \n between each output?
+         addnl = not out.isatty()
+
+         if auto_level:
+             print('\n'.join(
+                 [dump_toml(m, auto_level, addnl) for m in meta]
+             ), file=out)
+         else:
+             print('\n'.join(map(print_result, meta)), file=out)
pdfxmeta/pdfxmeta.py ADDED
@@ -0,0 +1,194 @@
+ """Extract metadata for a string in a pdf file"""
+
+ from toml.encoder import _dump_str, _dump_float
+
+ import re
+
+ from fitz import Document, Page
+ from typing import Optional, List
+
+
+ def extract_meta(doc: Document,
+                  pattern: str,
+                  page: Optional[int] = None,
+                  ign_case: bool = False
+                  ) -> List[dict]:
+     """Extract meta for a `pattern` on `page` in a pdf document
+
+     Arguments
+       doc: document from pymupdf
+       pattern: a regular expression pattern
+       page: page number (1-based index), if None is given, search the
+             entire document, but this is highly discouraged.
+       ign_case: ignore case?
+     """
+     result = []
+
+     if page is None:
+         pages = doc.pages()
+     elif 1 <= page <= doc.page_count:
+         pages = [doc[page - 1]]
+     else:  # page out of range
+         return result
+
+     regex = re.compile(
+         pattern,
+         re.IGNORECASE
+     ) if ign_case else re.compile(pattern)
+
+     # we could parallelize this, but I don't see a reason
+     # to *not* specify a page number
+     for p in pages:
+         found = search_in_page(regex, p)
+         for s in found:
+             s['page_index'] = p.number + 1
+             try:
+                 s['page_label'] = p.get_label()
+             except Exception:
+                 # Fallback if get_label fails due to PyMuPDF version issues
+                 s['page_label'] = ""
+         result.extend(found)
+
+     return result
+
+
+ def search_in_page(regex: re.Pattern, page: Page) -> List[dict]:
+     """Search for `regex` in `page` and extract span meta, skipping pages without a match"""
+     result = []
+
+     # 1. Use simple string search if regex is just a literal (optimization)
+     # But since we have a compiled regex, we might need to extract the pattern if it's simple
+     # Or just use the regex to find matches in the FULL text of the page first?
+     # PyMuPDF's search_for takes a string. It doesn't support regex directly in wrapped core.
+     # However, for the purpose of this tool which claims regex support, we have a dilemma.
+     # But most users searching "Chapter 1" are doing literal searches.
+
+     # If we want to support the user's "Divided World", we need to handle the case where it might be split.
+     # The most robust way for PDF text search is usually:
+     # 1. Get all text (with position).
+     # 2. Run regex on the full text.
+     # 3. Map match back to bbox.
+     # 4. Find spans in bbox.
+
+     # BUT, to keep it simple and fix the immediate "spinning" and "missing" issue:
+     # The previous code iterated every span.
+     # Let's try to be smarter.
+
+     # For now, let's assume the user pattern is often a literal or we can approximate it.
+     # If the user provides a regex, we can't easily use search_for.
+     # However, the user provided "Divided World".
+
+     # Let's fallback to the robust get_text("dict") but optimize the check?
+     # No, get_text("dict") IS the slow part.
+
+     # Alternative:
+     # Use page.get_text("text") -> run regex -> if match, THEN get_text("dict")?
+     # That saves time for pages that DON'T match.
+
+     # Improved Algorithm:
+     # 1. Extract plain text of the page.
+     # 2. If regex doesn't match plain text, SKIP the page. (Huge optimization)
+     # 3. If it does match, perform the detailed span search.
+
+     text_content = page.get_text()
+     if not regex.search(text_content):
+         return []
+
+     # If we are here, there is a match on this page. Now find the exact spans.
+     # Note: If the text is split across spans, the simple span iterator below will STILL fail to extract the specific span metadata for the *whole* match.
+     # But at least it won't spin on empty pages.
+
+     page_meta = page.get_textpage().extractDICT()
+
+     for blk in page_meta.get('blocks', []):
+         for ln in blk.get('lines', []):
+             for spn in ln.get('spans', []):
+                 text = spn.get('text', "")
+                 if regex.search(text):
+                     result.append(spn)
+     return result
+
+
+ def to_bools(var: int) -> str:
+     """Convert int to lowercase bool string"""
+     return str(var != 0).lower()
+
+
+ def dump_meta(spn: dict) -> str:
+     """Dump the span dict from PyMuPDF to TOML compatible string"""
+     result = []
+
+     if 'page_index' in spn:
+         result.append(f"page.index = {spn['page_index']}")
+     if 'page_label' in spn:
+         result.append(f"page.label = \"{spn['page_label']}\"")
+
+     result.append(f"font.name = {_dump_str(spn['font'])}")
+     result.append(f"font.size = {_dump_float(spn['size'])}")
+     result.append(f"font.color = {spn['color']:#08x}")
+
+     flags = spn['flags']
+
+     result.append(f"font.superscript = {to_bools(flags & 0b00001)}")
+     result.append(f"font.italic = {to_bools(flags & 0b00010)}")
+     result.append(f"font.serif = {to_bools(flags & 0b00100)}")
+     result.append(f"font.monospace = {to_bools(flags & 0b01000)}")
+     result.append(f"font.bold = {to_bools(flags & 0b10000)}")
+
+     bbox = spn['bbox']
+
+     result.append(f"bbox.left = {_dump_float(bbox[0])}")
+     result.append(f"bbox.top = {_dump_float(bbox[1])}")
+     result.append(f"bbox.right = {_dump_float(bbox[2])}")
+     result.append(f"bbox.bottom = {_dump_float(bbox[3])}")
+
+     return '\n'.join(result)
+
+
+ def dump_toml(spn: dict, level: int, trail_nl: bool = False) -> str:
+     """Dump a valid TOML directly usable by pdftocgen
+
+     Arguments
+       spn: span dict of the heading
+       level: heading level
+       trail_nl: add trailing new line
+     Returns
+       a valid toml string
+     """
+     result = []
+
+     result.append("[[heading]]")
+     result.append(f"# {spn.get('text', '')}")
+     result.append(f"level = {level}")
+     result.append("greedy = true")
+
+     # strip font subset prefix:
+     # keep only the part after '+', or the whole name if there is no '+'
+     before, sep, after = spn['font'].partition('+')
+     font = after if sep else before
+
+     result.append(f"font.name = {_dump_str(font)}")
+     result.append(f"font.size = {_dump_float(spn['size'])}")
+     result.append("# font.size_tolerance = 1e-5")
+     result.append(f"# font.color = {spn['color']:#08x}")
+
+     flags = spn['flags']
+
+     result.append(f"# font.superscript = {to_bools(flags & 0b00001)}")
+     result.append(f"# font.italic = {to_bools(flags & 0b00010)}")
+     result.append(f"# font.serif = {to_bools(flags & 0b00100)}")
+     result.append(f"# font.monospace = {to_bools(flags & 0b01000)}")
+     result.append(f"# font.bold = {to_bools(flags & 0b10000)}")
+
+     bbox = spn['bbox']
+
+     result.append(f"# bbox.left = {_dump_float(bbox[0])}")
+     result.append(f"# bbox.top = {_dump_float(bbox[1])}")
+     result.append(f"# bbox.right = {_dump_float(bbox[2])}")
+     result.append(f"# bbox.bottom = {_dump_float(bbox[3])}")
+     result.append("# bbox.tolerance = 1e-5")
+
+     if trail_nl:
+         result.append("")
+
+     return '\n'.join(result)
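Review note: `dump_meta` and `dump_toml` repeat the same five bit masks and the same subset-prefix handling. The sketch below shows both ideas in isolation, using the bit values that appear in the code above. The names `FLAG_BITS`, `decode_flags` and `strip_subset_prefix` are hypothetical, and the sample font names are made up for illustration.

```python
from typing import Dict

# Bit masks as used in dump_meta/dump_toml above.
FLAG_BITS = {
    "superscript": 0b00001,
    "italic":      0b00010,
    "serif":       0b00100,
    "monospace":   0b01000,
    "bold":        0b10000,
}


def decode_flags(flags: int) -> Dict[str, bool]:
    """Map a span's integer flags to named booleans."""
    return {name: bool(flags & bit) for name, bit in FLAG_BITS.items()}


def strip_subset_prefix(font: str) -> str:
    """Drop a 'PREFIX+' subset tag from a font name, if one is present."""
    before, sep, after = font.partition('+')
    return after if sep else before


if __name__ == "__main__":
    print(decode_flags(0b10100))
    print(strip_subset_prefix("ABCDEF+Times-Bold"))
```

Keeping the masks in one table would let both dump functions iterate over it instead of repeating the literals, though that refactor is only a suggestion and is not part of this commit.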
poetry.lock ADDED
@@ -0,0 +1,534 @@
1
+ # This file is automatically @generated by Poetry 1.4.2 and should not be changed by hand.
2
+
3
+ [[package]]
4
+ name = "args"
5
+ version = "0.1.0"
6
+ description = "Command Arguments for Humans."
7
+ category = "dev"
8
+ optional = false
9
+ python-versions = "*"
10
+ files = [
11
+ {file = "args-0.1.0.tar.gz", hash = "sha256:a785b8d837625e9b61c39108532d95b85274acd679693b71ebb5156848fcf814"},
12
+ ]
13
+
14
+ [[package]]
15
+ name = "astroid"
16
+ version = "2.11.7"
17
+ description = "An abstract syntax tree for Python with inference support."
18
+ category = "dev"
19
+ optional = false
20
+ python-versions = ">=3.6.2"
21
+ files = [
22
+ {file = "astroid-2.11.7-py3-none-any.whl", hash = "sha256:86b0a340a512c65abf4368b80252754cda17c02cdbbd3f587dddf98112233e7b"},
23
+ {file = "astroid-2.11.7.tar.gz", hash = "sha256:bb24615c77f4837c707669d16907331374ae8a964650a66999da3f5ca68dc946"},
24
+ ]
25
+
26
+ [package.dependencies]
27
+ lazy-object-proxy = ">=1.4.0"
28
+ setuptools = ">=20.0"
29
+ typed-ast = {version = ">=1.4.0,<2.0", markers = "implementation_name == \"cpython\" and python_version < \"3.8\""}
30
+ typing-extensions = {version = ">=3.10", markers = "python_version < \"3.10\""}
31
+ wrapt = ">=1.11,<2"
32
+
33
+ [[package]]
34
+ name = "chardet"
35
+ version = "5.1.0"
36
+ description = "Universal encoding detector for Python 3"
37
+ category = "main"
38
+ optional = false
39
+ python-versions = ">=3.7"
40
+ files = [
41
+ {file = "chardet-5.1.0-py3-none-any.whl", hash = "sha256:362777fb014af596ad31334fde1e8c327dfdb076e1960d1694662d46a6917ab9"},
42
+ {file = "chardet-5.1.0.tar.gz", hash = "sha256:0d62712b956bc154f85fb0a266e2a3c5913c2967e00348701b32411d6def31e5"},
43
+ ]
44
+
45
+ [[package]]
46
+ name = "clint"
47
+ version = "0.5.1"
48
+ description = "Python Command Line Interface Tools"
49
+ category = "dev"
50
+ optional = false
51
+ python-versions = "*"
52
+ files = [
53
+ {file = "clint-0.5.1.tar.gz", hash = "sha256:05224c32b1075563d0b16d0015faaf9da43aa214e4a2140e51f08789e7a4c5aa"},
54
+ ]
55
+
56
+ [package.dependencies]
57
+ args = "*"
58
+
59
+ [[package]]
60
+ name = "colorama"
61
+ version = "0.4.6"
62
+ description = "Cross-platform colored terminal text."
63
+ category = "dev"
64
+ optional = false
65
+ python-versions = "!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*,!=3.5.*,!=3.6.*,>=2.7"
66
+ files = [
67
+ {file = "colorama-0.4.6-py2.py3-none-any.whl", hash = "sha256:4f1d9991f5acc0ca119f9d443620b77f9d6b33703e51011c16baf57afb285fc6"},
68
+ {file = "colorama-0.4.6.tar.gz", hash = "sha256:08695f5cb7ed6e0531a20572697297273c47b8cae5a63ffc6d6ed5c201be6e44"},
69
+ ]
70
+
71
+ [[package]]
72
+ name = "coverage"
73
+ version = "7.2.3"
74
+ description = "Code coverage measurement for Python"
75
+ category = "dev"
76
+ optional = false
77
+ python-versions = ">=3.7"
78
+ files = [
79
+ {file = "coverage-7.2.3-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:e58c0d41d336569d63d1b113bd573db8363bc4146f39444125b7f8060e4e04f5"},
80
+ {file = "coverage-7.2.3-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:344e714bd0fe921fc72d97404ebbdbf9127bac0ca1ff66d7b79efc143cf7c0c4"},
81
+ {file = "coverage-7.2.3-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:974bc90d6f6c1e59ceb1516ab00cf1cdfbb2e555795d49fa9571d611f449bcb2"},
82
+ {file = "coverage-7.2.3-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:0743b0035d4b0e32bc1df5de70fba3059662ace5b9a2a86a9f894cfe66569013"},
83
+ {file = "coverage-7.2.3-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:5d0391fb4cfc171ce40437f67eb050a340fdbd0f9f49d6353a387f1b7f9dd4fa"},
84
+ {file = "coverage-7.2.3-cp310-cp310-musllinux_1_1_aarch64.whl", hash = "sha256:4a42e1eff0ca9a7cb7dc9ecda41dfc7cbc17cb1d02117214be0561bd1134772b"},
85
+ {file = "coverage-7.2.3-cp310-cp310-musllinux_1_1_i686.whl", hash = "sha256:be19931a8dcbe6ab464f3339966856996b12a00f9fe53f346ab3be872d03e257"},
86
+ {file = "coverage-7.2.3-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:72fcae5bcac3333a4cf3b8f34eec99cea1187acd55af723bcbd559adfdcb5535"},
87
+ {file = "coverage-7.2.3-cp310-cp310-win32.whl", hash = "sha256:aeae2aa38395b18106e552833f2a50c27ea0000122bde421c31d11ed7e6f9c91"},
88
+ {file = "coverage-7.2.3-cp310-cp310-win_amd64.whl", hash = "sha256:83957d349838a636e768251c7e9979e899a569794b44c3728eaebd11d848e58e"},
89
+ {file = "coverage-7.2.3-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:dfd393094cd82ceb9b40df4c77976015a314b267d498268a076e940fe7be6b79"},
90
+ {file = "coverage-7.2.3-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:182eb9ac3f2b4874a1f41b78b87db20b66da6b9cdc32737fbbf4fea0c35b23fc"},
91
+ {file = "coverage-7.2.3-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:1bb1e77a9a311346294621be905ea8a2c30d3ad371fc15bb72e98bfcfae532df"},
92
+ {file = "coverage-7.2.3-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:ca0f34363e2634deffd390a0fef1aa99168ae9ed2af01af4a1f5865e362f8623"},
93
+ {file = "coverage-7.2.3-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:55416d7385774285b6e2a5feca0af9652f7f444a4fa3d29d8ab052fafef9d00d"},
94
+ {file = "coverage-7.2.3-cp311-cp311-musllinux_1_1_aarch64.whl", hash = "sha256:06ddd9c0249a0546997fdda5a30fbcb40f23926df0a874a60a8a185bc3a87d93"},
95
+ {file = "coverage-7.2.3-cp311-cp311-musllinux_1_1_i686.whl", hash = "sha256:fff5aaa6becf2c6a1699ae6a39e2e6fb0672c2d42eca8eb0cafa91cf2e9bd312"},
96
+ {file = "coverage-7.2.3-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:ea53151d87c52e98133eb8ac78f1206498c015849662ca8dc246255265d9c3c4"},
97
+ {file = "coverage-7.2.3-cp311-cp311-win32.whl", hash = "sha256:8f6c930fd70d91ddee53194e93029e3ef2aabe26725aa3c2753df057e296b925"},
98
+ {file = "coverage-7.2.3-cp311-cp311-win_amd64.whl", hash = "sha256:fa546d66639d69aa967bf08156eb8c9d0cd6f6de84be9e8c9819f52ad499c910"},
99
+ {file = "coverage-7.2.3-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:b2317d5ed777bf5a033e83d4f1389fd4ef045763141d8f10eb09a7035cee774c"},
100
+ {file = "coverage-7.2.3-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:be9824c1c874b73b96288c6d3de793bf7f3a597770205068c6163ea1f326e8b9"},
101
+ {file = "coverage-7.2.3-cp37-cp37m-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:2c3b2803e730dc2797a017335827e9da6da0e84c745ce0f552e66400abdfb9a1"},
102
+ {file = "coverage-7.2.3-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:8f69770f5ca1994cb32c38965e95f57504d3aea96b6c024624fdd5bb1aa494a1"},
103
+ {file = "coverage-7.2.3-cp37-cp37m-musllinux_1_1_aarch64.whl", hash = "sha256:1127b16220f7bfb3f1049ed4a62d26d81970a723544e8252db0efde853268e21"},
104
+ {file = "coverage-7.2.3-cp37-cp37m-musllinux_1_1_i686.whl", hash = "sha256:aa784405f0c640940595fa0f14064d8e84aff0b0f762fa18393e2760a2cf5841"},
105
+ {file = "coverage-7.2.3-cp37-cp37m-musllinux_1_1_x86_64.whl", hash = "sha256:3146b8e16fa60427e03884301bf8209221f5761ac754ee6b267642a2fd354c48"},
106
+ {file = "coverage-7.2.3-cp37-cp37m-win32.whl", hash = "sha256:1fd78b911aea9cec3b7e1e2622c8018d51c0d2bbcf8faaf53c2497eb114911c1"},
107
+ {file = "coverage-7.2.3-cp37-cp37m-win_amd64.whl", hash = "sha256:0f3736a5d34e091b0a611964c6262fd68ca4363df56185902528f0b75dbb9c1f"},
108
+ {file = "coverage-7.2.3-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:981b4df72c93e3bc04478153df516d385317628bd9c10be699c93c26ddcca8ab"},
109
+ {file = "coverage-7.2.3-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:c0045f8f23a5fb30b2eb3b8a83664d8dc4fb58faddf8155d7109166adb9f2040"},
110
+ {file = "coverage-7.2.3-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:f760073fcf8f3d6933178d67754f4f2d4e924e321f4bb0dcef0424ca0215eba1"},
111
+ {file = "coverage-7.2.3-cp38-cp38-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:c86bd45d1659b1ae3d0ba1909326b03598affbc9ed71520e0ff8c31a993ad911"},
112
+ {file = "coverage-7.2.3-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:172db976ae6327ed4728e2507daf8a4de73c7cc89796483e0a9198fd2e47b462"},
113
+ {file = "coverage-7.2.3-cp38-cp38-musllinux_1_1_aarch64.whl", hash = "sha256:d2a3a6146fe9319926e1d477842ca2a63fe99af5ae690b1f5c11e6af074a6b5c"},
114
+ {file = "coverage-7.2.3-cp38-cp38-musllinux_1_1_i686.whl", hash = "sha256:f649dd53833b495c3ebd04d6eec58479454a1784987af8afb77540d6c1767abd"},
115
+ {file = "coverage-7.2.3-cp38-cp38-musllinux_1_1_x86_64.whl", hash = "sha256:7c4ed4e9f3b123aa403ab424430b426a1992e6f4c8fd3cb56ea520446e04d152"},
116
+ {file = "coverage-7.2.3-cp38-cp38-win32.whl", hash = "sha256:eb0edc3ce9760d2f21637766c3aa04822030e7451981ce569a1b3456b7053f22"},
117
+ {file = "coverage-7.2.3-cp38-cp38-win_amd64.whl", hash = "sha256:63cdeaac4ae85a179a8d6bc09b77b564c096250d759eed343a89d91bce8b6367"},
118
+ {file = "coverage-7.2.3-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:20d1a2a76bb4eb00e4d36b9699f9b7aba93271c9c29220ad4c6a9581a0320235"},
119
+ {file = "coverage-7.2.3-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:4ea748802cc0de4de92ef8244dd84ffd793bd2e7be784cd8394d557a3c751e21"},
120
+ {file = "coverage-7.2.3-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:21b154aba06df42e4b96fc915512ab39595105f6c483991287021ed95776d934"},
121
+ {file = "coverage-7.2.3-cp39-cp39-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:fd214917cabdd6f673a29d708574e9fbdb892cb77eb426d0eae3490d95ca7859"},
122
+ {file = "coverage-7.2.3-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:2c2e58e45fe53fab81f85474e5d4d226eeab0f27b45aa062856c89389da2f0d9"},
123
+ {file = "coverage-7.2.3-cp39-cp39-musllinux_1_1_aarch64.whl", hash = "sha256:87ecc7c9a1a9f912e306997ffee020297ccb5ea388421fe62a2a02747e4d5539"},
124
+ {file = "coverage-7.2.3-cp39-cp39-musllinux_1_1_i686.whl", hash = "sha256:387065e420aed3c71b61af7e82c7b6bc1c592f7e3c7a66e9f78dd178699da4fe"},
125
+ {file = "coverage-7.2.3-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:ea3f5bc91d7d457da7d48c7a732beaf79d0c8131df3ab278e6bba6297e23c6c4"},
126
+ {file = "coverage-7.2.3-cp39-cp39-win32.whl", hash = "sha256:ae7863a1d8db6a014b6f2ff9c1582ab1aad55a6d25bac19710a8df68921b6e30"},
127
+ {file = "coverage-7.2.3-cp39-cp39-win_amd64.whl", hash = "sha256:3f04becd4fcda03c0160d0da9c8f0c246bc78f2f7af0feea1ec0930e7c93fa4a"},
128
+ {file = "coverage-7.2.3-pp37.pp38.pp39-none-any.whl", hash = "sha256:965ee3e782c7892befc25575fa171b521d33798132692df428a09efacaffe8d0"},
129
+ {file = "coverage-7.2.3.tar.gz", hash = "sha256:d298c2815fa4891edd9abe5ad6e6cb4207104c7dd9fd13aea3fdebf6f9b91259"},
130
+ ]
131
+
132
+ [package.extras]
133
+ toml = ["tomli"]
134
+
135
+ [[package]]
136
+ name = "dill"
137
+ version = "0.3.6"
138
+ description = "serialize all of python"
139
+ category = "dev"
140
+ optional = false
141
+ python-versions = ">=3.7"
142
+ files = [
143
+ {file = "dill-0.3.6-py3-none-any.whl", hash = "sha256:a07ffd2351b8c678dfc4a856a3005f8067aea51d6ba6c700796a4d9e280f39f0"},
144
+ {file = "dill-0.3.6.tar.gz", hash = "sha256:e5db55f3687856d8fbdab002ed78544e1c4559a130302693d839dfe8f93f2373"},
145
+ ]
146
+
147
+ [package.extras]
148
+ graph = ["objgraph (>=1.7.2)"]
149
+
150
+ [[package]]
151
+ name = "isort"
152
+ version = "5.11.5"
153
+ description = "A Python utility / library to sort Python imports."
154
+ category = "dev"
155
+ optional = false
156
+ python-versions = ">=3.7.0"
157
+ files = [
158
+ {file = "isort-5.11.5-py3-none-any.whl", hash = "sha256:ba1d72fb2595a01c7895a5128f9585a5cc4b6d395f1c8d514989b9a7eb2a8746"},
159
+ {file = "isort-5.11.5.tar.gz", hash = "sha256:6be1f76a507cb2ecf16c7cf14a37e41609ca082330be4e3436a18ef74add55db"},
160
+ ]
161
+
162
+ [package.extras]
163
+ colors = ["colorama (>=0.4.3,<0.5.0)"]
164
+ pipfile-deprecated-finder = ["pip-shims (>=0.5.2)", "pipreqs", "requirementslib"]
165
+ plugins = ["setuptools"]
166
+ requirements-deprecated-finder = ["pip-api", "pipreqs"]
167
+
168
+ [[package]]
169
+ name = "jedi"
170
+ version = "0.17.2"
171
+ description = "An autocompletion tool for Python that can be used for text editors."
172
+ category = "dev"
173
+ optional = false
174
+ python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*"
175
+ files = [
176
+ {file = "jedi-0.17.2-py2.py3-none-any.whl", hash = "sha256:98cc583fa0f2f8304968199b01b6b4b94f469a1f4a74c1560506ca2a211378b5"},
177
+ {file = "jedi-0.17.2.tar.gz", hash = "sha256:86ed7d9b750603e4ba582ea8edc678657fb4007894a12bcf6f4bb97892f31d20"},
178
+ ]
179
+
180
+ [package.dependencies]
181
+ parso = ">=0.7.0,<0.8.0"
182
+
183
+ [package.extras]
184
+ qa = ["flake8 (==3.7.9)"]
185
+ testing = ["Django (<3.1)", "colorama", "docopt", "pytest (>=3.9.0,<5.0.0)"]
186
+
187
+ [[package]]
188
+ name = "lazy-object-proxy"
189
+ version = "1.9.0"
190
+ description = "A fast and thorough lazy object proxy."
191
+ category = "dev"
192
+ optional = false
193
+ python-versions = ">=3.7"
194
+ files = [
195
+ {file = "lazy-object-proxy-1.9.0.tar.gz", hash = "sha256:659fb5809fa4629b8a1ac5106f669cfc7bef26fbb389dda53b3e010d1ac4ebae"},
196
+ {file = "lazy_object_proxy-1.9.0-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:b40387277b0ed2d0602b8293b94d7257e17d1479e257b4de114ea11a8cb7f2d7"},
197
+ {file = "lazy_object_proxy-1.9.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:e8c6cfb338b133fbdbc5cfaa10fe3c6aeea827db80c978dbd13bc9dd8526b7d4"},
198
+ {file = "lazy_object_proxy-1.9.0-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:721532711daa7db0d8b779b0bb0318fa87af1c10d7fe5e52ef30f8eff254d0cd"},
199
+ {file = "lazy_object_proxy-1.9.0-cp310-cp310-musllinux_1_1_aarch64.whl", hash = "sha256:66a3de4a3ec06cd8af3f61b8e1ec67614fbb7c995d02fa224813cb7afefee701"},
200
+ {file = "lazy_object_proxy-1.9.0-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:1aa3de4088c89a1b69f8ec0dcc169aa725b0ff017899ac568fe44ddc1396df46"},
201
+ {file = "lazy_object_proxy-1.9.0-cp310-cp310-win32.whl", hash = "sha256:f0705c376533ed2a9e5e97aacdbfe04cecd71e0aa84c7c0595d02ef93b6e4455"},
202
+ {file = "lazy_object_proxy-1.9.0-cp310-cp310-win_amd64.whl", hash = "sha256:ea806fd4c37bf7e7ad82537b0757999264d5f70c45468447bb2b91afdbe73a6e"},
203
+ {file = "lazy_object_proxy-1.9.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:946d27deaff6cf8452ed0dba83ba38839a87f4f7a9732e8f9fd4107b21e6ff07"},
204
+ {file = "lazy_object_proxy-1.9.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:79a31b086e7e68b24b99b23d57723ef7e2c6d81ed21007b6281ebcd1688acb0a"},
205
+ {file = "lazy_object_proxy-1.9.0-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:f699ac1c768270c9e384e4cbd268d6e67aebcfae6cd623b4d7c3bfde5a35db59"},
206
+ {file = "lazy_object_proxy-1.9.0-cp311-cp311-musllinux_1_1_aarch64.whl", hash = "sha256:bfb38f9ffb53b942f2b5954e0f610f1e721ccebe9cce9025a38c8ccf4a5183a4"},
207
+ {file = "lazy_object_proxy-1.9.0-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:189bbd5d41ae7a498397287c408617fe5c48633e7755287b21d741f7db2706a9"},
208
+ {file = "lazy_object_proxy-1.9.0-cp311-cp311-win32.whl", hash = "sha256:81fc4d08b062b535d95c9ea70dbe8a335c45c04029878e62d744bdced5141586"},
209
+ {file = "lazy_object_proxy-1.9.0-cp311-cp311-win_amd64.whl", hash = "sha256:f2457189d8257dd41ae9b434ba33298aec198e30adf2dcdaaa3a28b9994f6adb"},
210
+ {file = "lazy_object_proxy-1.9.0-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:d9e25ef10a39e8afe59a5c348a4dbf29b4868ab76269f81ce1674494e2565a6e"},
211
+ {file = "lazy_object_proxy-1.9.0-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:cbf9b082426036e19c6924a9ce90c740a9861e2bdc27a4834fd0a910742ac1e8"},
212
+ {file = "lazy_object_proxy-1.9.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:9f5fa4a61ce2438267163891961cfd5e32ec97a2c444e5b842d574251ade27d2"},
213
+ {file = "lazy_object_proxy-1.9.0-cp37-cp37m-musllinux_1_1_aarch64.whl", hash = "sha256:8fa02eaab317b1e9e03f69aab1f91e120e7899b392c4fc19807a8278a07a97e8"},
214
+ {file = "lazy_object_proxy-1.9.0-cp37-cp37m-musllinux_1_1_x86_64.whl", hash = "sha256:e7c21c95cae3c05c14aafffe2865bbd5e377cfc1348c4f7751d9dc9a48ca4bda"},
215
+ {file = "lazy_object_proxy-1.9.0-cp37-cp37m-win32.whl", hash = "sha256:f12ad7126ae0c98d601a7ee504c1122bcef553d1d5e0c3bfa77b16b3968d2734"},
216
+ {file = "lazy_object_proxy-1.9.0-cp37-cp37m-win_amd64.whl", hash = "sha256:edd20c5a55acb67c7ed471fa2b5fb66cb17f61430b7a6b9c3b4a1e40293b1671"},
217
+ {file = "lazy_object_proxy-1.9.0-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:2d0daa332786cf3bb49e10dc6a17a52f6a8f9601b4cf5c295a4f85854d61de63"},
218
+ {file = "lazy_object_proxy-1.9.0-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:9cd077f3d04a58e83d04b20e334f678c2b0ff9879b9375ed107d5d07ff160171"},
219
+ {file = "lazy_object_proxy-1.9.0-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:660c94ea760b3ce47d1855a30984c78327500493d396eac4dfd8bd82041b22be"},
220
+ {file = "lazy_object_proxy-1.9.0-cp38-cp38-musllinux_1_1_aarch64.whl", hash = "sha256:212774e4dfa851e74d393a2370871e174d7ff0ebc980907723bb67d25c8a7c30"},
221
+ {file = "lazy_object_proxy-1.9.0-cp38-cp38-musllinux_1_1_x86_64.whl", hash = "sha256:f0117049dd1d5635bbff65444496c90e0baa48ea405125c088e93d9cf4525b11"},
222
+ {file = "lazy_object_proxy-1.9.0-cp38-cp38-win32.whl", hash = "sha256:0a891e4e41b54fd5b8313b96399f8b0e173bbbfc03c7631f01efbe29bb0bcf82"},
223
+ {file = "lazy_object_proxy-1.9.0-cp38-cp38-win_amd64.whl", hash = "sha256:9990d8e71b9f6488e91ad25f322898c136b008d87bf852ff65391b004da5e17b"},
224
+ {file = "lazy_object_proxy-1.9.0-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:9e7551208b2aded9c1447453ee366f1c4070602b3d932ace044715d89666899b"},
225
+ {file = "lazy_object_proxy-1.9.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:5f83ac4d83ef0ab017683d715ed356e30dd48a93746309c8f3517e1287523ef4"},
226
+ {file = "lazy_object_proxy-1.9.0-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:7322c3d6f1766d4ef1e51a465f47955f1e8123caee67dd641e67d539a534d006"},
227
+ {file = "lazy_object_proxy-1.9.0-cp39-cp39-musllinux_1_1_aarch64.whl", hash = "sha256:18b78ec83edbbeb69efdc0e9c1cb41a3b1b1ed11ddd8ded602464c3fc6020494"},
228
+ {file = "lazy_object_proxy-1.9.0-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:09763491ce220c0299688940f8dc2c5d05fd1f45af1e42e636b2e8b2303e4382"},
229
+ {file = "lazy_object_proxy-1.9.0-cp39-cp39-win32.whl", hash = "sha256:9090d8e53235aa280fc9239a86ae3ea8ac58eff66a705fa6aa2ec4968b95c821"},
230
+ {file = "lazy_object_proxy-1.9.0-cp39-cp39-win_amd64.whl", hash = "sha256:db1c1722726f47e10e0b5fdbf15ac3b8adb58c091d12b3ab713965795036985f"},
231
+ ]
232
+
233
+ [[package]]
234
+ name = "mamba"
235
+ version = "0.11.2"
236
+ description = "The definitive testing tool for Python. Born under the banner of Behavior Driven Development."
237
+ category = "dev"
238
+ optional = false
239
+ python-versions = "*"
240
+ files = [
241
+ {file = "mamba-0.11.2.tar.gz", hash = "sha256:75cfc6dfd287dcccaf86dd753cf48e0a7337487c7c3fafda05a6a67ded6da496"},
242
+ ]
243
+
244
+ [package.dependencies]
245
+ clint = "*"
246
+ coverage = "*"
247
+
248
+ [[package]]
249
+ name = "mccabe"
250
+ version = "0.7.0"
251
+ description = "McCabe checker, plugin for flake8"
252
+ category = "dev"
253
+ optional = false
254
+ python-versions = ">=3.6"
255
+ files = [
256
+ {file = "mccabe-0.7.0-py2.py3-none-any.whl", hash = "sha256:6c2d30ab6be0e4a46919781807b4f0d834ebdd6c6e3dca0bda5a15f863427b6e"},
257
+ {file = "mccabe-0.7.0.tar.gz", hash = "sha256:348e0240c33b60bbdf4e523192ef919f28cb2c3d7d5c7794f74009290f236325"},
258
+ ]
259
+
260
+ [[package]]
261
+ name = "parso"
262
+ version = "0.7.1"
263
+ description = "A Python Parser"
264
+ category = "dev"
265
+ optional = false
266
+ python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*"
267
+ files = [
268
+ {file = "parso-0.7.1-py2.py3-none-any.whl", hash = "sha256:97218d9159b2520ff45eb78028ba8b50d2bc61dcc062a9682666f2dc4bd331ea"},
269
+ {file = "parso-0.7.1.tar.gz", hash = "sha256:caba44724b994a8a5e086460bb212abc5a8bc46951bf4a9a1210745953622eb9"},
270
+ ]
271
+
272
+ [package.extras]
273
+ testing = ["docopt", "pytest (>=3.0.7)"]
274
+
275
+ [[package]]
276
+ name = "platformdirs"
277
+ version = "3.2.0"
278
+ description = "A small Python package for determining appropriate platform-specific dirs, e.g. a \"user data dir\"."
279
+ category = "dev"
280
+ optional = false
281
+ python-versions = ">=3.7"
282
+ files = [
283
+ {file = "platformdirs-3.2.0-py3-none-any.whl", hash = "sha256:ebe11c0d7a805086e99506aa331612429a72ca7cd52a1f0d277dc4adc20cb10e"},
284
+ {file = "platformdirs-3.2.0.tar.gz", hash = "sha256:d5b638ca397f25f979350ff789db335903d7ea010ab28903f57b27e1b16c2b08"},
285
+ ]
286
+
287
+ [package.dependencies]
288
+ typing-extensions = {version = ">=4.5", markers = "python_version < \"3.8\""}
289
+
290
+ [package.extras]
291
+ docs = ["furo (>=2022.12.7)", "proselint (>=0.13)", "sphinx (>=6.1.3)", "sphinx-autodoc-typehints (>=1.22,!=1.23.4)"]
292
+ test = ["appdirs (==1.4.4)", "covdefaults (>=2.3)", "pytest (>=7.2.2)", "pytest-cov (>=4)", "pytest-mock (>=3.10)"]
293
+
294
+ [[package]]
295
+ name = "pylint"
296
+ version = "2.13.9"
297
+ description = "python code static checker"
298
+ category = "dev"
299
+ optional = false
300
+ python-versions = ">=3.6.2"
301
+ files = [
302
+ {file = "pylint-2.13.9-py3-none-any.whl", hash = "sha256:705c620d388035bdd9ff8b44c5bcdd235bfb49d276d488dd2c8ff1736aa42526"},
303
+ {file = "pylint-2.13.9.tar.gz", hash = "sha256:095567c96e19e6f57b5b907e67d265ff535e588fe26b12b5ebe1fc5645b2c731"},
304
+ ]
305
+
306
+ [package.dependencies]
307
+ astroid = ">=2.11.5,<=2.12.0-dev0"
308
+ colorama = {version = "*", markers = "sys_platform == \"win32\""}
309
+ dill = ">=0.2"
310
+ isort = ">=4.2.5,<6"
311
+ mccabe = ">=0.6,<0.8"
312
+ platformdirs = ">=2.2.0"
313
+ tomli = {version = ">=1.1.0", markers = "python_version < \"3.11\""}
314
+ typing-extensions = {version = ">=3.10.0", markers = "python_version < \"3.10\""}
315
+
316
+ [package.extras]
317
+ testutil = ["gitpython (>3)"]
318
+
319
+ [[package]]
320
+ name = "pymupdf"
321
+ version = "1.22.1"
322
+ description = "Python bindings for the PDF toolkit and renderer MuPDF"
323
+ category = "main"
324
+ optional = false
325
+ python-versions = ">=3.7"
326
+ files = [
327
+ {file = "PyMuPDF-1.22.1-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:6bda7a64a1263f1c2b6421ae8803db50d4c8a67de95e05d7a38c313de913b0de"},
328
+ {file = "PyMuPDF-1.22.1-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:b5f62ad244b04b7aa5e7d50b06b8bbc582b2f1d0f2c66013051463d63dfe6c5e"},
329
+ {file = "PyMuPDF-1.22.1-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:ce633b9d522528959988647dfbd2c9144ad5422dd75e89e60039da36a412fd3c"},
330
+ {file = "PyMuPDF-1.22.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:733e7b87765ea55202b042b7c84c6b94185ee29fe3a2bd2ee02681c0fd584033"},
331
+ {file = "PyMuPDF-1.22.1-cp310-cp310-win32.whl", hash = "sha256:701499f0a17ccc8dd80707dbeb3a2e60657a6bdc05be7c8c69fa60eb134e1805"},
332
+ {file = "PyMuPDF-1.22.1-cp310-cp310-win_amd64.whl", hash = "sha256:81fa90d157ef7b2ecd72eedafe9db56d3b0f8c3b392d7a2057f659bfcc1f7cad"},
333
+ {file = "PyMuPDF-1.22.1-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:4edac1dd8e5c35b55420925b5486bec4427b07a073cd03f6081b7234ed37217e"},
334
+ {file = "PyMuPDF-1.22.1-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:7744b9853fc55df75f6d37a376432eddd450c1d2072f6ef66b392b7229bccdc6"},
335
+ {file = "PyMuPDF-1.22.1-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:711adc70d664cdd5d361154bb3485546eaa5e8a90827db6abf9c42ca292aa9e1"},
336
+ {file = "PyMuPDF-1.22.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:a1d77a3057ad7fc3e2e02e5fedd53199206a49c4b4c5e3ee75458c17d6b739cb"},
337
+ {file = "PyMuPDF-1.22.1-cp311-cp311-win32.whl", hash = "sha256:b5eca48ea55eafcea68b14669a9f5030c15056431b10710d863de9f9a6b1a0ce"},
338
+ {file = "PyMuPDF-1.22.1-cp311-cp311-win_amd64.whl", hash = "sha256:8e0bfbd6195f45326f9182fff04ac2af9568d78fc1f32dcfa15f84a302d8aafe"},
339
+ {file = "PyMuPDF-1.22.1-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:440efca115e70c8cdfc492e98b182e24c565d8e68f26754e28e61cf108a915d9"},
340
+ {file = "PyMuPDF-1.22.1-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a70ab2d38b366c7237adce7d54f3028a7825f165a73c137a1746a6b592d26bb2"},
341
+ {file = "PyMuPDF-1.22.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:7e4a924ffecb8046fbfe7dff9b69f9938389f094dccab07a378850bf9f889c62"},
342
+ {file = "PyMuPDF-1.22.1-cp37-cp37m-win32.whl", hash = "sha256:24e66c2ff4d6cfee5b082c3e2c92b40214799888bf2efcca1f70108c3dfedddb"},
343
+ {file = "PyMuPDF-1.22.1-cp37-cp37m-win_amd64.whl", hash = "sha256:51504bfa2ee207c5c1a38d47b4b91af1bacbd8937b959d947d81fc8f7e023bd8"},
344
+ {file = "PyMuPDF-1.22.1-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:219337a3be00df2bf65071d5e4e1e6759afd06310d4ec7b1c9694a5b03b5d8d6"},
345
+ {file = "PyMuPDF-1.22.1-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:050719cb42a8847d564af1d8509d7290176e7c4fde6da7be5751303fa8237aed"},
346
+ {file = "PyMuPDF-1.22.1-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:5871b9e38e68b92533fb7c6fbe3eb7b059f5071d4c2e3ff51cedcc73c994afbc"},
347
+ {file = "PyMuPDF-1.22.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:a5a0332d6dac4ebf32cb7f0c8639b22b56c9475cb87bc0a0361f9cdc9c2d08a1"},
348
+ {file = "PyMuPDF-1.22.1-cp38-cp38-win32.whl", hash = "sha256:127985812c4a2f0106375c4f4916ca68c1559d6b224a050ce75393e454333995"},
349
+ {file = "PyMuPDF-1.22.1-cp38-cp38-win_amd64.whl", hash = "sha256:99764c46fb8df253a3ea9fbb13b132f205561d6227b0d00e673998b18d7280eb"},
350
+ {file = "PyMuPDF-1.22.1-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:fdb21332d28567e278008dd6130564ac0f5de8aff364a1e7809a70a0f969df26"},
351
+ {file = "PyMuPDF-1.22.1-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:88202e42d957a41deff212dcb1d8e16e469d21d09a72ab372ee2f173a22112c8"},
352
+ {file = "PyMuPDF-1.22.1-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:36b7fd85f5813045f10b65caf4cbdad03b51b07076f07b205853a1e44c898e34"},
353
+ {file = "PyMuPDF-1.22.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:45e601f7b1ee2a0c1a261bb0179eba4a9899117404eccf0a573e6497ed507ea8"},
354
+ {file = "PyMuPDF-1.22.1-cp39-cp39-win32.whl", hash = "sha256:c610acdbd2f2d994130341559f26c098df546a1fc187adee3b63a0f489310808"},
355
+ {file = "PyMuPDF-1.22.1-cp39-cp39-win_amd64.whl", hash = "sha256:af1e6d5dd122c097f23a7e89f8c2197310e85a4c8e8f63ff94444188d9bc0a4e"},
356
+ {file = "PyMuPDF-1.22.1.tar.gz", hash = "sha256:ad34bba78ce147cee50e1dc30fa16f29135a4c3d6a2b1c1b0403ebbcc9fbe4be"},
357
+ ]
358
+
359
+ [[package]]
360
+ name = "setuptools"
361
+ version = "67.7.0"
362
+ description = "Easily download, build, install, upgrade, and uninstall Python packages"
363
+ category = "dev"
364
+ optional = false
365
+ python-versions = ">=3.7"
366
+ files = [
367
+ {file = "setuptools-67.7.0-py3-none-any.whl", hash = "sha256:888be97fde8cc3afd60f7784e678fa29ee13c4e5362daa7104a93bba33646c50"},
368
+ {file = "setuptools-67.7.0.tar.gz", hash = "sha256:b7e53a01c6c654d26d2999ee033d8c6125e5fa55f03b7b193f937ae7ac999f22"},
369
+ ]
370
+
371
+ [package.extras]
372
+ docs = ["furo", "jaraco.packaging (>=9)", "jaraco.tidelift (>=1.4)", "pygments-github-lexers (==0.0.5)", "rst.linker (>=1.9)", "sphinx (>=3.5)", "sphinx-favicon", "sphinx-hoverxref (<2)", "sphinx-inline-tabs", "sphinx-lint", "sphinx-notfound-page (==0.8.3)", "sphinx-reredirects", "sphinxcontrib-towncrier"]
373
+ testing = ["build[virtualenv]", "filelock (>=3.4.0)", "flake8 (<5)", "flake8-2020", "ini2toml[lite] (>=0.9)", "jaraco.envs (>=2.2)", "jaraco.path (>=3.2.0)", "pip (>=19.1)", "pip-run (>=8.8)", "pytest (>=6)", "pytest-black (>=0.3.7)", "pytest-checkdocs (>=2.4)", "pytest-cov", "pytest-enabler (>=1.3)", "pytest-flake8", "pytest-mypy (>=0.9.1)", "pytest-perf", "pytest-timeout", "pytest-xdist", "tomli-w (>=1.0.0)", "virtualenv (>=13.0.0)", "wheel"]
374
+ testing-integration = ["build[virtualenv]", "filelock (>=3.4.0)", "jaraco.envs (>=2.2)", "jaraco.path (>=3.2.0)", "pytest", "pytest-enabler", "pytest-xdist", "tomli", "virtualenv (>=13.0.0)", "wheel"]
375
+
376
+ [[package]]
377
+ name = "toml"
378
+ version = "0.10.2"
379
+ description = "Python Library for Tom's Obvious, Minimal Language"
380
+ category = "main"
381
+ optional = false
382
+ python-versions = ">=2.6, !=3.0.*, !=3.1.*, !=3.2.*"
383
+ files = [
384
+ {file = "toml-0.10.2-py2.py3-none-any.whl", hash = "sha256:806143ae5bfb6a3c6e736a764057db0e6a0e05e338b5630894a5f779cabb4f9b"},
385
+ {file = "toml-0.10.2.tar.gz", hash = "sha256:b3bda1d108d5dd99f4a20d24d9c348e91c4db7ab1b749200bded2f839ccbe68f"},
386
+ ]
387
+
388
+ [[package]]
389
+ name = "tomli"
390
+ version = "2.0.1"
391
+ description = "A lil' TOML parser"
392
+ category = "dev"
393
+ optional = false
394
+ python-versions = ">=3.7"
395
+ files = [
396
+ {file = "tomli-2.0.1-py3-none-any.whl", hash = "sha256:939de3e7a6161af0c887ef91b7d41a53e7c5a1ca976325f429cb46ea9bc30ecc"},
397
+ {file = "tomli-2.0.1.tar.gz", hash = "sha256:de526c12914f0c550d15924c62d72abc48d6fe7364aa87328337a31007fe8a4f"},
398
+ ]
399
+
400
+ [[package]]
401
+ name = "typed-ast"
402
+ version = "1.5.4"
403
+ description = "a fork of Python 2 and 3 ast modules with type comment support"
404
+ category = "dev"
405
+ optional = false
406
+ python-versions = ">=3.6"
407
+ files = [
408
+ {file = "typed_ast-1.5.4-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:669dd0c4167f6f2cd9f57041e03c3c2ebf9063d0757dc89f79ba1daa2bfca9d4"},
409
+ {file = "typed_ast-1.5.4-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:211260621ab1cd7324e0798d6be953d00b74e0428382991adfddb352252f1d62"},
410
+ {file = "typed_ast-1.5.4-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:267e3f78697a6c00c689c03db4876dd1efdfea2f251a5ad6555e82a26847b4ac"},
411
+ {file = "typed_ast-1.5.4-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl", hash = "sha256:c542eeda69212fa10a7ada75e668876fdec5f856cd3d06829e6aa64ad17c8dfe"},
412
+ {file = "typed_ast-1.5.4-cp310-cp310-win_amd64.whl", hash = "sha256:a9916d2bb8865f973824fb47436fa45e1ebf2efd920f2b9f99342cb7fab93f72"},
413
+ {file = "typed_ast-1.5.4-cp36-cp36m-macosx_10_9_x86_64.whl", hash = "sha256:79b1e0869db7c830ba6a981d58711c88b6677506e648496b1f64ac7d15633aec"},
414
+ {file = "typed_ast-1.5.4-cp36-cp36m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a94d55d142c9265f4ea46fab70977a1944ecae359ae867397757d836ea5a3f47"},
415
+ {file = "typed_ast-1.5.4-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl", hash = "sha256:183afdf0ec5b1b211724dfef3d2cad2d767cbefac291f24d69b00546c1837fb6"},
416
+ {file = "typed_ast-1.5.4-cp36-cp36m-win_amd64.whl", hash = "sha256:639c5f0b21776605dd6c9dbe592d5228f021404dafd377e2b7ac046b0349b1a1"},
417
+ {file = "typed_ast-1.5.4-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:cf4afcfac006ece570e32d6fa90ab74a17245b83dfd6655a6f68568098345ff6"},
418
+ {file = "typed_ast-1.5.4-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:ed855bbe3eb3715fca349c80174cfcfd699c2f9de574d40527b8429acae23a66"},
419
+ {file = "typed_ast-1.5.4-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl", hash = "sha256:6778e1b2f81dfc7bc58e4b259363b83d2e509a65198e85d5700dfae4c6c8ff1c"},
420
+ {file = "typed_ast-1.5.4-cp37-cp37m-win_amd64.whl", hash = "sha256:0261195c2062caf107831e92a76764c81227dae162c4f75192c0d489faf751a2"},
421
+ {file = "typed_ast-1.5.4-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:2efae9db7a8c05ad5547d522e7dbe62c83d838d3906a3716d1478b6c1d61388d"},
422
+ {file = "typed_ast-1.5.4-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:7d5d014b7daa8b0bf2eaef684295acae12b036d79f54178b92a2b6a56f92278f"},
423
+ {file = "typed_ast-1.5.4-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:370788a63915e82fd6f212865a596a0fefcbb7d408bbbb13dea723d971ed8bdc"},
424
+ {file = "typed_ast-1.5.4-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl", hash = "sha256:4e964b4ff86550a7a7d56345c7864b18f403f5bd7380edf44a3c1fb4ee7ac6c6"},
425
+ {file = "typed_ast-1.5.4-cp38-cp38-win_amd64.whl", hash = "sha256:683407d92dc953c8a7347119596f0b0e6c55eb98ebebd9b23437501b28dcbb8e"},
426
+ {file = "typed_ast-1.5.4-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:4879da6c9b73443f97e731b617184a596ac1235fe91f98d279a7af36c796da35"},
427
+ {file = "typed_ast-1.5.4-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:3e123d878ba170397916557d31c8f589951e353cc95fb7f24f6bb69adc1a8a97"},
428
+ {file = "typed_ast-1.5.4-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:ebd9d7f80ccf7a82ac5f88c521115cc55d84e35bf8b446fcd7836eb6b98929a3"},
429
+ {file = "typed_ast-1.5.4-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl", hash = "sha256:98f80dee3c03455e92796b58b98ff6ca0b2a6f652120c263efdba4d6c5e58f72"},
430
+ {file = "typed_ast-1.5.4-cp39-cp39-win_amd64.whl", hash = "sha256:0fdbcf2fef0ca421a3f5912555804296f0b0960f0418c440f5d6d3abb549f3e1"},
431
+ {file = "typed_ast-1.5.4.tar.gz", hash = "sha256:39e21ceb7388e4bb37f4c679d72707ed46c2fbf2a5609b8b8ebc4b067d977df2"},
432
+ ]
433
+
434
+ [[package]]
435
+ name = "typing-extensions"
436
+ version = "4.5.0"
437
+ description = "Backported and Experimental Type Hints for Python 3.7+"
438
+ category = "dev"
439
+ optional = false
440
+ python-versions = ">=3.7"
441
+ files = [
442
+ {file = "typing_extensions-4.5.0-py3-none-any.whl", hash = "sha256:fb33085c39dd998ac16d1431ebc293a8b3eedd00fd4a32de0ff79002c19511b4"},
443
+ {file = "typing_extensions-4.5.0.tar.gz", hash = "sha256:5cb5f4a79139d699607b3ef622a1dedafa84e115ab0024e0d9c044a9479ca7cb"},
444
+ ]
445
+
446
+ [[package]]
447
+ name = "wrapt"
448
+ version = "1.15.0"
449
+ description = "Module for decorators, wrappers and monkey patching."
450
+ category = "dev"
451
+ optional = false
452
+ python-versions = "!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*,>=2.7"
453
+ files = [
454
+ {file = "wrapt-1.15.0-cp27-cp27m-macosx_10_9_x86_64.whl", hash = "sha256:ca1cccf838cd28d5a0883b342474c630ac48cac5df0ee6eacc9c7290f76b11c1"},
455
+ {file = "wrapt-1.15.0-cp27-cp27m-manylinux1_i686.whl", hash = "sha256:e826aadda3cae59295b95343db8f3d965fb31059da7de01ee8d1c40a60398b29"},
456
+ {file = "wrapt-1.15.0-cp27-cp27m-manylinux1_x86_64.whl", hash = "sha256:5fc8e02f5984a55d2c653f5fea93531e9836abbd84342c1d1e17abc4a15084c2"},
457
+ {file = "wrapt-1.15.0-cp27-cp27m-manylinux2010_i686.whl", hash = "sha256:96e25c8603a155559231c19c0349245eeb4ac0096fe3c1d0be5c47e075bd4f46"},
458
+ {file = "wrapt-1.15.0-cp27-cp27m-manylinux2010_x86_64.whl", hash = "sha256:40737a081d7497efea35ab9304b829b857f21558acfc7b3272f908d33b0d9d4c"},
459
+ {file = "wrapt-1.15.0-cp27-cp27mu-manylinux1_i686.whl", hash = "sha256:f87ec75864c37c4c6cb908d282e1969e79763e0d9becdfe9fe5473b7bb1e5f09"},
460
+ {file = "wrapt-1.15.0-cp27-cp27mu-manylinux1_x86_64.whl", hash = "sha256:1286eb30261894e4c70d124d44b7fd07825340869945c79d05bda53a40caa079"},
461
+ {file = "wrapt-1.15.0-cp27-cp27mu-manylinux2010_i686.whl", hash = "sha256:493d389a2b63c88ad56cdc35d0fa5752daac56ca755805b1b0c530f785767d5e"},
462
+ {file = "wrapt-1.15.0-cp27-cp27mu-manylinux2010_x86_64.whl", hash = "sha256:58d7a75d731e8c63614222bcb21dd992b4ab01a399f1f09dd82af17bbfc2368a"},
463
+ {file = "wrapt-1.15.0-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:21f6d9a0d5b3a207cdf7acf8e58d7d13d463e639f0c7e01d82cdb671e6cb7923"},
464
+ {file = "wrapt-1.15.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:ce42618f67741d4697684e501ef02f29e758a123aa2d669e2d964ff734ee00ee"},
465
+ {file = "wrapt-1.15.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:41d07d029dd4157ae27beab04d22b8e261eddfc6ecd64ff7000b10dc8b3a5727"},
466
+ {file = "wrapt-1.15.0-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:54accd4b8bc202966bafafd16e69da9d5640ff92389d33d28555c5fd4f25ccb7"},
467
+ {file = "wrapt-1.15.0-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:2fbfbca668dd15b744418265a9607baa970c347eefd0db6a518aaf0cfbd153c0"},
468
+ {file = "wrapt-1.15.0-cp310-cp310-musllinux_1_1_aarch64.whl", hash = "sha256:76e9c727a874b4856d11a32fb0b389afc61ce8aaf281ada613713ddeadd1cfec"},
469
+ {file = "wrapt-1.15.0-cp310-cp310-musllinux_1_1_i686.whl", hash = "sha256:e20076a211cd6f9b44a6be58f7eeafa7ab5720eb796975d0c03f05b47d89eb90"},
470
+ {file = "wrapt-1.15.0-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:a74d56552ddbde46c246b5b89199cb3fd182f9c346c784e1a93e4dc3f5ec9975"},
471
+ {file = "wrapt-1.15.0-cp310-cp310-win32.whl", hash = "sha256:26458da5653aa5b3d8dc8b24192f574a58984c749401f98fff994d41d3f08da1"},
472
+ {file = "wrapt-1.15.0-cp310-cp310-win_amd64.whl", hash = "sha256:75760a47c06b5974aa5e01949bf7e66d2af4d08cb8c1d6516af5e39595397f5e"},
473
+ {file = "wrapt-1.15.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:ba1711cda2d30634a7e452fc79eabcadaffedf241ff206db2ee93dd2c89a60e7"},
474
+ {file = "wrapt-1.15.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:56374914b132c702aa9aa9959c550004b8847148f95e1b824772d453ac204a72"},
475
+ {file = "wrapt-1.15.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a89ce3fd220ff144bd9d54da333ec0de0399b52c9ac3d2ce34b569cf1a5748fb"},
476
+ {file = "wrapt-1.15.0-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:3bbe623731d03b186b3d6b0d6f51865bf598587c38d6f7b0be2e27414f7f214e"},
477
+ {file = "wrapt-1.15.0-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:3abbe948c3cbde2689370a262a8d04e32ec2dd4f27103669a45c6929bcdbfe7c"},
478
+ {file = "wrapt-1.15.0-cp311-cp311-musllinux_1_1_aarch64.whl", hash = "sha256:b67b819628e3b748fd3c2192c15fb951f549d0f47c0449af0764d7647302fda3"},
479
+ {file = "wrapt-1.15.0-cp311-cp311-musllinux_1_1_i686.whl", hash = "sha256:7eebcdbe3677e58dd4c0e03b4f2cfa346ed4049687d839adad68cc38bb559c92"},
480
+ {file = "wrapt-1.15.0-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:74934ebd71950e3db69960a7da29204f89624dde411afbfb3b4858c1409b1e98"},
481
+ {file = "wrapt-1.15.0-cp311-cp311-win32.whl", hash = "sha256:bd84395aab8e4d36263cd1b9308cd504f6cf713b7d6d3ce25ea55670baec5416"},
482
+ {file = "wrapt-1.15.0-cp311-cp311-win_amd64.whl", hash = "sha256:a487f72a25904e2b4bbc0817ce7a8de94363bd7e79890510174da9d901c38705"},
483
+ {file = "wrapt-1.15.0-cp35-cp35m-manylinux1_i686.whl", hash = "sha256:4ff0d20f2e670800d3ed2b220d40984162089a6e2c9646fdb09b85e6f9a8fc29"},
484
+ {file = "wrapt-1.15.0-cp35-cp35m-manylinux1_x86_64.whl", hash = "sha256:9ed6aa0726b9b60911f4aed8ec5b8dd7bf3491476015819f56473ffaef8959bd"},
485
+ {file = "wrapt-1.15.0-cp35-cp35m-manylinux2010_i686.whl", hash = "sha256:896689fddba4f23ef7c718279e42f8834041a21342d95e56922e1c10c0cc7afb"},
486
+ {file = "wrapt-1.15.0-cp35-cp35m-manylinux2010_x86_64.whl", hash = "sha256:75669d77bb2c071333417617a235324a1618dba66f82a750362eccbe5b61d248"},
487
+ {file = "wrapt-1.15.0-cp35-cp35m-win32.whl", hash = "sha256:fbec11614dba0424ca72f4e8ba3c420dba07b4a7c206c8c8e4e73f2e98f4c559"},
488
+ {file = "wrapt-1.15.0-cp35-cp35m-win_amd64.whl", hash = "sha256:fd69666217b62fa5d7c6aa88e507493a34dec4fa20c5bd925e4bc12fce586639"},
489
+ {file = "wrapt-1.15.0-cp36-cp36m-macosx_10_9_x86_64.whl", hash = "sha256:b0724f05c396b0a4c36a3226c31648385deb6a65d8992644c12a4963c70326ba"},
490
+ {file = "wrapt-1.15.0-cp36-cp36m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:bbeccb1aa40ab88cd29e6c7d8585582c99548f55f9b2581dfc5ba68c59a85752"},
491
+ {file = "wrapt-1.15.0-cp36-cp36m-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:38adf7198f8f154502883242f9fe7333ab05a5b02de7d83aa2d88ea621f13364"},
492
+ {file = "wrapt-1.15.0-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:578383d740457fa790fdf85e6d346fda1416a40549fe8db08e5e9bd281c6a475"},
493
+ {file = "wrapt-1.15.0-cp36-cp36m-musllinux_1_1_aarch64.whl", hash = "sha256:a4cbb9ff5795cd66f0066bdf5947f170f5d63a9274f99bdbca02fd973adcf2a8"},
494
+ {file = "wrapt-1.15.0-cp36-cp36m-musllinux_1_1_i686.whl", hash = "sha256:af5bd9ccb188f6a5fdda9f1f09d9f4c86cc8a539bd48a0bfdc97723970348418"},
495
+ {file = "wrapt-1.15.0-cp36-cp36m-musllinux_1_1_x86_64.whl", hash = "sha256:b56d5519e470d3f2fe4aa7585f0632b060d532d0696c5bdfb5e8319e1d0f69a2"},
496
+ {file = "wrapt-1.15.0-cp36-cp36m-win32.whl", hash = "sha256:77d4c1b881076c3ba173484dfa53d3582c1c8ff1f914c6461ab70c8428b796c1"},
497
+ {file = "wrapt-1.15.0-cp36-cp36m-win_amd64.whl", hash = "sha256:077ff0d1f9d9e4ce6476c1a924a3332452c1406e59d90a2cf24aeb29eeac9420"},
498
+ {file = "wrapt-1.15.0-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:5c5aa28df055697d7c37d2099a7bc09f559d5053c3349b1ad0c39000e611d317"},
499
+ {file = "wrapt-1.15.0-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:3a8564f283394634a7a7054b7983e47dbf39c07712d7b177b37e03f2467a024e"},
500
+ {file = "wrapt-1.15.0-cp37-cp37m-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:780c82a41dc493b62fc5884fb1d3a3b81106642c5c5c78d6a0d4cbe96d62ba7e"},
501
+ {file = "wrapt-1.15.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:e169e957c33576f47e21864cf3fc9ff47c223a4ebca8960079b8bd36cb014fd0"},
502
+ {file = "wrapt-1.15.0-cp37-cp37m-musllinux_1_1_aarch64.whl", hash = "sha256:b02f21c1e2074943312d03d243ac4388319f2456576b2c6023041c4d57cd7019"},
503
+ {file = "wrapt-1.15.0-cp37-cp37m-musllinux_1_1_i686.whl", hash = "sha256:f2e69b3ed24544b0d3dbe2c5c0ba5153ce50dcebb576fdc4696d52aa22db6034"},
504
+ {file = "wrapt-1.15.0-cp37-cp37m-musllinux_1_1_x86_64.whl", hash = "sha256:d787272ed958a05b2c86311d3a4135d3c2aeea4fc655705f074130aa57d71653"},
505
+ {file = "wrapt-1.15.0-cp37-cp37m-win32.whl", hash = "sha256:02fce1852f755f44f95af51f69d22e45080102e9d00258053b79367d07af39c0"},
506
+ {file = "wrapt-1.15.0-cp37-cp37m-win_amd64.whl", hash = "sha256:abd52a09d03adf9c763d706df707c343293d5d106aea53483e0ec8d9e310ad5e"},
507
+ {file = "wrapt-1.15.0-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:cdb4f085756c96a3af04e6eca7f08b1345e94b53af8921b25c72f096e704e145"},
508
+ {file = "wrapt-1.15.0-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:230ae493696a371f1dbffaad3dafbb742a4d27a0afd2b1aecebe52b740167e7f"},
+ {file = "wrapt-1.15.0-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:63424c681923b9f3bfbc5e3205aafe790904053d42ddcc08542181a30a7a51bd"},
+ {file = "wrapt-1.15.0-cp38-cp38-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:d6bcbfc99f55655c3d93feb7ef3800bd5bbe963a755687cbf1f490a71fb7794b"},
+ {file = "wrapt-1.15.0-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:c99f4309f5145b93eca6e35ac1a988f0dc0a7ccf9ccdcd78d3c0adf57224e62f"},
+ {file = "wrapt-1.15.0-cp38-cp38-musllinux_1_1_aarch64.whl", hash = "sha256:b130fe77361d6771ecf5a219d8e0817d61b236b7d8b37cc045172e574ed219e6"},
+ {file = "wrapt-1.15.0-cp38-cp38-musllinux_1_1_i686.whl", hash = "sha256:96177eb5645b1c6985f5c11d03fc2dbda9ad24ec0f3a46dcce91445747e15094"},
+ {file = "wrapt-1.15.0-cp38-cp38-musllinux_1_1_x86_64.whl", hash = "sha256:d5fe3e099cf07d0fb5a1e23d399e5d4d1ca3e6dfcbe5c8570ccff3e9208274f7"},
+ {file = "wrapt-1.15.0-cp38-cp38-win32.whl", hash = "sha256:abd8f36c99512755b8456047b7be10372fca271bf1467a1caa88db991e7c421b"},
+ {file = "wrapt-1.15.0-cp38-cp38-win_amd64.whl", hash = "sha256:b06fa97478a5f478fb05e1980980a7cdf2712015493b44d0c87606c1513ed5b1"},
+ {file = "wrapt-1.15.0-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:2e51de54d4fb8fb50d6ee8327f9828306a959ae394d3e01a1ba8b2f937747d86"},
+ {file = "wrapt-1.15.0-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:0970ddb69bba00670e58955f8019bec4a42d1785db3faa043c33d81de2bf843c"},
+ {file = "wrapt-1.15.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:76407ab327158c510f44ded207e2f76b657303e17cb7a572ffe2f5a8a48aa04d"},
+ {file = "wrapt-1.15.0-cp39-cp39-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:cd525e0e52a5ff16653a3fc9e3dd827981917d34996600bbc34c05d048ca35cc"},
+ {file = "wrapt-1.15.0-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:9d37ac69edc5614b90516807de32d08cb8e7b12260a285ee330955604ed9dd29"},
+ {file = "wrapt-1.15.0-cp39-cp39-musllinux_1_1_aarch64.whl", hash = "sha256:078e2a1a86544e644a68422f881c48b84fef6d18f8c7a957ffd3f2e0a74a0d4a"},
+ {file = "wrapt-1.15.0-cp39-cp39-musllinux_1_1_i686.whl", hash = "sha256:2cf56d0e237280baed46f0b5316661da892565ff58309d4d2ed7dba763d984b8"},
+ {file = "wrapt-1.15.0-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:7dc0713bf81287a00516ef43137273b23ee414fe41a3c14be10dd95ed98a2df9"},
+ {file = "wrapt-1.15.0-cp39-cp39-win32.whl", hash = "sha256:46ed616d5fb42f98630ed70c3529541408166c22cdfd4540b88d5f21006b0eff"},
+ {file = "wrapt-1.15.0-cp39-cp39-win_amd64.whl", hash = "sha256:eef4d64c650f33347c1f9266fa5ae001440b232ad9b98f1f43dfe7a79435c0a6"},
+ {file = "wrapt-1.15.0-py3-none-any.whl", hash = "sha256:64b1df0f83706b4ef4cfb4fb0e4c2669100fd7ecacfb59e091fad300d4e04640"},
+ {file = "wrapt-1.15.0.tar.gz", hash = "sha256:d06730c6aed78cee4126234cf2d071e01b44b915e725a6cb439a879ec9754a3a"},
+ ]
+
+ [metadata]
+ lock-version = "2.0"
+ python-versions = "^3.7"
+ content-hash = "6dd48af9ea10e0d441e2b6ee3dcdea67bd5b4cc0b6c13b672761212decbaa5f6"
pyproject.toml ADDED
@@ -0,0 +1,43 @@
+ [tool.poetry]
+ name = "pdf.tocgen"
+ version = "1.3.4"
+ description = "Automatically generate table of contents for pdf files"
+ authors = ["krasjet"]
+ license = "GPL-3.0-or-later"
+ readme = "README.md"
+ homepage = "https://krasjet.com/voice/pdf.tocgen/"
+ repository = "https://github.com/Krasjet/pdf.tocgen"
+ keywords = ["pdf", "cli"]
+
+ classifiers = [
+     "Development Status :: 3 - Alpha",
+     "Environment :: Console",
+     "Intended Audience :: End Users/Desktop"
+ ]
+
+ packages = [
+     { include = "pdfxmeta" },
+     { include = "pdftocgen" },
+     { include = "pdftocio" },
+     { include = "fitzutils" }
+ ]
+
+ [tool.poetry.dependencies]
+ python = "^3.7"
+ PyMuPDF = "^1.18.14"
+ toml = "^0.10.2"
+ chardet = "^5.1.0"
+
+ [tool.poetry.dev-dependencies]
+ pylint = "^2.5.3"
+ jedi = "^0.17.2"
+ mamba = "^0.11.1"
+
+ [tool.poetry.scripts]
+ pdfxmeta = "pdfxmeta.app:main"
+ pdftocgen = "pdftocgen.app:main"
+ pdftocio = "pdftocio.app:main"
+
+ [build-system]
+ requires = ["poetry-core"]
+ build-backend = "poetry.core.masonry.api"
recipes/README.md ADDED
@@ -0,0 +1,11 @@
+ recipes
+ =======
+
+ This directory contains some pre-made recipes for `pdftocgen`. It could be a
+ good reference if you want to craft your own recipes. Feel free to contribute
+ more.
+
+ The recipes in this directory are separately licensed under the [CC BY-NC-SA 4.0
+ License][cc] to prevent any commercial usage.
+
+ [cc]: https://creativecommons.org/licenses/by-nc-sa/4.0/
recipes/default_groff_man.toml ADDED
@@ -0,0 +1,12 @@
+ # The recipe for
+ # $ man -Tpdf man > out.pdf
+ # only tested under groff
+ [[heading]]
+ level = 1
+ font.name = "Times-Bold"
+ font.size = 10.949999809265137
+ font.superscript = false
+ font.italic = false
+ font.serif = true
+ font.monospace = false
+ font.bold = true
recipes/default_groff_ms.toml ADDED
@@ -0,0 +1,12 @@
+ # The recipe for the default groff_ms, produced by
+ # $ groff -ms -Tpdf in.ms > out.pdf
+
+ [[heading]]
+ level = 1
+ font.name = "Times-Bold"
+ font.size = 10
+ bbox.left = 72
+
+ # All the headings (.NH) have the same font attributes, so you need to manually
+ # format the heading levels of the toc (for vim users, >> in normal mode will
+ # add indentation to a line)
recipes/default_latex.toml ADDED
@@ -0,0 +1,24 @@
+ # The recipe for
+ # $ pdflatex in.tex
+ # under default styles (Computer Modern, article class)
+
+ [[heading]]
+ level = 1
+ greedy = true
+ font.name = "CMBX12"
+ font.size = 14.346199989318848
+ font.size_tolerance = 0.01
+
+ [[heading]]
+ level = 2
+ greedy = true
+ font.name = "CMBX12"
+ font.size = 11.9552001953125
+ font.size_tolerance = 0.01
+
+ [[heading]]
+ level = 3
+ greedy = true
+ font.name = "CMBX10"
+ font.size = 9.962599754333496
+ font.size_tolerance = 0.01
recipes/ft.toml ADDED
@@ -0,0 +1,23 @@
+ # The recipe for "Lecture Notes for EE 261" [1] by Brad Osgood
+ #
+ # [1]: https://see.stanford.edu/materials/lsoftaee261/book-fall-07.pdf
+ # archive: https://web.archive.org/https://see.stanford.edu/materials/lsoftaee261/book-fall-07.pdf
+
+ [[heading]]
+ level = 1
+ greedy = true
+ font.name = "CMBX12"
+ font.size = 24.78696060180664
+
+ [[heading]]
+ level = 2
+ greedy = true
+ font.name = "CMBX12"
+ font.size = 14.346190452575684
+
+ [[heading]]
+ level = 3
+ greedy = true
+ font.name = "CMBX12"
+ font.size = 11.955169677734375
+
recipes/htdc.toml ADDED
@@ -0,0 +1,26 @@
+ # The recipe for HtDC [1] by Matthias Felleisen, et al.
+ #
+ # The output needs some manual cleanup. For example, the table of contents in
+ # the original document is incorrectly included in the outline, but it
+ # should be easy to remove using a text editor.
+ #
+ # [1]: https://felleisen.org/matthias/HtDC/htdc.pdf
+
+ [[heading]]
+ level = 1
+ font.name = "Palatino-Bold"
+ font.size = 17.21540069580078
+ font.color = 0x221f1f
+
+ [[heading]]
+ level = 2
+ font.name = "Palatino-Bold"
+ font.size = 14.346199989318848
+ font.color = 0x221f1f
+
+ [[heading]]
+ level = 3
+ greedy = true
+ font.name = "Palatino-Bold"
+ font.size = 11.9552001953125
+ font.color = 0x221f1f
recipes/onlisp.toml ADDED
@@ -0,0 +1,15 @@
+ # The recipe for "On Lisp" [1] by Paul Graham
+ #
+ # Note that you need to download the PDF version. The PDF is well structured
+ # and no extra processing is needed.
+ # [1]: http://www.paulgraham.com/onlisptext.html
+
+ [[heading]]
+ level = 1
+ font.name = "Times-Bold"
+ font.size = 19.92530059814453
+
+ [[heading]]
+ level = 2
+ font.name = "Times-Bold"
+ font.size = 11.9552001953125
recipes/recipe.toml ADDED
@@ -0,0 +1,5 @@
+ [[heading]]
+ level = 1
+ greedy = true
+ font.name = "CaslonFiveForty-Roman"
+ font.size = 54.10
requirements.txt CHANGED
@@ -1,3 +1,6 @@
- altair
- pandas
- streamlit
+ streamlit
+ pandas
+ PyMuPDF==1.25.2
+ toml
+ chardet
+ .
spec/__init__.py ADDED
File without changes
spec/cli_spec.sh ADDED
@@ -0,0 +1,63 @@
+ #!/bin/bash -e
+
+ SPEC="spec/files"
+
+ checkeq() {
+     if res=$(diff "$1" "$2"); then
+         echo "[✓]"
+     else
+         echo "[✗]"
+         printf "%s\n" "$res"
+         return 1
+     fi
+ }
+
+ it() {
+     printf " it %s " "$*"
+ }
+
+ printf "pdfxmeta\n"
+
+ it "extracts metadata correctly"
+ checkeq <(pdfxmeta -p 1 "$SPEC/level2.pdf" "Section") \
+         "$SPEC/level2_meta"
+
+ it "extracts metadata in auto mode correctly"
+ checkeq <(pdfxmeta -a 1 -p 1 "$SPEC/level2.pdf" "Section") \
+         "$SPEC/level2_meta.toml"
+
+ printf "\npdftocgen\n"
+
+ it "generates toc for 2 level heading correctly"
+ checkeq <(pdftocgen "$SPEC/level2.pdf" < "$SPEC/level2_recipe.toml") \
+         "$SPEC/level2.toc"
+
+ it "generates toc for one page headings correctly"
+ checkeq <(pdftocgen "$SPEC/onepage.pdf" < "$SPEC/onepage_greedy.toml") \
+         "$SPEC/onepage.toc"
+
+ it "generates toc for hard mode correctly"
+ checkeq <(pdftocgen "$SPEC/hardmode.pdf" < "$SPEC/hardmode_recipe.toml") \
+         "$SPEC/hardmode.toc"
+
+ it "generates readable toc"
+ checkeq <(pdftocgen -H "$SPEC/level2.pdf" < "$SPEC/level2_recipe.toml") \
+         "$SPEC/level2_h.toc"
+
+ printf "\npdftocio\n"
+
+ tmpdir=$(mktemp -d)
+
+ it "adds toc to pdf and prints toc correctly"
+ checkeq <(pdftocgen "$SPEC/hardmode.pdf" < "$SPEC/hardmode_recipe.toml" | \
+           pdftocio -o "$tmpdir/out.pdf" "$SPEC/hardmode.pdf" && \
+           pdftocio -p "$tmpdir/out.pdf") \
+         "$SPEC/hardmode.toc"
+
+ it "prints toc when -p is set"
+ checkeq <(pdftocio -p "$SPEC/hastoc.pdf" < "$SPEC/level2.toc") \
+         "$SPEC/hastoc.toc"
+
+ it "prints toc vpos when -v is set"
+ checkeq <(pdftocio -p -v "$SPEC/hastoc.pdf") \
+         "$SPEC/hastoc_v.toc"
spec/files/Makefile ADDED
@@ -0,0 +1,12 @@
+ .PHONY: all clean nuke
+
+ all: level2.pdf hastoc.pdf onepage.pdf hardmode.pdf
+
+ %.pdf: %.tex
+ 	latexmk -pdf $<
+
+ clean:
+ 	rm -f *.aux *.dvi *.fdb_latexmk *.fls *.log *.out
+
+ nuke: clean
+ 	rm -f *.pdf
spec/files/hardmode.pdf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9be6a1628292675b467b36a503c37ffa4d3073d2ff87d147dced3b3bff394875
+ size 110985
spec/files/hardmode.tex ADDED
@@ -0,0 +1,68 @@
+ \documentclass{article}[12pt]
+
+ \usepackage{lipsum}
+ \usepackage{multicol}
+ \usepackage{amsmath}
+ \usepackage{amsfonts}
+ \usepackage[USenglish]{babel}
+ \usepackage[stretch=10,shrink=10]{microtype}
+ \usepackage[left=1.3in,
+ right=1.3in,
+ top=1in,
+ bottom=1in,
+ footskip=.5in]{geometry}
+ \setlength{\columnsep}{0.4in}
+
+ \renewcommand{\rmdefault}{zpltlf}
+ \usepackage{newpxtext}
+ % will mess up embedded symbols
+ % \usepackage{newpxmath}
+
+ \title{The hard mode}
+ \author{krasjet}
+ \date{}
+
+ \begin{document}
+ \begin{multicols}{2}
+ [
+ \maketitle
+ ]
+
+ \section{Section One}
+
+ \lipsum[2-3]
+
+ \section{Section $1 + 1 = 2$}
+
+ \lipsum[2-1]
+ \begin{align*}
+ x^2 + 2 = 4
+ \end{align*}
+ \lipsum[2-1]
+
+ \subsection{Subsection Two.One}
+ \lipsum[2-5]
+
+ \section*{$\mathrm{e}^{\ln(3)}$}
+
+ \setcounter{section}{3}
+ \setcounter{subsection}{0}
+
+ \lipsum[1-2]
+
+ \subsection{Subsection $\mathrm{e}^{\ln(3)}$.1, with looo\-ooooooooong title}
+ \lipsum[2-5]
+
+ \subsection{$\mathbb{S}$ubsection Three.Two, another long title}
+ \lipsum[1-1]
+
+ \subsection{Subsection Three.Three}
+ \lipsum[2-3]
+
+ \section{The $x \to \infty$ End}
+
+ \lipsum[2-2]
+
+ \end{multicols}
+
+ \end{document}
spec/files/hardmode.toc ADDED
@@ -0,0 +1,8 @@
+ "1 Section One" 1
+ "2 Section 1 + 1 = 2" 1
+     "2.1 Subsection Two.One" 1
+ "e ln(3)" 2
+     "3.1 Subsection e ln(3) .1, with looo- ooooooooong title" 2
+     "3.2 S ubsection Three.Two, another long title" 3
+     "3.3 Subsection Three.Three" 3
+ "4 The x → ∞ End" 3
spec/files/hardmode_recipe.toml ADDED
@@ -0,0 +1,18 @@
+ [[heading]]
+ level = 1
+ greedy = true
+ font.name = "TeXGyrePagellaX-Bold"
+ font.size = 14.346199989318848
+
+ [[heading]]
+ level = 1
+ greedy = true
+ font.name = "CMR10"
+ font.size = 9.962599754333496
+ font.superscript = true
+
+ [[heading]]
+ level = 2
+ greedy = true
+ font.name = "TeXGyrePagellaX-Bold"
+ font.size = 11.9552001953125
spec/files/hastoc.pdf ADDED
Binary file (62.7 kB).
spec/files/hastoc.tex ADDED
@@ -0,0 +1,42 @@
+ \documentclass{article}
+
+ \usepackage{lipsum}
+ \usepackage{hyperref}
+
+ \title{2 Level Heading Test}
+ \author{krasjet}
+ \date{}
+
+ \begin{document}
+ \maketitle
+
+ \section{Section One}
+
+ \lipsum[2-4]
+
+ \section{Section Two}
+
+ \lipsum[2-5]
+
+ \subsection{Subsection Two.One}
+ \lipsum[2-5]
+
+ \section{Section Three, with looong loooong looong title}
+
+ \lipsum[1-2]
+
+ \subsection{Subsection Three.One, with even loooooooooooonger title, and
+ probably even more}
+ \lipsum[2-5]
+
+ \subsection{Subsection Three.Two}
+ \lipsum[1-1]
+
+ \subsection{Subsection Three.Three}
+ \lipsum[2-3]
+
+ \section{The End}
+
+ \lipsum[2-5]
+
+ \end{document}
spec/files/hastoc.toc ADDED
@@ -0,0 +1,8 @@
+ "Section One" 1
+ "Section Two" 1
+     "Subsection Two.One" 2
+ "Section Three, with looong loooong looong title" 3
+     "Subsection Three.One, with even loooooooooooonger title, and probably even more" 3
+     "Subsection Three.Two" 4
+     "Subsection Three.Three" 5
+ "The End" 5
spec/files/hastoc_v.toc ADDED
@@ -0,0 +1,8 @@
+ "Section One" 1 234.65998
+ "Section Two" 1 562.148
+     "Subsection Two.One" 2 449.522
+ "Section Three, with looong loooong looong title" 3 330.333
+     "Subsection Three.One, with even loooooooooooonger title, and probably even more" 3 616.444
+     "Subsection Three.Two" 4 509.298
+     "Subsection Three.Three" 5 124.802
+ "The End" 5 361.387
spec/files/level2.pdf ADDED
Binary file (61.8 kB).
spec/files/level2.tex ADDED
@@ -0,0 +1,41 @@
+ \documentclass{article}
+
+ \usepackage{lipsum}
+
+ \title{2 Level Heading Test}
+ \author{krasjet}
+ \date{}
+
+ \begin{document}
+ \maketitle
+
+ \section{Section One}
+
+ \lipsum[2-4]
+
+ \section{Section Two}
+
+ \lipsum[2-5]
+
+ \subsection{Subsection Two.One}
+ \lipsum[2-5]
+
+ \section{Section Three, with looong loooong looong title}
+
+ \lipsum[1-2]
+
+ \subsection{Subsection Three.One, with even loooooooooooonger title, and
+ probably even more}
+ \lipsum[2-5]
+
+ \subsection{Subsection Three.Two}
+ \lipsum[1-1]
+
+ \subsection{Subsection Three.Three}
+ \lipsum[2-3]
+
+ \section{The End}
+
+ \lipsum[2-5]
+
+ \end{document}