Scrapling / scrapling / parser.py

Commit History

Add pre-compiled XPath text selector
133ff8f

mph committed on

fix: Selector.get_all_text() doesn't get all text #167
c9a1787

mph committed on

fix(parser)!: Optimize parser for repeated operations
a1afba2

Karim shoair committed on

fix: add typed overloads to Selectors.get() for proper default type inference
b942a24

Karim shoair committed on

style: Fix all mypy errors and add type hints to untyped function bodies
31c2447

Karim shoair committed on

style(parser): Improve the type hint for `find_by_text` and `find_by_regex`
f67ebd1

Karim shoair committed on

feat(parser)!: Make all selection return selector objects by default
a5f9b38

Karim shoair committed on

fix(parser): handle responses with empty body
979d1e6

Karim shoair committed on

fix(parser): Improve response to json conversion
e7f611f

Karim shoair committed on

fix(parser): Better approach for web pages where the encoding is not always correctly declared
58ce87c

Karim shoair committed on

style: removing dead code from the parser
07d0ddd

Karim shoair committed on

refactor: Making all the codebase acceptable by PyRight
e5ecf76

Karim shoair committed on

perf: General code restructure to not use more than needed memory
21ce5a7

Karim shoair committed on

fix(parser): An encoding issue with converting bytes to string on some encoding types
335667a

Karim shoair committed on

style: Removing dead code/docstrings and correcting type hints
7231786

Karim shoair committed on

fix(parser): Improve selectors `re` function
c40149b

Karim shoair committed on

fix(parser): Make `html_content` and `prettify` return strings not bytes (depends on the encoding)
c86e18f

Karim shoair committed on

feat: Make `.body` return the passed content as it is without any processing
76ae95e

Karim shoair committed on

fix: Fixes for multiple encoding issues (#80 & #81 )
450d5ca

Karim shoair committed on

refactor: Make all fetchers as an optional dependency group
a85d2c8

Karim shoair committed on

style: applying the new ruff rules to all files
226b463

Karim shoair committed on

docs: update doc strings with correct naming
4893321

Karim shoair committed on

style: add flags for tests coverage
9b40891

Karim shoair committed on

fix(parser): count all nested children of ignored tags in `get_all_text`
b2a2624

Karim shoair committed on

perf: speeding up `find_by_text` and `find_by_regex` by 3%
574271a

Karim shoair committed on

perf: Speeding up `below_elements` and `relocate` by 3%
f7b1a1a

Karim shoair committed on

perf: optimize `get_all_text` and adaptive logic by another 10%
3e2308a

Karim shoair committed on

perf: optimizing `find_similar` method
e7eec4a

Karim shoair committed on

perf: Optimizing `next` and `previous` properties
32c7833

Karim shoair committed on

perf: speed up `get_all_text` function by another 20%
21d6b5e

Karim shoair committed on

perf(parser): Making `css_first` and `xpath_first` faster than the normal ones
819924f

Karim shoair committed on

perf(parser): A lot of optimizations to speed things up
1fee013

Karim shoair committed on

perf(parser): A lot of optimizations to speed things up
7d01598

Karim shoair committed on

fix: moving types to use Union again
631fd95

Karim shoair committed on

style: Adjustments to the translator
6b11e98

Karim shoair committed on

style: using `isinstance` function as the main way for type checking
4434909

Karim shoair committed on

style: A lot of type hints correction
916182a

Karim shoair committed on

fix: shortcuts for backward compatibility
c32f33c

Karim shoair committed on

style(parser): optimize selectors instances creation
e3e46c8

Karim shoair committed on

style: replacing `os` with `Pathlib` and small optimizations
11165c4

Karim shoair committed on

style: General type hints fixes and imports optimizing
1d98b51

Karim shoair committed on

feat: add `length` property to `Selectors` to write less code
ee0de9b

Karim shoair committed on

refactor(parser): optimize imports
ff3cdb9

Karim shoair committed on

refactor: replace Selector's input (text/body) with 1 argument called `content`
3db9c55

Karim shoair committed on

refactor: huge change, many features/class got a better naming
8e67a4c

Karim shoair committed on

refactor: remove clean up function for Adaptor + make adaptor attributes accessible directly
02a714b

Karim shoair committed on

refactor(parser): Multiple optimizations and fixes
b9977b1

Karim shoair committed on

refactor(parser): Make `get_all_text` method 40% faster
87d9204

Karim shoair committed on

refactor(Adaptor): Cleaner approach to `find_similar` method
31f70c8

Karim shoair committed on

fix(Adaptor): Add cleanup function to handle possible memory leak
2110f64

Karim shoair committed on