Karim shoair committed on
Commit 60d0c55 · 1 Parent(s): c181b7d

docs: Update all docstrings according to the new changes

docs/fetching/dynamic.md CHANGED
@@ -77,7 +77,7 @@ Scrapling provides many options with this fetcher. To make it as simple as possi
77
  | network_idle | Wait for the page until there are no network connections for at least 500 ms. | ✔️ |
78
  | timeout | The timeout (milliseconds) used in all operations and waits through the page. The default is 30,000 ms (30 seconds). | ✔️ |
79
  | wait | The time (milliseconds) the fetcher will wait after everything finishes before closing the page and returning the `Response` object. | ✔️ |
80
- | page_action | Added for automation. Pass a function that takes the `page` object and does the necessary automation, then returns `page` again. | ✔️ |
81
  | wait_selector | Wait for a specific css selector to be in a specific state. | ✔️ |
82
  | init_script | An absolute path to a JavaScript file to be executed on page creation for all pages in this session. | ✔️ |
83
  | wait_selector_state | Scrapling will wait for the given state to be fulfilled for the selector given with `wait_selector`. _Default state is `attached`._ | ✔️ |
@@ -134,7 +134,6 @@ def scroll_page(page: Page):
134
  page.mouse.wheel(10, 0)
135
  page.mouse.move(100, 400)
136
  page.mouse.up()
137
- return page
138
 
139
  page = DynamicFetcher.fetch(
140
  'https://example.com',
@@ -149,7 +148,6 @@ async def scroll_page(page: Page):
149
  await page.mouse.wheel(10, 0)
150
  await page.mouse.move(100, 400)
151
  await page.mouse.up()
152
- return page
153
 
154
  page = await DynamicFetcher.async_fetch(
155
  'https://example.com',
@@ -273,9 +271,14 @@ async def scrape_multiple_sites():
273
  return pages
274
  ```
275
 
276
- You may have noticed the `max_pages` argument. This is a new argument that enables the fetcher to create a **pool of Browser tabs** that will be rotated automatically. Instead of waiting for one browser tab to become ready, it checks if the next tab in the pool is ready to be used and uses it. This allows for multiple websites to be fetched at the same time in the same browser, which saves a lot of resources, but most importantly, is so fast :)
277
 
278
- When all tabs inside the pool are busy, the fetcher checks every subsecond if a tab becomes ready. If none become free within a 30-second interval, it raises a `TimeoutError` error. This can happen when the website you are fetching becomes unresponsive for some reason.
 
 
 
 
 
279
 
280
  ### Session Benefits
281
 
 
77
  | network_idle | Wait for the page until there are no network connections for at least 500 ms. | ✔️ |
78
  | timeout | The timeout (milliseconds) used in all operations and waits through the page. The default is 30,000 ms (30 seconds). | ✔️ |
79
  | wait | The time (milliseconds) the fetcher will wait after everything finishes before closing the page and returning the `Response` object. | ✔️ |
80
+ | page_action | Added for automation. Pass a function that takes the `page` object and does the necessary automation. | ✔️ |
81
  | wait_selector | Wait for a specific css selector to be in a specific state. | ✔️ |
82
  | init_script | An absolute path to a JavaScript file to be executed on page creation for all pages in this session. | ✔️ |
83
  | wait_selector_state | Scrapling will wait for the given state to be fulfilled for the selector given with `wait_selector`. _Default state is `attached`._ | ✔️ |
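The updated `page_action` contract (do the automation on `page`, return nothing) can be sketched with a toy example. `FakePage` and the `#accept-cookies` selector are hypothetical stand-ins for the real browser page object:

```python
class FakePage:
    """Minimal stand-in for the browser `page` object (illustration only)."""
    def __init__(self):
        self.clicks = []

    def click(self, selector):
        self.clicks.append(selector)

def accept_cookies(page):
    # Do the automation directly on `page`; after this change the callback
    # no longer needs to return the page object.
    page.click("#accept-cookies")

page = FakePage()
result = accept_cookies(page)  # the fetcher ignores the return value
```

With the real fetcher, this callback would be passed as `page_action=accept_cookies` the same way `scroll_page` is in the examples below.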
 
134
  page.mouse.wheel(10, 0)
135
  page.mouse.move(100, 400)
136
  page.mouse.up()
 
137
 
138
  page = DynamicFetcher.fetch(
139
  'https://example.com',
 
148
  await page.mouse.wheel(10, 0)
149
  await page.mouse.move(100, 400)
150
  await page.mouse.up()
 
151
 
152
  page = await DynamicFetcher.async_fetch(
153
  'https://example.com',
 
271
  return pages
272
  ```
273
 
274
+ You may have noticed the `max_pages` argument. This is a new argument that enables the fetcher to create a **pool of browser tabs** that is rotated automatically. Instead of using one tab for all your requests, you set the maximum number of pages/tabs allowed. With each request, the library closes all tabs that have finished their task and checks whether the current number of tabs is below that maximum, then:
275
 
276
+ 1. If the count is below the limit, the fetcher creates a new tab for your request and everything proceeds as normal.
277
+ 2. Otherwise, it keeps checking every sub-second, for up to 60 seconds, whether a new tab can be created, then raises a `TimeoutError`. This can happen when the website you are fetching becomes unresponsive for some reason.
278
+
279
+ This logic allows multiple websites to be fetched at the same time in the same browser, which saves a lot of resources and, most importantly, is very fast :)
280
+
281
+ In versions 0.3 and 0.3.1, the pool reused finished tabs to save more resources/time, but this logic proved flawed: it's nearly impossible to protect pages/tabs from contamination by the configuration used for the previous request.
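The pool behavior described above can be sketched roughly as follows. This is an illustrative model, not Scrapling's internals: `TAB_LIMIT`, `acquire_tab`, and the tab dicts are hypothetical names standing in for `max_pages` and the real tab objects.

```python
import time

TAB_LIMIT = 3          # stands in for `max_pages`
POLL_INTERVAL = 0.5    # "checks every sub-second"
MAX_WAIT = 60          # seconds before giving up

def acquire_tab(open_tabs, create_tab):
    """Close finished tabs, then create a new one once under the limit."""
    deadline = time.monotonic() + MAX_WAIT
    while True:
        # Drop tabs whose request has already finished.
        open_tabs[:] = [tab for tab in open_tabs if not tab["finished"]]
        if len(open_tabs) < TAB_LIMIT:
            # Within the allowed range: open a fresh tab for this request.
            tab = create_tab()
            open_tabs.append(tab)
            return tab
        if time.monotonic() >= deadline:
            # All slots stayed busy for the whole interval.
            raise TimeoutError("No free slot for a new tab within 60 seconds")
        time.sleep(POLL_INTERVAL)
```

Note that a fresh tab is always created rather than a finished one being handed back out, matching the change away from tab reuse.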
282
 
283
  ### Session Benefits
284
 
docs/fetching/stealthy.md CHANGED
@@ -31,7 +31,7 @@ Before jumping to [examples](#examples), here's the full list of arguments
31
  | google_search | Enabled by default, Scrapling will set the referer header as if this request came from a Google search of this website's domain name. | ✔️ |
32
  | extra_headers | A dictionary of extra headers to add to the request. _The referer set by the `google_search` argument takes priority over the referer set here if used together._ | ✔️ |
33
  | block_webrtc | Blocks WebRTC entirely. | ✔️ |
34
- | page_action | Added for automation. Pass a function that takes the `page` object and does the necessary automation, then returns `page` again. | ✔️ |
35
  | addons | List of Firefox addons to use. **Must be paths to extracted addons.** | ✔️ |
36
  | humanize | Humanize the cursor movement. The cursor movement takes either True or the maximum duration in seconds. The cursor typically takes up to 1.5 seconds to move across the window. | ✔️ |
37
  | allow_webgl | Enabled by default. Disabling WebGL is not recommended, as many WAFs now check if WebGL is enabled. | ✔️ |
@@ -156,7 +156,6 @@ def scroll_page(page: Page):
156
  page.mouse.wheel(10, 0)
157
  page.mouse.move(100, 400)
158
  page.mouse.up()
159
- return page
160
 
161
  page = StealthyFetcher.fetch(
162
  'https://example.com',
@@ -171,7 +170,6 @@ async def scroll_page(page: Page):
171
  await page.mouse.wheel(10, 0)
172
  await page.mouse.move(100, 400)
173
  await page.mouse.up()
174
- return page
175
 
176
  page = await StealthyFetcher.async_fetch(
177
  'https://example.com',
@@ -278,9 +276,14 @@ async def scrape_multiple_sites():
278
  return pages
279
  ```
280
 
281
- You may have noticed the `max_pages` argument. This is a new argument that enables the fetcher to create a **pool of Browser tabs** that will be rotated automatically. Instead of waiting for one browser tab to become ready, it checks if the next tab in the pool is ready to be used and uses it. This allows for multiple websites to be fetched at the same time in the same browser, which saves a lot of resources, but most importantly, is so fast :)
282
 
283
- When all tabs inside the pool are busy, the fetcher checks every subsecond if a tab becomes ready. If none become free within a 30-second interval, it raises a `TimeoutError` error. This can happen when the website you are fetching becomes unresponsive for some reason.
 
 
 
 
 
284
 
285
  ### Session Benefits
286
 
 
31
  | google_search | Enabled by default, Scrapling will set the referer header as if this request came from a Google search of this website's domain name. | ✔️ |
32
  | extra_headers | A dictionary of extra headers to add to the request. _The referer set by the `google_search` argument takes priority over the referer set here if used together._ | ✔️ |
33
  | block_webrtc | Blocks WebRTC entirely. | ✔️ |
34
+ | page_action | Added for automation. Pass a function that takes the `page` object and does the necessary automation. | ✔️ |
35
  | addons | List of Firefox addons to use. **Must be paths to extracted addons.** | ✔️ |
36
  | humanize | Humanize the cursor movement. The cursor movement takes either True or the maximum duration in seconds. The cursor typically takes up to 1.5 seconds to move across the window. | ✔️ |
37
  | allow_webgl | Enabled by default. Disabling WebGL is not recommended, as many WAFs now check if WebGL is enabled. | ✔️ |
 
156
  page.mouse.wheel(10, 0)
157
  page.mouse.move(100, 400)
158
  page.mouse.up()
 
159
 
160
  page = StealthyFetcher.fetch(
161
  'https://example.com',
 
170
  await page.mouse.wheel(10, 0)
171
  await page.mouse.move(100, 400)
172
  await page.mouse.up()
 
173
 
174
  page = await StealthyFetcher.async_fetch(
175
  'https://example.com',
 
276
  return pages
277
  ```
278
 
279
+ You may have noticed the `max_pages` argument. This is a new argument that enables the fetcher to create a **pool of browser tabs** that is rotated automatically. Instead of using one tab for all your requests, you set the maximum number of pages/tabs allowed. With each request, the library closes all tabs that have finished their task and checks whether the current number of tabs is below that maximum, then:
280
 
281
+ 1. If the count is below the limit, the fetcher creates a new tab for your request and everything proceeds as normal.
282
+ 2. Otherwise, it keeps checking every sub-second, for up to 60 seconds, whether a new tab can be created, then raises a `TimeoutError`. This can happen when the website you are fetching becomes unresponsive for some reason.
283
+
284
+ This logic allows multiple websites to be fetched at the same time in the same browser, which saves a lot of resources and, most importantly, is very fast :)
285
+
286
+ In versions 0.3 and 0.3.1, the pool reused finished tabs to save more resources/time, but this logic proved flawed: it's nearly impossible to protect pages/tabs from contamination by the configuration used for the previous request.
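The contamination problem mentioned above can be illustrated with a toy example. The `Tab` class here is hypothetical, not Scrapling's; it only shows why a reused tab can leak the previous request's configuration into the next one:

```python
class Tab:
    """Hypothetical tab that keeps per-request configuration (illustration only)."""
    def __init__(self):
        self.headers = {}

    def configure(self, headers):
        # On a *reused* tab this merges with whatever the last request set,
        # instead of starting from a clean state.
        self.headers.update(headers)

tab = Tab()
tab.configure({"Referer": "https://google.com/"})   # first request's config

# Reusing the same tab for a second request with different settings:
tab.configure({"Authorization": "token"})
leaked = "Referer" in tab.headers  # the old header contaminated the new request
```

Creating a fresh tab per request, as the pool now does, sidesteps this entirely at the cost of a little tab-creation overhead.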
287
 
288
  ### Session Benefits
289
 
scrapling/engines/_browsers/_camoufox.py CHANGED
@@ -119,7 +119,7 @@ class StealthySession(StealthySessionMixin, SyncSession):
119
  :param os_randomize: If enabled, Scrapling will randomize the OS fingerprints used. The default is Scrapling matching the fingerprints with the current OS.
120
  :param wait: The time (milliseconds) the fetcher will wait after everything finishes before closing the page and returning the ` Response ` object.
121
  :param timeout: The timeout in milliseconds that is used in all operations and waits through the page. The default is 30,000
122
- :param page_action: Added for automation. A function that takes the `page` object, does the automation you need, then returns `page` again.
123
  :param wait_selector: Wait for a specific CSS selector to be in a specific state.
124
  :param init_script: An absolute path to a JavaScript file to be executed on page creation for all pages in this session.
125
  :param geoip: Recommended to use with proxies; Automatically use IP's longitude, latitude, timezone, country, locale, and spoof the WebRTC IP address.
@@ -269,7 +269,7 @@ class StealthySession(StealthySessionMixin, SyncSession):
269
  :param google_search: Enabled by default, Scrapling will set the referer header to be as if this request came from a Google search of this website's domain name.
270
  :param timeout: The timeout in milliseconds that is used in all operations and waits through the page. The default is 30,000
271
  :param wait: The time (milliseconds) the fetcher will wait after everything finishes before closing the page and returning the ` Response ` object.
272
- :param page_action: Added for automation. A function that takes the `page` object, does the automation you need, then returns `page` again.
273
  :param extra_headers: A dictionary of extra headers to add to the request. _The referer set by the `google_search` argument takes priority over the referer set here if used together._
274
  :param disable_resources: Drop requests of unnecessary resources for a speed boost. It depends, but it made requests ~25% faster in my tests for some websites.
275
  Requests dropped are of type `font`, `image`, `media`, `beacon`, `object`, `imageset`, `texttrack`, `websocket`, `csp_report`, and `stylesheet`.
@@ -433,7 +433,7 @@ class AsyncStealthySession(StealthySessionMixin, AsyncSession):
433
  :param os_randomize: If enabled, Scrapling will randomize the OS fingerprints used. The default is Scrapling matching the fingerprints with the current OS.
434
  :param wait: The time (milliseconds) the fetcher will wait after everything finishes before closing the page and returning the ` Response ` object.
435
  :param timeout: The timeout in milliseconds that is used in all operations and waits through the page. The default is 30,000
436
- :param page_action: Added for automation. A function that takes the `page` object, does the automation you need, then returns `page` again.
437
  :param wait_selector: Wait for a specific CSS selector to be in a specific state.
438
  :param init_script: An absolute path to a JavaScript file to be executed on page creation for all pages in this session.
439
  :param geoip: Recommended to use with proxies; Automatically use IP's longitude, latitude, timezone, country, locale, and spoof the WebRTC IP address.
@@ -585,7 +585,7 @@ class AsyncStealthySession(StealthySessionMixin, AsyncSession):
585
  :param google_search: Enabled by default, Scrapling will set the referer header to be as if this request came from a Google search of this website's domain name.
586
  :param timeout: The timeout in milliseconds that is used in all operations and waits through the page. The default is 30,000
587
  :param wait: The time (milliseconds) the fetcher will wait after everything finishes before closing the page and returning the ` Response ` object.
588
- :param page_action: Added for automation. A function that takes the `page` object, does the automation you need, then returns `page` again.
589
  :param extra_headers: A dictionary of extra headers to add to the request. _The referer set by the `google_search` argument takes priority over the referer set here if used together._
590
  :param disable_resources: Drop requests of unnecessary resources for a speed boost. It depends, but it made requests ~25% faster in my tests for some websites.
591
  Requests dropped are of type `font`, `image`, `media`, `beacon`, `object`, `imageset`, `texttrack`, `websocket`, `csp_report`, and `stylesheet`.
 
119
  :param os_randomize: If enabled, Scrapling will randomize the OS fingerprints used. The default is Scrapling matching the fingerprints with the current OS.
120
  :param wait: The time (milliseconds) the fetcher will wait after everything finishes before closing the page and returning the ` Response ` object.
121
  :param timeout: The timeout in milliseconds that is used in all operations and waits through the page. The default is 30,000
122
+ :param page_action: Added for automation. A function that takes the `page` object and does the automation you need.
123
  :param wait_selector: Wait for a specific CSS selector to be in a specific state.
124
  :param init_script: An absolute path to a JavaScript file to be executed on page creation for all pages in this session.
125
  :param geoip: Recommended to use with proxies; Automatically use IP's longitude, latitude, timezone, country, locale, and spoof the WebRTC IP address.
 
269
  :param google_search: Enabled by default, Scrapling will set the referer header to be as if this request came from a Google search of this website's domain name.
270
  :param timeout: The timeout in milliseconds that is used in all operations and waits through the page. The default is 30,000
271
  :param wait: The time (milliseconds) the fetcher will wait after everything finishes before closing the page and returning the ` Response ` object.
272
+ :param page_action: Added for automation. A function that takes the `page` object and does the automation you need.
273
  :param extra_headers: A dictionary of extra headers to add to the request. _The referer set by the `google_search` argument takes priority over the referer set here if used together._
274
  :param disable_resources: Drop requests of unnecessary resources for a speed boost. It depends, but it made requests ~25% faster in my tests for some websites.
275
  Requests dropped are of type `font`, `image`, `media`, `beacon`, `object`, `imageset`, `texttrack`, `websocket`, `csp_report`, and `stylesheet`.
 
433
  :param os_randomize: If enabled, Scrapling will randomize the OS fingerprints used. The default is Scrapling matching the fingerprints with the current OS.
434
  :param wait: The time (milliseconds) the fetcher will wait after everything finishes before closing the page and returning the ` Response ` object.
435
  :param timeout: The timeout in milliseconds that is used in all operations and waits through the page. The default is 30,000
436
+ :param page_action: Added for automation. A function that takes the `page` object and does the automation you need.
437
  :param wait_selector: Wait for a specific CSS selector to be in a specific state.
438
  :param init_script: An absolute path to a JavaScript file to be executed on page creation for all pages in this session.
439
  :param geoip: Recommended to use with proxies; Automatically use IP's longitude, latitude, timezone, country, locale, and spoof the WebRTC IP address.
 
585
  :param google_search: Enabled by default, Scrapling will set the referer header to be as if this request came from a Google search of this website's domain name.
586
  :param timeout: The timeout in milliseconds that is used in all operations and waits through the page. The default is 30,000
587
  :param wait: The time (milliseconds) the fetcher will wait after everything finishes before closing the page and returning the ` Response ` object.
588
+ :param page_action: Added for automation. A function that takes the `page` object and does the automation you need.
589
  :param extra_headers: A dictionary of extra headers to add to the request. _The referer set by the `google_search` argument takes priority over the referer set here if used together._
590
  :param disable_resources: Drop requests of unnecessary resources for a speed boost. It depends, but it made requests ~25% faster in my tests for some websites.
591
  Requests dropped are of type `font`, `image`, `media`, `beacon`, `object`, `imageset`, `texttrack`, `websocket`, `csp_report`, and `stylesheet`.
scrapling/engines/_browsers/_controllers.py CHANGED
@@ -106,7 +106,7 @@ class DynamicSession(DynamicSessionMixin, SyncSession):
106
  :param network_idle: Wait for the page until there are no network connections for at least 500 ms.
107
  :param timeout: The timeout in milliseconds that is used in all operations and waits through the page. The default is 30,000
108
  :param wait: The time (milliseconds) the fetcher will wait after everything finishes before closing the page and returning the ` Response ` object.
109
- :param page_action: Added for automation. A function that takes the `page` object, does the automation you need, then returns `page` again.
110
  :param wait_selector: Wait for a specific CSS selector to be in a specific state.
111
  :param init_script: An absolute path to a JavaScript file to be executed on page creation for all pages in this session.
112
  :param locale: Set the locale for the browser if wanted. The default value is `en-US`.
@@ -217,7 +217,7 @@ class DynamicSession(DynamicSessionMixin, SyncSession):
217
  :param google_search: Enabled by default, Scrapling will set the referer header to be as if this request came from a Google search of this website's domain name.
218
  :param timeout: The timeout in milliseconds that is used in all operations and waits through the page. The default is 30,000
219
  :param wait: The time (milliseconds) the fetcher will wait after everything finishes before closing the page and returning the ` Response ` object.
220
- :param page_action: Added for automation. A function that takes the `page` object, does the automation you need, then returns `page` again.
221
  :param extra_headers: A dictionary of extra headers to add to the request. _The referer set by the `google_search` argument takes priority over the referer set here if used together._
222
  :param disable_resources: Drop requests of unnecessary resources for a speed boost. It depends, but it made requests ~25% faster in my tests for some websites.
223
  Requests dropped are of type `font`, `image`, `media`, `beacon`, `object`, `imageset`, `texttrack`, `websocket`, `csp_report`, and `stylesheet`.
@@ -360,7 +360,7 @@ class AsyncDynamicSession(DynamicSessionMixin, AsyncSession):
360
  :param network_idle: Wait for the page until there are no network connections for at least 500 ms.
361
  :param timeout: The timeout in milliseconds that is used in all operations and waits through the page. The default is 30,000
362
  :param wait: The time (milliseconds) the fetcher will wait after everything finishes before closing the page and returning the ` Response ` object.
363
- :param page_action: Added for automation. A function that takes the `page` object, does the automation you need, then returns `page` again.
364
  :param wait_selector: Wait for a specific CSS selector to be in a specific state.
365
  :param init_script: An absolute path to a JavaScript file to be executed on page creation for all pages in this session.
366
  :param locale: Set the locale for the browser if wanted. The default value is `en-US`.
@@ -478,7 +478,7 @@ class AsyncDynamicSession(DynamicSessionMixin, AsyncSession):
478
  :param google_search: Enabled by default, Scrapling will set the referer header to be as if this request came from a Google search of this website's domain name.
479
  :param timeout: The timeout in milliseconds that is used in all operations and waits through the page. The default is 30,000
480
  :param wait: The time (milliseconds) the fetcher will wait after everything finishes before closing the page and returning the ` Response ` object.
481
- :param page_action: Added for automation. A function that takes the `page` object, does the automation you need, then returns `page` again.
482
  :param extra_headers: A dictionary of extra headers to add to the request. _The referer set by the `google_search` argument takes priority over the referer set here if used together._
483
  :param disable_resources: Drop requests of unnecessary resources for a speed boost. It depends, but it made requests ~25% faster in my tests for some websites.
484
  Requests dropped are of type `font`, `image`, `media`, `beacon`, `object`, `imageset`, `texttrack`, `websocket`, `csp_report`, and `stylesheet`.
 
106
  :param network_idle: Wait for the page until there are no network connections for at least 500 ms.
107
  :param timeout: The timeout in milliseconds that is used in all operations and waits through the page. The default is 30,000
108
  :param wait: The time (milliseconds) the fetcher will wait after everything finishes before closing the page and returning the ` Response ` object.
109
+ :param page_action: Added for automation. A function that takes the `page` object and does the automation you need.
110
  :param wait_selector: Wait for a specific CSS selector to be in a specific state.
111
  :param init_script: An absolute path to a JavaScript file to be executed on page creation for all pages in this session.
112
  :param locale: Set the locale for the browser if wanted. The default value is `en-US`.
 
217
  :param google_search: Enabled by default, Scrapling will set the referer header to be as if this request came from a Google search of this website's domain name.
218
  :param timeout: The timeout in milliseconds that is used in all operations and waits through the page. The default is 30,000
219
  :param wait: The time (milliseconds) the fetcher will wait after everything finishes before closing the page and returning the ` Response ` object.
220
+ :param page_action: Added for automation. A function that takes the `page` object and does the automation you need.
221
  :param extra_headers: A dictionary of extra headers to add to the request. _The referer set by the `google_search` argument takes priority over the referer set here if used together._
222
  :param disable_resources: Drop requests of unnecessary resources for a speed boost. It depends, but it made requests ~25% faster in my tests for some websites.
223
  Requests dropped are of type `font`, `image`, `media`, `beacon`, `object`, `imageset`, `texttrack`, `websocket`, `csp_report`, and `stylesheet`.
 
360
  :param network_idle: Wait for the page until there are no network connections for at least 500 ms.
361
  :param timeout: The timeout in milliseconds that is used in all operations and waits through the page. The default is 30,000
362
  :param wait: The time (milliseconds) the fetcher will wait after everything finishes before closing the page and returning the ` Response ` object.
363
+ :param page_action: Added for automation. A function that takes the `page` object and does the automation you need.
364
  :param wait_selector: Wait for a specific CSS selector to be in a specific state.
365
  :param init_script: An absolute path to a JavaScript file to be executed on page creation for all pages in this session.
366
  :param locale: Set the locale for the browser if wanted. The default value is `en-US`.
 
478
  :param google_search: Enabled by default, Scrapling will set the referer header to be as if this request came from a Google search of this website's domain name.
479
  :param timeout: The timeout in milliseconds that is used in all operations and waits through the page. The default is 30,000
480
  :param wait: The time (milliseconds) the fetcher will wait after everything finishes before closing the page and returning the ` Response ` object.
481
+ :param page_action: Added for automation. A function that takes the `page` object and does the automation you need.
482
  :param extra_headers: A dictionary of extra headers to add to the request. _The referer set by the `google_search` argument takes priority over the referer set here if used together._
483
  :param disable_resources: Drop requests of unnecessary resources for a speed boost. It depends, but it made requests ~25% faster in my tests for some websites.
484
  Requests dropped are of type `font`, `image`, `media`, `beacon`, `object`, `imageset`, `texttrack`, `websocket`, `csp_report`, and `stylesheet`.
scrapling/fetchers.py CHANGED
@@ -96,7 +96,7 @@ class StealthyFetcher(BaseFetcher):
96
  :param os_randomize: If enabled, Scrapling will randomize the OS fingerprints used. The default is Scrapling matching the fingerprints with the current OS.
97
  :param wait: The time (milliseconds) the fetcher will wait after everything finishes before closing the page and returning the ` Response ` object.
98
  :param timeout: The timeout in milliseconds that is used in all operations and waits through the page. The default is 30,000
99
- :param page_action: Added for automation. A function that takes the `page` object, does the automation you need, then returns `page` again.
100
  :param wait_selector: Wait for a specific CSS selector to be in a specific state.
101
  :param init_script: An absolute path to a JavaScript file to be executed on page creation with this request.
102
  :param geoip: Recommended to use with proxies; Automatically use IP's longitude, latitude, timezone, country, locale, and spoof the WebRTC IP address.
@@ -194,7 +194,7 @@ class StealthyFetcher(BaseFetcher):
194
  :param os_randomize: If enabled, Scrapling will randomize the OS fingerprints used. The default is Scrapling matching the fingerprints with the current OS.
195
  :param wait: The time (milliseconds) the fetcher will wait after everything finishes before closing the page and returning the ` Response ` object.
196
  :param timeout: The timeout in milliseconds that is used in all operations and waits through the page. The default is 30,000
197
- :param page_action: Added for automation. A function that takes the `page` object, does the automation you need, then returns `page` again.
198
  :param wait_selector: Wait for a specific CSS selector to be in a specific state.
199
  :param init_script: An absolute path to a JavaScript file to be executed on page creation with this request.
200
  :param geoip: Recommended to use with proxies; Automatically use IP's longitude, latitude, timezone, country, locale, and spoof the WebRTC IP address.
@@ -299,7 +299,7 @@ class DynamicFetcher(BaseFetcher):
299
  :param network_idle: Wait for the page until there are no network connections for at least 500 ms.
300
  :param timeout: The timeout in milliseconds that is used in all operations and waits through the page. The default is 30,000
301
  :param wait: The time (milliseconds) the fetcher will wait after everything finishes before closing the page and returning the ` Response ` object.
302
- :param page_action: Added for automation. A function that takes the `page` object, does the automation you need, then returns `page` again.
303
  :param wait_selector: Wait for a specific CSS selector to be in a specific state.
304
  :param init_script: An absolute path to a JavaScript file to be executed on page creation with this request.
305
  :param locale: Set the locale for the browser if wanted. The default value is `en-US`.
@@ -385,7 +385,7 @@ class DynamicFetcher(BaseFetcher):
385
  :param network_idle: Wait for the page until there are no network connections for at least 500 ms.
386
  :param timeout: The timeout in milliseconds that is used in all operations and waits through the page. The default is 30,000
387
  :param wait: The time (milliseconds) the fetcher will wait after everything finishes before closing the page and returning the ` Response ` object.
388
- :param page_action: Added for automation. A function that takes the `page` object, does the automation you need, then returns `page` again.
389
  :param wait_selector: Wait for a specific CSS selector to be in a specific state.
390
  :param init_script: An absolute path to a JavaScript file to be executed on page creation with this request.
391
  :param locale: Set the locale for the browser if wanted. The default value is `en-US`.
 
96
  :param os_randomize: If enabled, Scrapling will randomize the OS fingerprints used. The default is Scrapling matching the fingerprints with the current OS.
97
  :param wait: The time (milliseconds) the fetcher will wait after everything finishes before closing the page and returning the ` Response ` object.
98
  :param timeout: The timeout in milliseconds that is used in all operations and waits through the page. The default is 30,000
99
+ :param page_action: Added for automation. A function that takes the `page` object and does the automation you need.
100
  :param wait_selector: Wait for a specific CSS selector to be in a specific state.
101
  :param init_script: An absolute path to a JavaScript file to be executed on page creation with this request.
102
  :param geoip: Recommended to use with proxies; Automatically use IP's longitude, latitude, timezone, country, locale, and spoof the WebRTC IP address.
 
194
  :param os_randomize: If enabled, Scrapling will randomize the OS fingerprints used. The default is Scrapling matching the fingerprints with the current OS.
195
  :param wait: The time (milliseconds) the fetcher will wait after everything finishes before closing the page and returning the ` Response ` object.
196
  :param timeout: The timeout in milliseconds that is used in all operations and waits through the page. The default is 30,000
197
+ :param page_action: Added for automation. A function that takes the `page` object and does the automation you need.
198
  :param wait_selector: Wait for a specific CSS selector to be in a specific state.
199
  :param init_script: An absolute path to a JavaScript file to be executed on page creation with this request.
200
  :param geoip: Recommended to use with proxies; Automatically use IP's longitude, latitude, timezone, country, locale, and spoof the WebRTC IP address.
 
299
  :param network_idle: Wait for the page until there are no network connections for at least 500 ms.
300
  :param timeout: The timeout in milliseconds that is used in all operations and waits through the page. The default is 30,000
301
  :param wait: The time (milliseconds) the fetcher will wait after everything finishes before closing the page and returning the ` Response ` object.
302
+ :param page_action: Added for automation. A function that takes the `page` object and does the automation you need.
303
  :param wait_selector: Wait for a specific CSS selector to be in a specific state.
304
  :param init_script: An absolute path to a JavaScript file to be executed on page creation with this request.
305
  :param locale: Set the locale for the browser if wanted. The default value is `en-US`.
 
385
  :param network_idle: Wait for the page until there are no network connections for at least 500 ms.
386
  :param timeout: The timeout in milliseconds that is used in all operations and waits through the page. The default is 30,000
387
  :param wait: The time (milliseconds) the fetcher will wait after everything finishes before closing the page and returning the ` Response ` object.
388
+ :param page_action: Added for automation. A function that takes the `page` object and does the automation you need.
389
  :param wait_selector: Wait for a specific CSS selector to be in a specific state.
390
  :param init_script: An absolute path to a JavaScript file to be executed on page creation with this request.
391
  :param locale: Set the locale for the browser if wanted. The default value is `en-US`.