Karim Shoair committed
Commit 5df81ed · 1 Parent(s): edde4f1

docs: update the DynamicFetcher page

Files changed (1): docs/fetching/dynamic.md (+115 −72)
# Introduction

Here, we will discuss the `DynamicFetcher` class (previously known as `PlayWrightFetcher`). This class provides flexible browser automation with multiple configuration options and some stealth capabilities.

As we will explain later, automating the page requires some knowledge of [Playwright's Page API](https://playwright.dev/python/docs/api/class-page).

## Basic Usage
You have one primary way to import this Fetcher, which is the same for all fetchers.

```python
>>> from scrapling.fetchers import DynamicFetcher
```
Check out how to configure the parsing options [here](choosing.md#parser-configuration-in-all-fetchers).

Now, we will review most of the arguments one by one with examples. If you want to jump to a table of all arguments for quick reference, [click here](#full-list-of-arguments).

> Notes:
>
> 1. Every time you fetch a website with this fetcher, it waits by default for all JavaScript to fully load and execute, so you don't have to (it waits for the `domcontentloaded` state).
> 2. Of course, the async version of the `fetch` method is the `async_fetch` method.
 
This fetcher currently provides four main run options, which can be mixed as desired.

Which are:

### 1. Vanilla Playwright
```python
DynamicFetcher.fetch('https://example.com')
```
Using it in that manner will open a Chromium browser and load the page. There are no tricks or extra features unless you enable some; it's just the plain Playwright API.
 
### 2. Stealth Mode
```python
DynamicFetcher.fetch('https://example.com', stealth=True)
```
It's the same as the vanilla Playwright option, but it provides a simple stealth mode suitable for websites with small to medium protection layers.

Some of the things this fetcher's stealth mode does include:

* Patching the CDP runtime fingerprint.
* Mimicking some real-browser properties by injecting several JS files and using custom options.
* Using custom flags on launch to hide Playwright even more and make it faster.
* Generating real browser headers matching the browser type and the user's OS, then appending them to the request's headers.
 
### 3. Real Chrome
```python
DynamicFetcher.fetch('https://example.com', real_chrome=True)
```
If you have Google Chrome installed, use this option. It's the same as the first option, but it will use the Google Chrome browser installed on your device instead of Chromium.

This will make your requests look more authentic, so they are less detectable, and you can even combine it with `stealth=True` for better results, like below:
```python
DynamicFetcher.fetch('https://example.com', real_chrome=True, stealth=True)
```
If you don't have Google Chrome installed and want to use this option, you can use the command below in the terminal to install it for the library instead of installing it manually:
```commandline
playwright install chrome
```
 
### 4. CDP Connection
```python
DynamicFetcher.fetch('https://example.com', cdp_url='ws://localhost:9222')
```
Instead of launching a browser locally (Chromium/Google Chrome), you can connect to a remote browser through the [Chrome DevTools Protocol](https://chromedevtools.github.io/devtools-protocol/).
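How you obtain the CDP URL depends on where the browser runs; Scrapling only consumes it. As an illustration (not part of Scrapling's API): if a local Chrome was started with `--remote-debugging-port=9222`, the websocket endpoint can be read from the DevTools `/json/version` HTTP endpoint, whose JSON body carries it in the `webSocketDebuggerUrl` field:

```python
import json


def cdp_url_from_version(payload: str) -> str:
    """Extract the websocket endpoint from a DevTools `/json/version` response body."""
    return json.loads(payload)['webSocketDebuggerUrl']


# Example (trimmed) payload shape as served by http://localhost:9222/json/version
sample = '{"Browser": "Chrome/120.0.0.0", "webSocketDebuggerUrl": "ws://localhost:9222/devtools/browser/abc123"}'
```

You would fetch that endpoint with any HTTP client and pass the extracted URL as `cdp_url`.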
 
 
 
 
 
 
 
## Full list of arguments
Scrapling provides many options with this fetcher. To make it as simple as possible, we will list the options here and give examples of using most of them.

| Argument | Description | Optional |
|:-------------------:|-------------|:--------:|
| url | Target url | ❌ |
| headless | Pass `True` to run the browser in headless/hidden (**default**) or `False` for headful/visible mode. | ✔️ |
| disable_resources | Drop requests of unnecessary resources for a speed boost. It depends, but it made requests ~25% faster in my tests for some websites.<br/>Requests dropped are of type `font`, `image`, `media`, `beacon`, `object`, `imageset`, `texttrack`, `websocket`, `csp_report`, and `stylesheet`. _This can help save your proxy usage, but be cautious with this option, as it may cause some websites to never finish loading._ | ✔️ |
| cookies | Set cookies for the next request. | ✔️ |
| useragent | Pass a useragent string to be used. **Otherwise, the fetcher will generate and use a real useragent of the same browser.** | ✔️ |
| network_idle | Wait for the page until there are no network connections for at least 500 ms. | ✔️ |
| timeout | The timeout (milliseconds) used in all operations and waits through the page. The default is 30,000 ms (30 seconds). | ✔️ |
| wait | The time (milliseconds) the fetcher will wait after everything finishes before closing the page and returning the `Response` object. | ✔️ |
| page_action | Added for automation. Pass a function that takes the `page` object and does the necessary automation, then returns `page` again. | ✔️ |
| wait_selector | Wait for a specific CSS selector to be in a specific state. | ✔️ |
| wait_selector_state | Scrapling will wait for the given state to be fulfilled for the selector given with `wait_selector`. _Default state is `attached`._ | ✔️ |
| google_search | Enabled by default; Scrapling will set the referer header as if this request came from a Google search of this website's domain name. | ✔️ |
| extra_headers | A dictionary of extra headers to add to the request. _The referer set by the `google_search` argument takes priority over the referer set here if used together._ | ✔️ |
| proxy | The proxy to be used with requests. It can be a string or a dictionary with the keys 'server', 'username', and 'password' only. | ✔️ |
| hide_canvas | Add random noise to canvas operations to prevent fingerprinting. | ✔️ |
| disable_webgl | Disables WebGL and WebGL 2.0 support entirely. | ✔️ |
| stealth | Enables stealth mode; you should always check the documentation to see what the stealth mode does currently. | ✔️ |
| real_chrome | If you have a Chrome browser installed on your device, enable this, and the Fetcher will launch and use an instance of your browser. | ✔️ |
| locale | Set the locale for the browser if wanted. The default value is `en-US`. | ✔️ |
| cdp_url | Instead of launching a new browser instance, connect to this CDP URL to control real browsers through CDP. | ✔️ |
| selector_config | A dictionary of custom parsing arguments to be used when creating the final `Selector`/`Response` class. | ✔️ |
 

## Examples
It's easier to understand with examples, so let's take a look.

### Resource Control

```python
# Disable unnecessary resources
page = DynamicFetcher.fetch(
    'https://example.com',
    disable_resources=True  # Blocks fonts, images, media, etc...
)
```
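Conceptually, this option is a filter over each request's resource type; a rough sketch of the rule (the set mirrors the types listed in the arguments table, and `should_abort` is a hypothetical name, not Scrapling's actual code):

```python
# Resource types the `disable_resources` option drops, per the table above
BLOCKED_RESOURCE_TYPES = {
    'font', 'image', 'media', 'beacon', 'object', 'imageset',
    'texttrack', 'websocket', 'csp_report', 'stylesheet',
}


def should_abort(resource_type: str) -> bool:
    """Return True when a request of this resource type should be dropped."""
    return resource_type in BLOCKED_RESOURCE_TYPES
```

Note that `document`, `script`, and `xhr`/`fetch` requests pass through, which is why pages still render and run their JavaScript.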
 

```python
# Wait for network idle (Consider fetch to be finished when there are no network connections for at least 500 ms)
page = DynamicFetcher.fetch('https://example.com', network_idle=True)

# Custom timeout (in milliseconds)
page = DynamicFetcher.fetch('https://example.com', timeout=30000)  # 30 seconds

# Proxy support
page = DynamicFetcher.fetch(
    'https://example.com',
    proxy='http://username:password@host:port'  # Or it can be a dictionary with the keys 'server', 'username', and 'password' only
)
```
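If you prefer the dictionary form of the `proxy` argument, the string form can be split into it mechanically; `proxy_as_dict` below is a hypothetical helper for illustration, not part of Scrapling:

```python
from urllib.parse import urlsplit


def proxy_as_dict(proxy_url: str) -> dict:
    """Split 'scheme://username:password@host:port' into the dictionary
    form the `proxy` argument also accepts ('server', 'username', 'password')."""
    parts = urlsplit(proxy_url)
    return {
        'server': f'{parts.scheme}://{parts.hostname}:{parts.port}',
        'username': parts.username or '',
        'password': parts.password or '',
    }
```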

### Browser Automation
This is where your knowledge of [Playwright's Page API](https://playwright.dev/python/docs/api/class-page) comes into play. The function you pass here takes the page object from Playwright's API, performs the desired actions, and then returns it for the current fetcher to continue working on it.

This function is executed immediately after waiting for `network_idle` (if enabled) and before waiting for the `wait_selector` argument, allowing it to be used for various purposes, not just automation. You can alter the page as you want.

In the example below, I used page [mouse events](https://playwright.dev/python/docs/api/class-mouse) to move the mouse wheel to scroll the page and then move the mouse.
```python
from playwright.sync_api import Page

def scroll_page(page: Page):
    page.mouse.wheel(10, 0)
    page.mouse.move(100, 400)
    page.mouse.up()
    return page

page = DynamicFetcher.fetch(
    'https://example.com',
    page_action=scroll_page
)
```
 
And here is the async version of it:
```python
from playwright.async_api import Page

async def scroll_page(page: Page):
    await page.mouse.wheel(10, 0)
    await page.mouse.move(100, 400)
    await page.mouse.up()
    return page

page = await DynamicFetcher.async_fetch(
    'https://example.com',
    page_action=scroll_page
)
```
 

```python
# Wait for the selector
page = DynamicFetcher.fetch(
    'https://example.com',
    wait_selector='h1',
    wait_selector_state='visible'
)
```
This is the last wait the fetcher will do before returning the response (if enabled). You pass a CSS selector to the `wait_selector` argument, and the fetcher will wait for the state you passed in the `wait_selector_state` argument to be fulfilled. If you didn't pass a state, the default would be `attached`, which means it will wait for the element to be present in the DOM.

After that, the fetcher will check again whether all JS files are loaded and executed (the `domcontentloaded` state) or continue waiting for them. If you have enabled `network_idle` with this, the fetcher will wait for `network_idle` to be fulfilled again, as explained above.

The states the fetcher can wait for can be any of the following ([source](https://playwright.dev/python/docs/api/class-page#page-wait-for-selector)):

- `attached`: Wait for an element to be present in the DOM.
- `detached`: Wait for an element to not be present in the DOM.
- `visible`: Wait for an element to have a non-empty bounding box and no `visibility:hidden`. Note that an element without any content or with `display:none` has an empty bounding box and is not considered visible.
- `hidden`: Wait for an element to be either detached from the DOM, or have an empty bounding box, or `visibility:hidden`. This is the opposite of the `'visible'` option.

### Some Stealth Features

```python
# Full stealth mode
page = DynamicFetcher.fetch(
    'https://example.com',
    stealth=True,
    hide_canvas=True,
    disable_webgl=True
)

# Custom user agent
page = DynamicFetcher.fetch(
    'https://example.com',
    useragent='Mozilla/5.0...'
)

# Set browser locale
page = DynamicFetcher.fetch(
    'https://example.com',
    locale='en-US'
)
```
Note that the `hide_canvas` argument doesn't disable the canvas but instead hides it by adding random noise to canvas operations, preventing fingerprinting. Also, if you didn't set a useragent (preferred), the fetcher will generate a real useragent of the same browser and use it.

The `google_search` argument is enabled by default, making the request look as if it came from a Google search page. So, a request for `https://example.com` will set the referer to `https://www.google.com/search?q=example`. Also, if used together, it takes priority over the referer set by the `extra_headers` argument.
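The referer that `google_search` produces can be sketched as follows; `google_referer` is a hypothetical helper for illustration, and the simple suffix-stripping here is an assumption that won't handle multi-part TLDs like `.co.uk`:

```python
from urllib.parse import urlsplit


def google_referer(url: str) -> str:
    """Build a referer like the one `google_search` sets: a Google
    search for the website's domain name."""
    hostname = urlsplit(url).hostname or ''
    # Strip a leading 'www.' and the last dotted suffix to get the bare name
    domain = hostname.removeprefix('www.').rsplit('.', 1)[0]
    return f'https://www.google.com/search?q={domain}'
```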
 
### General example
```python
from scrapling.fetchers import DynamicFetcher

def scrape_dynamic_content():
    # Use Playwright for JavaScript content
    page = DynamicFetcher.fetch(
        'https://example.com/dynamic',
        network_idle=True,
        wait_selector='.content'
    )

    # (Illustrative) extract some data from the page
    return {
        'title': page.css_first('h1::text'),
        'content': page.css_first('.content::text'),
    }
```

## Session Management

To keep the browser open while you make multiple requests with the same configuration, use the `DynamicSession`/`AsyncDynamicSession` classes. These classes accept all the arguments that the `fetch` method takes, which lets you specify a config for the entire session.

```python
from scrapling.fetchers import DynamicSession

# Create a session with a default configuration
with DynamicSession(
    headless=True,
    stealth=True,
    disable_resources=True,
    real_chrome=True
) as session:
    # Make multiple requests with the same browser instance
    page1 = session.fetch('https://example1.com')
    page2 = session.fetch('https://example2.com')
    page3 = session.fetch('https://dynamic-site.com')

    # All requests reuse the same tab on the same browser instance
```

### Async Session Usage

```python
import asyncio
from scrapling.fetchers import AsyncDynamicSession

async def scrape_multiple_sites():
    async with AsyncDynamicSession(
        stealth=True,
        network_idle=True,
        timeout=30000
    ) as session:
        # Make async requests with shared browser configuration
        pages = await asyncio.gather(
            session.fetch('https://spa-app1.com'),
            session.fetch('https://spa-app2.com'),
            session.fetch('https://dynamic-content.com')
        )
        return pages
```

### Session Benefits

- **Browser reuse**: Much faster subsequent requests by reusing the same browser instance.
- **Cookie persistence**: Automatic cookie and session-state handling, as any browser does.
- **Consistent fingerprint**: The same browser fingerprint across all requests.
- **Memory efficiency**: Better resource usage compared to launching a new browser with each fetch.

## When to Use

Use `DynamicFetcher` when you:

- Need browser automation
- Want multiple browser options