-# Playwright integration for Scrapy
+# scrapy-playwright: Playwright integration for Scrapy
 [![version](https://img.shields.io/pypi/v/scrapy-playwright.svg)](https://pypi.python.org/pypi/scrapy-playwright)
 [![pyversions](https://img.shields.io/pypi/pyversions/scrapy-playwright.svg)](https://pypi.python.org/pypi/scrapy-playwright)
 [![Tests](https://github.com/scrapy-plugins/scrapy-playwright/actions/workflows/tests.yml/badge.svg)](https://github.com/scrapy-plugins/scrapy-playwright/actions/workflows/tests.yml)
 [![codecov](https://codecov.io/gh/scrapy-plugins/scrapy-playwright/branch/master/graph/badge.svg)](https://codecov.io/gh/scrapy-plugins/scrapy-playwright)


-This project provides a Scrapy Download Handler which performs requests using
-[Playwright for Python](https://github.com/microsoft/playwright-python). It can be used to handle
-pages that require JavaScript. This package does not interfere with regular
-Scrapy workflows such as request scheduling or item processing.
+A Scrapy Download Handler which performs requests using
+[Playwright for Python](https://github.com/microsoft/playwright-python).
+It can be used to handle pages that require JavaScript (among other things),
+while adhering to the regular Scrapy workflow (i.e. without interfering
+with request scheduling, item processing, etc).


-## Motivation
+## Requirements

 After the release of [version 2.0](https://docs.scrapy.org/en/latest/news.html#scrapy-2-0-0-2020-03-03),
-which includes partial [coroutine syntax support](https://docs.scrapy.org/en/2.0/topics/coroutines.html)
-and experimental [asyncio support](https://docs.scrapy.org/en/2.0/topics/asyncio.html), Scrapy allows
+which includes [coroutine syntax support](https://docs.scrapy.org/en/2.0/topics/coroutines.html)
+and [asyncio support](https://docs.scrapy.org/en/2.0/topics/asyncio.html), Scrapy allows
 to integrate `asyncio`-based projects such as `Playwright`.


-## Requirements
+### Minimum required versions

 * Python >= 3.7
 * Scrapy >= 2.0 (!= 2.4.0)
@@ -50,8 +51,8 @@ DOWNLOAD_HANDLERS = {
 ```

 Note that the `ScrapyPlaywrightDownloadHandler` class inherits from the default
-`http/https` handler, and it will only use Playwright for requests that are
-explicitly marked (see the "Basic usage" section for details).
+`http/https` handler. Unless explicitly marked (see the "Basic usage"),
+requests will be processed by the regular Scrapy download handler.

 Also, be sure to [install the `asyncio`-based Twisted reactor](https://docs.scrapy.org/en/latest/topics/asyncio.html#installing-the-asyncio-reactor):

@@ -71,13 +72,6 @@ TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"
     A dictionary with options to be passed when launching the Browser.
     See the docs for [`BrowserType.launch`](https://playwright.dev/python/docs/api/class-browsertype#browser_typelaunchkwargs).

-* `PLAYWRIGHT_CONTEXT_ARGS` (type `dict`, default `{}`)
-
-    A dictionary with default keyword arguments to be passed when creating the
-    "default" Browser context.
-
-    **Deprecated: use `PLAYWRIGHT_CONTEXTS` instead**
-
 * `PLAYWRIGHT_CONTEXTS` (type `dict[str, dict]`, default `{}`)

     A dictionary which defines Browser contexts to be created on startup.
@@ -176,7 +170,6 @@ the callback needs to be defined as a coroutine function (`async def`).

 ```python
 import scrapy
-import playwright

 class AwesomeSpiderWithPage(scrapy.Spider):
     name = "page"
@@ -477,3 +470,15 @@ For more examples, please see the scripts in the [examples](examples) directory.

 * Specifying a proxy via the `proxy` Request meta key is not supported.
   Refer to the [Proxy support](#proxy-support) section for more information.
+
+
+## Deprecations
+
+* `PLAYWRIGHT_CONTEXT_ARGS` setting (type `dict`, default `{}`)
+
+    A dictionary with default keyword arguments to be
+    used when creating the "default" Browser context.
+
+    Deprecated since
+    [`v0.0.4`](https://github.com/scrapy-plugins/scrapy-playwright/releases/tag/v0.0.4),
+    use the `PLAYWRIGHT_CONTEXTS` setting instead
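
The deprecation notice in this diff says to replace `PLAYWRIGHT_CONTEXT_ARGS` with `PLAYWRIGHT_CONTEXTS`. The sketch below illustrates one way that migration might look for a plain settings dictionary. It is not part of scrapy-playwright: `migrate_context_args` is a hypothetical helper, and it assumes the legacy keyword arguments belong under a context named `"default"`, as the wording of the notice suggests. The `ignore_https_errors` and `is_mobile` values are only sample context options.

```python
# Hypothetical helper for illustration; it is not part of scrapy-playwright.
from copy import deepcopy


def migrate_context_args(settings: dict) -> dict:
    """Return a copy of `settings` with PLAYWRIGHT_CONTEXT_ARGS folded into
    PLAYWRIGHT_CONTEXTS under the "default" key (an assumed context name)."""
    migrated = deepcopy(settings)
    legacy_kwargs = migrated.pop("PLAYWRIGHT_CONTEXT_ARGS", None)
    if legacy_kwargs:
        contexts = migrated.setdefault("PLAYWRIGHT_CONTEXTS", {})
        # An explicitly configured "default" context wins over the legacy value.
        contexts.setdefault("default", legacy_kwargs)
    return migrated


old_settings = {
    "PLAYWRIGHT_CONTEXT_ARGS": {"ignore_https_errors": True},
    "PLAYWRIGHT_CONTEXTS": {"mobile": {"is_mobile": True}},
}
new_settings = migrate_context_args(old_settings)
print(new_settings)
```

The helper returns a new dictionary and leaves its input unchanged, so it can be applied to a settings snapshot without side effects. In a real project the same change would typically be made by hand in `settings.py`.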