11========
2- scrapely
2+ Scrapely
33========
44
55Scrapely is a library for extracting structured data from HTML pages. Given
66some example web pages and the data to be extracted, scrapely constructs a
77parser for all similar pages.
88
9+ How does Scrapely relate to `Scrapy`_?
10+ ======================================
11+
12+ Despite the similarity in their names, Scrapely and `Scrapy`_ are quite
13+ different things. All they have in common is that both depend on
14+ `w3lib`_ and both are maintained by the same group of developers (which
15+ is why both are hosted on the `same GitHub account`_).
16+
17+ Scrapy is an application framework for building web crawlers, while Scrapely is
18+ a library for extracting structured data from HTML pages. If anything, Scrapely
19+ is more similar to `BeautifulSoup`_ or `lxml`_ than to Scrapy.
20+
21+ Scrapely doesn't depend on Scrapy, nor the other way around. In fact, it is
22+ quite common to use Scrapy without Scrapely, and vice versa.
23+
24+ If you are looking for a complete crawler-scraper solution, there is (at least)
25+ one project called `Slybot`_ that integrates both, but you can definitely use
26+ Scrapely with other web crawlers since it's just a library.
27+
28+ Scrapy has a built-in extraction mechanism called `selectors`_ which (unlike
29+ Scrapely) is based on XPath.
30+
931Usage (API)
1032===========
1133
@@ -111,10 +133,13 @@ To scrape another similar page with the already added templates::
111133Requirements
112134============
113135
114- * numpy
115- * w3lib
116- * simplejson or Python 2.6+
136+ Scrapely depends on the following libraries:
137+
138+ * numpy
139+ * w3lib
140+ * simplejson or Python 2.6+
117141
142+ Note that Scrapely **does not** depend on `Scrapy`_ in any way.
118143
119144Installation
120145============
@@ -131,7 +156,6 @@ And then install scrapely with::
131156
132157 aptitude install python-scrapely
133158
134-
135159Architecture
136160============
137161
@@ -165,3 +189,11 @@ License
165189=======
166190
167191Scrapely library is licensed under the BSD license.
192+
193+ .. _Scrapy: http://scrapy.org/
194+ .. _w3lib: https://github.com/scrapy/w3lib
195+ .. _BeautifulSoup: http://www.crummy.com/software/BeautifulSoup/
196+ .. _lxml: http://lxml.de/
197+ .. _same GitHub account: https://github.com/scrapy
198+ .. _slybot: https://github.com/scrapy/slybot
199+ .. _selectors: http://doc.scrapy.org/en/latest/topics/selectors.html