Commit db02435

backtrack
1 parent: d381223

File tree: 9 files changed (+4, -33 lines)


content/academy/advanced_web_scraping.md

Lines changed: 1 addition & 1 deletion

```diff
@@ -17,4 +17,4 @@ Just like the [**Web scraping for beginners**]({{@link web_scraping_for_beginner
 
 ## [](#first-up) First up
 
-This course's [first lesson]({{@link advanced_web_scraping/crawling/scraping_paginated_sites.md}}) dives head-first into one of the most valuable skills you can have as a scraper developer: **Scraping paginated sites**.
+This course's [first lesson]({{@link advanced_web_scraping/scraping_paginated_sites.md}}) dives head-first into one of the most valuable skills you can have as a scraper developer: **Scraping paginated sites**.
```

content/academy/advanced_web_scraping/crawling.md

Lines changed: 0 additions & 10 deletions
This file was deleted.

content/academy/advanced_web_scraping/data_collection.md

Lines changed: 0 additions & 10 deletions
This file was deleted.

content/academy/advanced_web_scraping/data_collection/mobile_app_scraping.md

Lines changed: 0 additions & 9 deletions
This file was deleted.

content/academy/advanced_web_scraping/crawling/scraping_paginated_sites.md renamed to content/academy/advanced_web_scraping/scraping_paginated_sites.md

Lines changed: 3 additions & 3 deletions
```diff
@@ -3,14 +3,14 @@ title: Scraping paginated sites
 description: Learn how to extract all of a website's listings even if they limit the number of results pages. See code examples for setting up your scraper.
 menuWeight: 1
 paths:
-  - advanced-web-scraping/crawling/scraping-paginated-sites
+  - advanced-web-scraping/scraping-paginated-sites
 ---
 
 # Scraping websites with limited pagination
 
 Limited pagination is a common practice on e-commerce sites and is becoming more popular over time. It makes sense: a real user will never want to look through more than 200 pages of results – only bots love unlimited pagination. Fortunately, there are ways to overcome this limit while keeping our code clean and generic.
 
-![Pagination in on Google search results page]({{@asset advanced_web_scraping/crawling/images/pagination.webp}})
+![Pagination in on Google search results page]({{@asset advanced_web_scraping/images/pagination.webp}})
 
 > In a rush? Skip the tutorial and get the [full code example](https://github.com/metalwarrior665/apify-utils/tree/master/examples/crawler-with-filters).
 
@@ -52,7 +52,7 @@ This has several benefits:
 
 In the previous section, we analyzed different options to split the pages to overcome the pagination limit. We have chosen range filters as the most reliable way to do that. In this section, we will discuss a generic algorithm to work with ranges, look at a few special cases and then write an example crawler.
 
-![An example of range filters on a website]({{@asset advanced_web_scraping/crawling/images/pagination-filters.webp}})
+![An example of range filters on a website]({{@asset advanced_web_scraping/images/pagination-filters.webp}})
 
 ### [](#the-algorithm) The algorithm
```
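The diff context above describes splitting pages into ranges to overcome a pagination limit. As a minimal sketch of that idea (not the lesson's actual code): recursively halve a filter range until each sub-range reports fewer results than the site's pagination limit. The function and `fetchCount` names are hypothetical; in a real crawler, `fetchCount` would be an HTTP request reading the result count for a filtered search.

```javascript
// Hypothetical sketch of the range-splitting approach. `fetchCount` stands
// in for a request that returns how many results the site reports for a
// given price range.
function splitIntoScrapableRanges(min, max, fetchCount, paginationLimit) {
    const count = fetchCount(min, max);
    // The range fits under the pagination limit (or cannot be split
    // further), so it can be scraped directly.
    if (count <= paginationLimit || min === max) return [[min, max]];
    // Otherwise, halve the range and recurse into both halves.
    const mid = Math.floor((min + max) / 2);
    return [
        ...splitIntoScrapableRanges(min, mid, fetchCount, paginationLimit),
        ...splitIntoScrapableRanges(mid + 1, max, fetchCount, paginationLimit),
    ];
}

// Usage with a fake count function: 10 results per unit of price.
const fakeCount = (min, max) => (max - min + 1) * 10;
const ranges = splitIntoScrapableRanges(0, 15, fakeCount, 50);
console.log(ranges); // four ranges of 4 price units, 40 results each
```

Each resulting sub-range can then be enqueued as its own filtered search URL, keeping every listing reachable within the site's page limit.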

0 commit comments
