
Commit c3db3c7

Cleanup docs and readme
1 parent 576b935 commit c3db3c7

File tree: README.md · docs/tools.ipynb

2 files changed: +113 -32 lines


README.md

Lines changed: 102 additions & 19 deletions. The updated file:

# LangChain – ScraperAPI

Give your AI agent the ability to browse websites, search Google and Amazon in just two lines of code.

The `langchain-scraperapi` package adds three ready-to-use LangChain tools backed by the [ScraperAPI](https://www.scraperapi.com/) service:

| Tool class | Use it to |
|------------|----------|
| `ScraperAPITool` | Grab the HTML/text/markdown of any web page |
| `ScraperAPIGoogleSearchTool` | Get structured Google Search SERP data |
| `ScraperAPIAmazonSearchTool` | Get structured Amazon product-search data |

## Installation

```bash
pip install -U langchain-scraperapi
```

## Setup

Create an account at https://www.scraperapi.com/ and get an API key, then set it as an environment variable:

```python
import os
os.environ["SCRAPERAPI_API_KEY"] = "your-api-key"
```

## Quick Start

### ScraperAPITool — Browse any website

Scrape HTML, text, or markdown from any webpage:

```python
from langchain_scraperapi.tools import ScraperAPITool

tool = ScraperAPITool()

# Get text content
result = tool.invoke({
    "url": "https://example.com",
    "output_format": "text",
    "render": True
})
print(result)
```

**Parameters:**
- `url` (required) – target page URL
- `output_format` – `"text"` | `"markdown"` (default returns HTML)
- `country_code` – e.g. `"us"`, `"de"`
- `device_type` – `"desktop"` | `"mobile"`
- `premium` – use premium proxies
- `render` – run JavaScript before returning content
- `keep_headers` – include response headers
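
The optional parameters above can be combined in a single call. A minimal sketch (the URL and parameter values are illustrative, not defaults):

```python
from langchain_scraperapi.tools import ScraperAPITool

tool = ScraperAPITool()

# Illustrative values: fetch a page as markdown through a German, mobile proxy
page_md = tool.invoke({
    "url": "https://example.com",
    "output_format": "markdown",
    "country_code": "de",
    "device_type": "mobile",
})
print(page_md)
```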

### ScraperAPIGoogleSearchTool — Structured Google Search

Get structured Google Search results:

```python
from langchain_scraperapi.tools import ScraperAPIGoogleSearchTool

google_search = ScraperAPIGoogleSearchTool()

results = google_search.invoke({
    "query": "what is langchain",
    "num": 20,
    "output_format": "json"
})
print(results)
```

**Parameters:**
- `query` (required) – search terms
- `output_format` – `"json"` (default) or `"csv"`
- `country_code`, `tld`, `num`, `hl`, `gl` – optional search modifiers
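
The optional modifiers work the same way. A sketch requesting CSV output with locale hints (the query and locale values are illustrative; `hl`/`gl` are assumed to take the usual Google language and country codes):

```python
from langchain_scraperapi.tools import ScraperAPIGoogleSearchTool

google_search = ScraperAPIGoogleSearchTool()

# Illustrative values: CSV output, English-language results for the US
csv_results = google_search.invoke({
    "query": "web scraping best practices",
    "output_format": "csv",
    "hl": "en",
    "gl": "us",
})
print(csv_results)
```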

### ScraperAPIAmazonSearchTool — Structured Amazon Search

Get structured Amazon product search results:

```python
from langchain_scraperapi.tools import ScraperAPIAmazonSearchTool

amazon_search = ScraperAPIAmazonSearchTool()

products = amazon_search.invoke({
    "query": "noise cancelling headphones",
    "tld": "co.uk",
    "page": 2
})
print(products)
```

**Parameters:**
- `query` (required) – product search terms
- `output_format` – `"json"` (default) or `"csv"`
- `country_code`, `tld`, `page` – optional search modifiers
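
The same pattern covers the remaining modifiers here as well; a sketch requesting CSV output routed through a US proxy (values are illustrative):

```python
from langchain_scraperapi.tools import ScraperAPIAmazonSearchTool

amazon_search = ScraperAPIAmazonSearchTool()

# Illustrative values: CSV output through a US proxy
products_csv = amazon_search.invoke({
    "query": "mechanical keyboard",
    "output_format": "csv",
    "country_code": "us",
})
print(products_csv)
```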
## Example: AI Agent that can browse the web

```python
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_scraperapi.tools import ScraperAPITool

# Set up tools and LLM
tools = [ScraperAPITool()]
llm = ChatOpenAI(model_name="gpt-4o", temperature=0)

# Create prompt
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant that can browse websites. Use ScraperAPITool to access web content."),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])

# Create and run agent
agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

response = agent_executor.invoke({
    "input": "Browse hackernews and summarize the top story"
})
```
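
The executor's `invoke` call returns a dict, with the agent's final answer typically under the `"output"` key, so a short follow-up to the snippet above is:

```python
# Continues the example above: print only the agent's final answer
print(response["output"])
```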

## Documentation

For complete parameter details and advanced usage, see the [ScraperAPI documentation](https://docs.scraperapi.com/python/making-requests/customizing-requests).

docs/tools.ipynb

Lines changed: 11 additions & 13 deletions. The changed hunks:

@@ -89,7 +89,7 @@
 " * `render`: `bool` – run JS before returning HTML \n",
 " * `keep_headers`: `bool` – include response headers \n",
 " \n",
-"For the complete set of modifiers see the ScraperAPI request-customisation docs."
+"For the complete set of modifiers see the [ScraperAPI request-customisation docs](https://docs.scraperapi.com/python/making-requests/customizing-requests)\n"
 ]
 },
 {
@@ -99,7 +99,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"from langchain_scraperapi import ScraperAPITool\n",
+"from langchain_scraperapi.tools import ScraperAPITool\n",
 "\n",
 "# Instantiate\n",
 "tool = ScraperAPITool()\n",
@@ -148,18 +148,18 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"from langchain_scraperapi import ScraperAPIGoogleSearchTool\n",
+"from langchain_scraperapi.tools import ScraperAPIGoogleSearchTool\n",
 "\n",
 "google_search = ScraperAPIGoogleSearchTool()\n",
 "\n",
-"serp_json = google_search.invoke(\n",
+"results = google_search.invoke(\n",
 " {\n",
-" \"query\": \"site:langchain.com vector store tutorial\",\n",
+" \"query\": \"what is langchain\",\n",
 " \"num\": 20,\n",
 " \"output_format\": \"json\",\n",
 " }\n",
 ")\n",
-"print(serp_json.keys())"
+"print(results)"
 ]
 },
 {
@@ -185,7 +185,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"from langchain_scraperapi import ScraperAPIAmazonSearchTool\n",
+"from langchain_scraperapi.tools import ScraperAPIAmazonSearchTool\n",
 "\n",
 "amazon_search = ScraperAPIAmazonSearchTool()\n",
 "\n",
@@ -196,7 +196,7 @@
 " \"page\": 2,\n",
 " }\n",
 ")\n",
-"print(products[:1])"
+"print(products)"
 ]
 },
 {
@@ -249,12 +249,10 @@
 "\n",
 "Below you can find more information on additional parameters to the tools to customize your requests.\n",
 "\n",
-"* `ScraperAPITool`\n",
-" <https://docs.scraperapi.com/python/making-requests/customizing-requests> \n",
-"* `ScraperAPIGoogleSearchTool` and `ScraperAPIAmazonSearchTool`\n",
-" <https://docs.scraperapi.com/python/making-requests/structured-data-collection-method> \n",
+"* [ScraperAPITool](https://docs.scraperapi.com/python/making-requests/customizing-requests)\n",
+"* [ScraperAPIGoogleSearchTool and ScraperAPIAmazonSearchTool](https://docs.scraperapi.com/python/making-requests/structured-data-collection-method)\n",
 "\n",
-"The LangChain wrappers surface these parameters directly, so consult the ScraperAPI docs for details."
+"The LangChain wrappers surface these parameters directly."
 ]
 }
 ],
