Commit 576b935 (parent 2f76999)
Updates docs and examples

6 files changed: +328 −20 lines


docs/tools.ipynb

Lines changed: 59 additions & 20 deletions
@@ -5,17 +5,35 @@
 "id": "d3a12ba8",
 "metadata": {},
 "source": [
-"# LangChain – ScraperAPI Integration\n",
+"# LangChain – ScraperAPI\n",
 "\n",
-"`langchain-scraperapi` adds **three ready-to-use LangChain tools** backed by the ScraperAPI service:\n",
+"Give your AI agent the ability to browse websites, search Google and Amazon in just two lines of code.\n",
 "\n",
-"| Tool class | Endpoint type | Typical use-case |\n",
-"|------------|---------------|------------------|\n",
-"| `ScraperAPITool` | **Raw** `/` | Grab the HTML (or text/markdown) of _any_ web page |\n",
-"| `ScraperAPIGoogleSearchTool` | **Structured** `/structured/google/search` | Get structured Google Search SERP data |\n",
-"| `ScraperAPIAmazonSearchTool` | **Structured** `/structured/amazon/search` | Get structured Amazon product-search data |\n",
+"The `langchain-scraperapi` package adds three ready-to-use LangChain tools backed by the [ScraperAPI](https://www.scraperapi.com/) service:\n",
 "\n",
-"The Python implementation lives in the **`langchain-scraperapi`** package and is shown in the attached source files :contentReference[oaicite:0]{index=0}."
+"| Tool class | Use it to |\n",
+"|------------|------------------|\n",
+"| `ScraperAPITool` | Grab the HTML/text/markdown of any web page |\n",
+"| `ScraperAPIGoogleSearchTool` | Get structured Google Search SERP data |\n",
+"| `ScraperAPIAmazonSearchTool` | Get structured Amazon product-search data |\n",
+"\n",
+"## Overview\n",
+"\n",
+"### Integration details\n",
+"\n",
+"| Package | Serializable | [JS support](https://js.langchain.com/docs/integrations/tools/__module_name__) | Package latest |\n",
+"| :--- | :---: | :---: | :---: |\n",
+"| [langchain-scraperapi](https://pypi.org/project/langchain-scraperapi/) | ❌ | ❌ | v0.1.0 |"
+]
+},
+{
+"cell_type": "markdown",
+"id": "d1f7c70f",
+"metadata": {},
+"source": [
+"### Setup\n",
+"\n",
+"Install the `langchain-scraperapi` package."
 ]
 },
 {
@@ -25,19 +43,38 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"# Installation and credentials\n",
-"# ------------------------------------------------------------------\n",
-"%pip install -U langchain-scraperapi  # uncomment if needed\n",
+"%pip install -U langchain-scraperapi"
+]
+},
+{
+"cell_type": "markdown",
+"id": "c111d2fb",
+"metadata": {},
+"source": [
+"### Credentials\n",
+"\n",
+"Create an account at https://www.scraperapi.com/ and get an API key."
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"id": "4d315465",
+"metadata": {},
+"outputs": [],
+"source": [
 "import os\n",
-"os.environ[\"SCRAPERAPI_API_KEY\"] = \"your-api-key\"  # obtain from https://www.scraperapi.com/"
+"os.environ[\"SCRAPERAPI_API_KEY\"] = \"your-api-key\""
 ]
 },
 {
 "cell_type": "markdown",
 "id": "051ef7b1",
 "metadata": {},
 "source": [
-"## 1 `ScraperAPITool` — raw page scraping\n",
+"## Features\n",
+"\n",
+"### 1. `ScraperAPITool` — browse any website\n",
 "\n",
 "Invoke the *raw* ScraperAPI endpoint and get HTML, rendered DOM, text, or markdown.\n",
 "\n",
@@ -93,7 +130,7 @@
 "id": "9f2947dd",
 "metadata": {},
 "source": [
-"## 2 `ScraperAPIGoogleSearchTool` — structured Google Search\n",
+"### 2. `ScraperAPIGoogleSearchTool` — structured Google Search\n",
 "\n",
 "Structured SERP data via `/structured/google/search`.\n",
 "\n",
@@ -130,7 +167,7 @@
 "id": "3dc2f845",
 "metadata": {},
 "source": [
-"## 3 `ScraperAPIAmazonSearchTool` — structured Amazon Search\n",
+"### 3. `ScraperAPIAmazonSearchTool` — structured Amazon Search\n",
 "\n",
 "Structured product results via `/structured/amazon/search`.\n",
 "\n",
@@ -167,9 +204,9 @@
 "id": "607eb8c8",
 "metadata": {},
 "source": [
-"## Using the tools in an AI agent.\n",
+"### Example: Make an AI agent that can browse the web\n",
 "\n",
-"Here is an example of using the tools in an agent. This gives the AI the ability to browse any website, summarize articles, and click on links to navigate between pages."
+"Here is an example of using the tools in an AI agent. The `ScraperAPITool` gives the AI the ability to browse any website, summarize articles, and click on links to navigate between pages."
 ]
 },
 {
@@ -210,12 +247,14 @@
 "source": [
 "## Further reading\n",
 "\n",
-"* **ScraperAPI – request customisation** \n",
+"Below you can find more information on additional parameters to the tools to customize your requests.\n",
+"\n",
+"* `ScraperAPITool`\n",
 "  <https://docs.scraperapi.com/python/making-requests/customizing-requests> \n",
-"* **ScraperAPI – structured endpoints** \n",
+"* `ScraperAPIGoogleSearchTool` and `ScraperAPIAmazonSearchTool`\n",
 "  <https://docs.scraperapi.com/python/making-requests/structured-data-collection-method> \n",
 "\n",
-"The LangChain wrappers surface these parameters directly, so consult the ScraperAPI docs for edge-case options and rate-limit details."
+"The LangChain wrappers surface these parameters directly, so consult the ScraperAPI docs for details."
 ]
 }
 ],
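The updated Credentials cell sets `SCRAPERAPI_API_KEY` directly in `os.environ`. As a minimal, standard-library sketch (the `require_api_key` helper is ours, not part of the package), the lookup-with-a-loud-failure pattern the notebook relies on looks like this:

```python
import os


def require_api_key(var: str = "SCRAPERAPI_API_KEY") -> str:
    """Return the API key from the environment, failing loudly if it is unset."""
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(f"{var} is not set; get a key at https://www.scraperapi.com/")
    return key


# The notebook sets the variable directly before any tool is constructed:
os.environ["SCRAPERAPI_API_KEY"] = "your-api-key"
print(require_api_key())  # → your-api-key
```

In real use you would load the key from a `.env` file or a secrets manager rather than hard-coding it in the notebook.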

examples/.env.example

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+SCRAPERAPI_API_KEY=xxx
+OPENAI_API_KEY=yyy
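The example scripts read this file via `dotenv.load_dotenv()`. A simplified, standard-library sketch of what that call does (the real `python-dotenv` package also handles quoting, `export` prefixes, and variable interpolation, which this toy parser does not):

```python
import os
import tempfile


def load_env_file(path):
    """Minimal .env parser: KEY=VALUE lines; blanks and '#' comments are ignored.
    Values are exposed via os.environ without overriding existing variables."""
    values = {}
    with open(path) as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
            os.environ.setdefault(key.strip(), value.strip())
    return values


# Demonstrate on the same two entries as examples/.env.example
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as fh:
    fh.write("SCRAPERAPI_API_KEY=xxx\nOPENAI_API_KEY=yyy\n")
    env_path = fh.name

values = load_env_file(env_path)
print(values)  # → {'SCRAPERAPI_API_KEY': 'xxx', 'OPENAI_API_KEY': 'yyy'}
```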

examples/README.md

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+Included are some examples of agents that can be built using this package.
+
+First, create a `.env` file with `SCRAPERAPI_API_KEY` and `OPENAI_API_KEY`.
+
+The web browsing and Amazon search agents use Streamlit to create a chatbot interface. To run them, first `pip install streamlit`, then launch with `streamlit run web_browsing_agent.py`.

examples/amazon_search_agent.py

Lines changed: 128 additions & 0 deletions
@@ -0,0 +1,128 @@
+import os
+import dotenv
+import streamlit as st
+import json
+
+from langchain_openai import ChatOpenAI
+from langchain.agents import AgentExecutor, create_tool_calling_agent
+from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
+from langchain_core.messages import AIMessage
+from langchain_community.chat_message_histories import StreamlitChatMessageHistory
+from langchain_scraperapi.tools import ScraperAPIAmazonSearchTool
+
+dotenv.load_dotenv()
+
+amazon_search_tool = ScraperAPIAmazonSearchTool()
+
+tools = [amazon_search_tool]
+
+llm = ChatOpenAI(model_name="gpt-4.1", temperature=0)
+
+prompt = ChatPromptTemplate.from_messages(
+    [
+        ("system",
+         f"You are a helpful assistant that searches for products on Amazon. "
+         f"When asked to search Amazon, use the '{amazon_search_tool.name}' tool. "
+         "The tool requires a 'query' parameter for the search term. "
+         "It can also optionally take a 'page' parameter for the page number (defaults to 1). "
+         "Please extract the search query and page number (if the user specifies one) from the user's request."),
+        MessagesPlaceholder(variable_name="chat_history", optional=True),
+        ("human", "{input}"),
+        MessagesPlaceholder(variable_name="agent_scratchpad"),
+    ]
+)
+
+agent = create_tool_calling_agent(llm, tools, prompt)
+agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
+
+st.set_page_config(page_title="📦 Amazon Search Agent", page_icon="🛒")
+
+st.markdown("""
+<style>
+img {
+    max-width: 500px !important;
+    height: auto;
+}
+</style>
+""", unsafe_allow_html=True)
+
+st.title("📦 Amazon Search Agent")
+st.caption("I can search Amazon for you! Just tell me what you're looking for.")
+
+def render_tool_expander(intermediate_steps):
+    if intermediate_steps:
+        with st.expander("🔎 Tool Interaction Details", expanded=False):
+            for step in intermediate_steps:
+                action = step[0]
+                observation = step[1]
+
+                st.markdown("**Tool Called:**")
+                st.text(action.tool)
+                st.markdown("**Tool Input:**")
+                if isinstance(action.tool_input, dict):
+                    st.json(action.tool_input)
+                else:
+                    st.text(action.tool_input)
+
+                st.markdown("**Tool Output (Observation):**")
+                try:
+                    if isinstance(observation, (dict, list)):
+                        st.json(observation)
+                    elif isinstance(observation, str):
+                        parsed_json = json.loads(observation)
+                        st.json(parsed_json)
+                    else:
+                        st.text(str(observation))
+                except json.JSONDecodeError:
+                    st.text(str(observation))
+                except Exception:
+                    st.text(str(observation))
+
+
+msgs = StreamlitChatMessageHistory(key="amazon_search_messages")
+if len(msgs.messages) == 0:
+    msgs.add_ai_message("Hello! How can I help you search Amazon today?")
+
+for msg in msgs.messages:
+    st.chat_message(msg.type).write(msg.content)
+    if msg.type == "ai" and "intermediate_steps" in msg.additional_kwargs:
+        render_tool_expander(msg.additional_kwargs["intermediate_steps"])
+
+
+if user_query := st.chat_input("What should I search for on Amazon?"):
+    msgs.add_user_message(user_query)
+    st.chat_message("user").write(user_query)
+
+    agent_input = {"input": user_query, "chat_history": msgs.messages[:-1]}
+
+    with st.chat_message("ai"):
+        final_response_content = ""
+        intermediate_steps = []
+
+        try:
+            with st.spinner("Agent is working..."):
+                agent_executor_with_steps = AgentExecutor(
+                    agent=agent,
+                    tools=tools,
+                    verbose=True,
+                    return_intermediate_steps=True
+                )
+                response_with_steps = agent_executor_with_steps.invoke(agent_input)
+                final_response_content = response_with_steps.get("output", "Sorry, I couldn't process that request.")
+                intermediate_steps = response_with_steps.get("intermediate_steps", [])
+
+            render_tool_expander(intermediate_steps)
+
+        except Exception as e:
+            st.error(f"An error occurred: {e}")
+            final_response_content = "I encountered an error trying to process your request."
+            with st.expander("🔎 Tool Interaction Details", expanded=True):
+                st.error(f"Error during agent execution: {e}")
+
+        st.write(final_response_content)
+
+        ai_message_to_add = AIMessage(
+            content=final_response_content,
+            additional_kwargs={"intermediate_steps": intermediate_steps}
+        )
+        msgs.add_message(ai_message_to_add)
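The interesting detail in `render_tool_expander` above is the JSON-or-text fallback for tool output: dicts and lists render as JSON, strings are parsed as JSON when possible, and everything else falls back to plain text. Factored out as a pure function (the `classify_observation` name is ours; the script inlines this logic), the decision looks like this:

```python
import json


def classify_observation(observation):
    """Mirror the expander's display logic: return ("json", data) when the
    observation is structured or parses as JSON, else ("text", string)."""
    if isinstance(observation, (dict, list)):
        return ("json", observation)
    if isinstance(observation, str):
        try:
            return ("json", json.loads(observation))
        except json.JSONDecodeError:
            return ("text", observation)
    return ("text", str(observation))


print(classify_observation('{"results": []}'))  # → ('json', {'results': []})
print(classify_observation("not json"))         # → ('text', 'not json')
```

Keeping this branch pure makes it easy to unit-test the rendering decision without spinning up Streamlit.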

examples/basic_agent.py

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+import dotenv
+from langchain_openai import ChatOpenAI
+from langchain.agents import AgentExecutor, create_tool_calling_agent
+from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
+from langchain_scraperapi.tools import ScraperAPIAmazonSearchTool
+
+dotenv.load_dotenv()
+
+tools = [ScraperAPIAmazonSearchTool(output_format="json")]
+llm = ChatOpenAI(model_name="gpt-4o", temperature=0)
+
+prompt = ChatPromptTemplate.from_messages(
+    [
+        ("system", "You are a helpful assistant that can browse Amazon for users. When asked to browse the website or look for a product, use the ScraperAPIAmazonSearchTool."),
+        ("human", "{input}"),
+        MessagesPlaceholder(variable_name="agent_scratchpad"),
+    ]
+)
+
+agent = create_tool_calling_agent(llm, tools, prompt)
+agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
+response = agent_executor.invoke({
+    "input": "can you give me the cheapest mother's day gifts"
+})
+print(response)
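Under the hood, `ScraperAPIAmazonSearchTool` wraps the `/structured/amazon/search` endpoint mentioned in the docs. A standard-library sketch of the request it builds, assuming the key is passed as an `api_key` query parameter (the `query` and `page` names come from the agent prompt above; `api_key` and the exact URL shape are assumptions, so check the ScraperAPI docs before relying on them):

```python
import os
import urllib.parse
import urllib.request

# Endpoint path taken from the docs diff; host is an assumption.
API_BASE = "https://api.scraperapi.com/structured/amazon/search"


def build_search_url(query, page=1, api_key=None):
    """Build the structured Amazon search URL the tool presumably requests."""
    params = {
        "api_key": api_key or os.environ.get("SCRAPERAPI_API_KEY", ""),
        "query": query,
        "page": page,
    }
    return API_BASE + "?" + urllib.parse.urlencode(params)


url = build_search_url("mother's day gifts", api_key="demo")
print(url)

# Only hit the network when a real key is configured.
if os.environ.get("SCRAPERAPI_API_KEY"):
    with urllib.request.urlopen(build_search_url("mother's day gifts")) as resp:
        print(resp.read()[:200])
```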
