|
| 1 | +{ |
| 2 | + "cells": [ |
| 3 | + { |
| 4 | + "cell_type": "markdown", |
| 5 | + "metadata": {}, |
| 6 | + "source": [ |
| 7 | + "# Chatbot Example with Self Query Retriever\n", |
| 8 | + "[](https://colab.research.google.com/github/elastic/elasticsearch-labs/blob/main/notebooks/langchain/self-query-retriever-examples/chatbot-example.ipynb)\n", |
| 9 | + "\n", |
| 10 | + "This workbook demonstrates how to use the Elasticsearch-backed [Self-query retriever](https://api.python.langchain.com/en/latest/retrievers/langchain.retrievers.self_query.base.SelfQueryRetriever.html) to convert a question into a structured query and apply that structured query to an Elasticsearch index. \n", |
| 11 | + "\n", |
| 12 | + "Before we begin, we first split the documents into chunks with `langchain` and then, using [`ElasticsearchStore.from_documents`](https://api.python.langchain.com/en/latest/vectorstores/langchain.vectorstores.elasticsearch.ElasticsearchStore.html#langchain.vectorstores.elasticsearch.ElasticsearchStore.from_documents), we create a `vectorstore` and index the data into Elasticsearch.\n", |
| 13 | + "\n", |
| 14 | + "\n", |
| 15 | + "We will then walk through a few example queries demonstrating the full power of the Elasticsearch-powered self-query retriever.\n" |
| 16 | + ] |
| 17 | + }, |
| 18 | + { |
| 19 | + "cell_type": "markdown", |
| 20 | + "metadata": {}, |
| 21 | + "source": [ |
| 22 | + "## Install packages and import modules\n" |
| 23 | + ] |
| 24 | + }, |
| 25 | + { |
| 26 | + "cell_type": "code", |
| 27 | + "execution_count": 30, |
| 28 | + "metadata": {}, |
| 29 | + "outputs": [], |
| 40 | + "source": [ |
| 41 | + "!python3 -m pip install -qU lark elasticsearch langchain openai\n", |
| 42 | + "\n", |
| 43 | + "from langchain.schema import Document\n", |
| 44 | + "from langchain.embeddings.openai import OpenAIEmbeddings\n", |
| 45 | + "from langchain.vectorstores import ElasticsearchStore\n", |
| 46 | + "from langchain.llms import OpenAI\n", |
| 47 | + "from langchain.retrievers.self_query.base import SelfQueryRetriever\n", |
| 48 | + "from langchain.chains.query_constructor.base import AttributeInfo\n", |
| 49 | + "from getpass import getpass" |
| 50 | + ] |
| 51 | + }, |
| 52 | + { |
| 53 | + "cell_type": "markdown", |
| 54 | + "metadata": {}, |
| 55 | + "source": [ |
| 56 | + "## Create documents \n", |
| 57 | + "Next, we will create a list of documents containing movie summaries, using the [langchain Schema Document](https://api.python.langchain.com/en/latest/schema/langchain.schema.document.Document.html), where each document has `page_content` and `metadata`.\n", |
| 58 | + "\n" |
| 59 | + ] |
| 60 | + }, |
| 61 | + { |
| 62 | + "cell_type": "code", |
| 63 | + "execution_count": 67, |
| 64 | + "metadata": {}, |
| 65 | + "outputs": [], |
| 66 | + "source": [ |
| 67 | + "docs = [\n", |
| 68 | + " Document(\n", |
| 69 | + " page_content=\"A bunch of scientists bring back dinosaurs and mayhem breaks loose\",\n", |
| 70 | + " metadata={\"year\": 1993, \"rating\": 7.7, \"genre\": \"science fiction\", \"director\": \"Steven Spielberg\", \"title\": \"Jurassic Park\"},\n", |
| 71 | + " ),\n", |
| 72 | + " Document(\n", |
| 73 | + " page_content=\"Leo DiCaprio gets lost in a dream within a dream within a dream within a ...\",\n", |
| 74 | + " metadata={\"year\": 2010, \"director\": \"Christopher Nolan\", \"rating\": 8.2, \"title\": \"Inception\"},\n", |
| 75 | + " ),\n", |
| 76 | + " Document(\n", |
| 77 | + " page_content=\"A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea\",\n", |
| 78 | + " metadata={\"year\": 2006, \"director\": \"Satoshi Kon\", \"rating\": 8.6, \"title\": \"Paprika\"},\n", |
| 79 | + " ),\n", |
| 80 | + " Document(\n", |
| 81 | + " page_content=\"A bunch of normal-sized women are supremely wholesome and some men pine after them\",\n", |
| 82 | + " metadata={\"year\": 2019, \"director\": \"Greta Gerwig\", \"rating\": 8.3, \"title\": \"Little Women\"},\n", |
| 83 | + " ),\n", |
| 84 | + " Document(\n", |
| 85 | + " page_content=\"Toys come alive and have a blast doing so\",\n", |
| 86 | + " metadata={\"year\": 1995, \"genre\": \"animated\", \"director\": \"John Lasseter\", \"rating\": 8.3, \"title\": \"Toy Story\"},\n", |
| 87 | + " ),\n", |
| 88 | + " Document(\n", |
| 89 | + " page_content=\"Three men walk into the Zone, three men walk out of the Zone\",\n", |
| 90 | + " metadata={\n", |
| 91 | + " \"year\": 1979,\n", |
| 92 | + " \"rating\": 9.9,\n", |
| 93 | + " \"director\": \"Andrei Tarkovsky\",\n", |
| 94 | + " \"genre\": \"science fiction\",\n", |
| 96 | + " \"title\": \"Stalker\",\n", |
| 97 | + " },\n", |
| 98 | + " ),\n", |
| 99 | + "]" |
| 100 | + ] |
| 101 | + }, |
| 102 | + { |
| 103 | + "cell_type": "markdown", |
| 104 | + "metadata": {}, |
| 105 | + "source": [ |
| 106 | + "## Connect to Elasticsearch\n", |
| 107 | + "\n", |
| 108 | + "ℹ️ We're using an Elastic Cloud deployment of Elasticsearch for this notebook. If you don't have an Elastic Cloud deployment, sign up [here](https://cloud.elastic.co/registration?utm_source=github&utm_content=elasticsearch-labs-notebook) for a free trial. \n", |
| 109 | + "\n", |
| 110 | + "We'll use the **Cloud ID** to identify our deployment, since we are using Elastic Cloud. To find the Cloud ID for your deployment, go to https://cloud.elastic.co/deployments and select your deployment.\n", |
| 111 | + "\n", |
| 112 | + "\n", |
| 113 | + "We will use [ElasticsearchStore](https://api.python.langchain.com/en/latest/vectorstores/langchain.vectorstores.elasticsearch.ElasticsearchStore.html) to connect to our Elastic Cloud deployment, which makes it easy to create the index and ingest the data. We also pass in the list of documents that we created in the previous step." |
| 114 | + ] |
| 115 | + }, |
| 116 | + { |
| 117 | + "cell_type": "code", |
| 118 | + "execution_count": 68, |
| 119 | + "metadata": {}, |
| 120 | + "outputs": [], |
| 121 | + "source": [ |
| 122 | + "# https://www.elastic.co/search-labs/tutorials/install-elasticsearch/elastic-cloud#finding-your-cloud-id\n", |
| 123 | + "ELASTIC_CLOUD_ID = getpass(\"Elastic Cloud ID: \")\n", |
| 124 | + "\n", |
| 125 | + "# https://www.elastic.co/search-labs/tutorials/install-elasticsearch/elastic-cloud#creating-an-api-key\n", |
| 126 | + "ELASTIC_API_KEY = getpass(\"Elastic Api Key: \")\n", |
| 127 | + "\n", |
| 128 | + "# https://platform.openai.com/api-keys\n", |
| 129 | + "OPENAI_API_KEY = getpass(\"OpenAI API key: \")\n", |
| 130 | + "\n", |
| 131 | + "embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY)\n", |
| 132 | + "\n", |
| 133 | + "\n", |
| 134 | + "vectorstore = ElasticsearchStore.from_documents(\n", |
| 135 | + " docs, \n", |
| 136 | + " embeddings, \n", |
| 137 | + " index_name=\"elasticsearch-self-query-demo\", \n", |
| 138 | + " es_cloud_id=ELASTIC_CLOUD_ID, \n", |
| 139 | + " es_api_key=ELASTIC_API_KEY\n", |
| 140 | + ")\n" |
| 141 | + ] |
| 142 | + }, |
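Conceptually, `ElasticsearchStore.from_documents` embeds each document's text and then indexes the text, its vector, and its metadata together. The sketch below is illustrative only: `fake_embed` and `to_index_actions` are hypothetical helpers standing in for the real embedding call and bulk-indexing logic, not the library's API.

```python
def fake_embed(text):
    # Stand-in embedding; real vectors would come from OpenAIEmbeddings.
    return [float(len(text) % 7), float(len(text) % 3)]

def to_index_actions(docs, index_name):
    # Build one bulk-index action per document: text + vector + metadata.
    actions = []
    for doc in docs:
        actions.append({
            "_index": index_name,
            "_source": {
                "text": doc["page_content"],
                "vector": fake_embed(doc["page_content"]),
                "metadata": doc["metadata"],
            },
        })
    return actions

docs_ = [{"page_content": "Toys come alive", "metadata": {"year": 1995}}]
actions = to_index_actions(docs_, "elasticsearch-self-query-demo")
```

Keeping `metadata` as a nested object is what later lets the self-query retriever filter on fields like `metadata.year`.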
| 143 | + { |
| 144 | + "cell_type": "markdown", |
| 145 | + "metadata": {}, |
| 146 | + "source": [ |
| 147 | + "## Setup query retriever\n", |
| 148 | + "\n", |
| 149 | + "Next, we will instantiate the self-query retriever, providing some information about our document attributes and a short description of the document contents. \n", |
| 150 | + "\n", |
| 151 | + "We will then instantiate the retriever with [SelfQueryRetriever.from_llm](https://api.python.langchain.com/en/latest/retrievers/langchain.retrievers.self_query.base.SelfQueryRetriever.html)" |
| 152 | + ] |
| 153 | + }, |
| 154 | + { |
| 155 | + "cell_type": "code", |
| 156 | + "execution_count": 80, |
| 157 | + "metadata": {}, |
| 158 | + "outputs": [], |
| 159 | + "source": [ |
| 160 | + "# Add details about metadata fields\n", |
| 161 | + "metadata_field_info = [\n", |
| 162 | + " AttributeInfo(\n", |
| 163 | + " name=\"genre\",\n", |
| 164 | + " description=\"The genre of the movie. Can be either 'science fiction' or 'animated'.\",\n", |
| 165 | + " type=\"string or list[string]\",\n", |
| 166 | + " ),\n", |
| 167 | + " AttributeInfo(\n", |
| 168 | + " name=\"year\",\n", |
| 169 | + " description=\"The year the movie was released\",\n", |
| 170 | + " type=\"integer\",\n", |
| 171 | + " ),\n", |
| 172 | + " AttributeInfo(\n", |
| 173 | + " name=\"director\",\n", |
| 174 | + " description=\"The name of the movie director\",\n", |
| 175 | + " type=\"string\",\n", |
| 176 | + " ),\n", |
| 177 | + " AttributeInfo(\n", |
| 178 | + " name=\"rating\", description=\"A 1-10 rating for the movie\", type=\"float\"\n", |
| 179 | + " ),\n", |
| 180 | + "]\n", |
| 181 | + "\n", |
| 182 | + "document_content_description = \"Brief summary of a movie\"\n", |
| 183 | + "\n", |
| 184 | + "# Set up openAI llm with sampling temperature 0\n", |
| 185 | + "llm = OpenAI(temperature=0, openai_api_key=OPENAI_API_KEY)\n", |
| 186 | + "\n", |
| 187 | + "# instantiate retriever\n", |
| 188 | + "retriever = SelfQueryRetriever.from_llm(\n", |
| 189 | + " llm, vectorstore, document_content_description, metadata_field_info, verbose=True\n", |
| 190 | + ")\n" |
| 191 | + ] |
| 192 | + }, |
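To build intuition for what the retriever does with the attributes above, the sketch below is illustrative only (`filter_to_es` is a hypothetical helper, not LangChain's actual Elasticsearch translator): a question decomposes into a semantic query plus metadata comparisons, and each comparison could map onto a clause of an Elasticsearch `bool` query.

```python
def filter_to_es(field, op, value):
    """Translate one (field, operator, value) triple into an ES filter clause."""
    if op == "eq":
        return {"term": {f"metadata.{field}.keyword": value}}
    if op in ("gt", "gte", "lt", "lte"):
        return {"range": {f"metadata.{field}": {op: value}}}
    raise ValueError(f"unsupported operator: {op}")

# "movies about dreams released after 1992 but before 2007" might decompose into:
semantic_query = "dreams"
filters = [("year", "gt", 1992), ("year", "lt", 2007)]

es_query = {
    "bool": {
        "must": [{"match": {"text": semantic_query}}],
        "filter": [filter_to_es(f, op, v) for f, op, v in filters],
    }
}
```

The `must` clause scores documents against the semantic part of the question, while the `filter` clauses restrict results without affecting scoring.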
| 193 | + { |
| 194 | + "cell_type": "markdown", |
| 195 | + "metadata": {}, |
| 196 | + "source": [ |
| 197 | + "# Question Answering with Self-Query Retriever\n", |
| 198 | + "\n", |
| 199 | + "We will now demonstrate how to use the self-query retriever for retrieval-augmented generation (RAG)." |
| 200 | + ] |
| 201 | + }, |
| 202 | + { |
| 203 | + "cell_type": "code", |
| 204 | + "execution_count": 77, |
| 205 | + "metadata": {}, |
| 206 | + "outputs": [ |
| 207 | + { |
| 208 | + "data": { |
| 209 | + "text/plain": [ |
| 210 | + "AIMessage(content='Inception (2010)')" |
| 211 | + ] |
| 212 | + }, |
| 213 | + "execution_count": 77, |
| 214 | + "metadata": {}, |
| 215 | + "output_type": "execute_result" |
| 216 | + } |
| 217 | + ], |
| 218 | + "source": [ |
| 219 | + "from langchain.chat_models import ChatOpenAI\n", |
| 220 | + "from langchain.schema.runnable import RunnableParallel, RunnablePassthrough\n", |
| 221 | + "from langchain.prompts import ChatPromptTemplate, PromptTemplate\n", |
| 222 | + "from langchain.schema import format_document\n", |
| 223 | + "\n", |
| 224 | + "LLM_CONTEXT_PROMPT = ChatPromptTemplate.from_template(\"\"\"\n", |
| 225 | + "Use the movies below, which matched the user's question, as context. Only use these movies to answer the user's question.\n", |
| 226 | + "\n", |
| 227 | + "If you don't know the answer, just say that you don't know, don't try to make up an answer.\n", |
| 228 | + "\n", |
| 229 | + "----\n", |
| 230 | + "{context}\n", |
| 231 | + "----\n", |
| 232 | + "Question: {question}\n", |
| 233 | + "Answer:\n", |
| 234 | + "\"\"\")\n", |
| 235 | + "\n", |
| 236 | + "DOCUMENT_PROMPT = PromptTemplate.from_template(\"\"\"\n", |
| 237 | + "---\n", |
| 238 | + "title: {title} \n", |
| 239 | + "year: {year} \n", |
| 240 | + "director: {director} \n", |
| 241 | + "---\n", |
| 242 | + "\"\"\")\n", |
| 243 | + "\n", |
| 244 | + "def _combine_documents(\n", |
| 245 | + " docs, document_prompt=DOCUMENT_PROMPT, document_separator=\"\\n\\n\"\n", |
| 246 | + "):\n", |
| 247 | + " doc_strings = [format_document(doc, document_prompt) for doc in docs]\n", |
| 248 | + " return document_separator.join(doc_strings)\n", |
| 249 | + "\n", |
| 250 | + "\n", |
| 251 | + "_context = RunnableParallel(\n", |
| 252 | + " context=retriever | _combine_documents,\n", |
| 253 | + " question=RunnablePassthrough(),\n", |
| 254 | + ")\n", |
| 255 | + "\n", |
| 256 | + "chain = _context | LLM_CONTEXT_PROMPT | ChatOpenAI(temperature=0, openai_api_key=OPENAI_API_KEY)\n", |
| 257 | + "\n", |
| 258 | + "chain.invoke(\"What movies are about dreams and was released after the year 1992 but before 2007?\")" |
| 259 | + ] |
| 260 | + } |
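The LCEL chain above can be dense on first reading. Here is a plain-Python sketch of the same data flow, with the retriever and LLM replaced by stand-in functions (`fake_retriever` is hypothetical, for illustration only): the question fans out in parallel to the retriever and a passthrough, the retrieved documents are rendered with the document prompt and joined, and the result is formatted into the final prompt sent to the model.

```python
def fake_retriever(question):
    # Stand-in for the self-query retriever: returns matching documents.
    return [{"title": "Inception", "year": 2010, "director": "Christopher Nolan"}]

def combine_documents(docs):
    # Mirrors _combine_documents: render each doc with the document prompt.
    return "\n\n".join(
        f"---\ntitle: {d['title']}\nyear: {d['year']}\ndirector: {d['director']}\n---"
        for d in docs
    )

def build_prompt(question):
    # Mirrors RunnableParallel(context=..., question=...) feeding LLM_CONTEXT_PROMPT.
    context = combine_documents(fake_retriever(question))
    return (
        "Use the movies below to answer the user's question.\n"
        f"----\n{context}\n----\nQuestion: {question}\nAnswer:"
    )

prompt = build_prompt("What movies are about dreams?")
```

In the real chain, this prompt string is what the chat model receives before producing the `AIMessage` shown above.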
| 261 | + ], |
| 262 | + "metadata": { |
| 263 | + "kernelspec": { |
| 264 | + "display_name": "Python 3.11.4 64-bit", |
| 265 | + "language": "python", |
| 266 | + "name": "python3" |
| 267 | + }, |
| 268 | + "language_info": { |
| 269 | + "codemirror_mode": { |
| 270 | + "name": "ipython", |
| 271 | + "version": 3 |
| 272 | + }, |
| 273 | + "file_extension": ".py", |
| 274 | + "mimetype": "text/x-python", |
| 275 | + "name": "python", |
| 276 | + "nbconvert_exporter": "python", |
| 277 | + "pygments_lexer": "ipython3", |
| 278 | + "version": "3.10.3" |
| 279 | + }, |
| 280 | + "orig_nbformat": 4, |
| 281 | + "vscode": { |
| 282 | + "interpreter": { |
| 283 | + "hash": "b0fa6594d8f4cbf19f97940f81e996739fb7646882a419484c72d19e05852a7e" |
| 284 | + } |
| 285 | + } |
| 286 | + }, |
| 287 | + "nbformat": 4, |
| 288 | + "nbformat_minor": 2 |
| 289 | +} |