
Commit 0ad59dc

Merge pull request #1 from tom-doerr/patch-4
Patch 4
2 parents 1a56e69 + a9d173f commit 0ad59dc

37 files changed (+2110, -857 lines)

README.md

Lines changed: 10 additions & 3 deletions
@@ -58,10 +58,16 @@ Ditto! **DSPy** gives you the right general-purpose modules (e.g., `ChainOfThoug
 
 All you need is:
 
-```
+```bash
 pip install dspy-ai
 ```
 
+To install the very latest from `main`:
+
+```bash
+pip install git+https://github.com/stanfordnlp/dspy.git
+````
+
 Or open our intro notebook in Google Colab: [<img align="center" src="https://colab.research.google.com/assets/colab-badge.svg" />](https://colab.research.google.com/github/stanfordnlp/dspy/blob/main/intro.ipynb)
 
 By default, DSPy installs the latest `openai` from pip. However, if you install old version before OpenAI changed their API `openai~=0.28.1`, the library will use that just fine. Both are supported.
@@ -95,7 +101,7 @@ The DSPy documentation is divided into **tutorials** (step-by-step illustration
 
 - [DSPy talk at ScaleByTheBay Nov 2023](https://www.youtube.com/watch?v=Dt3H2ninoeY).
 - [DSPy webinar with MLOps Learners](https://www.youtube.com/watch?v=im7bCLW2aM4), a bit longer with Q&A.
-- Hands-on Overviews of DSPy by the community: [DSPy Explained! by Connor Shorten](https://www.youtube.com/watch?v=41EfOY0Ldkc), [DSPy explained by code_your_own_ai](https://www.youtube.com/watch?v=ycfnKPxBMck)
+- Hands-on Overviews of DSPy by the community: [DSPy Explained! by Connor Shorten](https://www.youtube.com/watch?v=41EfOY0Ldkc), [DSPy explained by code_your_own_ai](https://www.youtube.com/watch?v=ycfnKPxBMck), [DSPy Crash Course by AI Bites](https://youtu.be/5-zgASQKkKQ?si=3gnmVouT5_rpk_nu)
 - Interviews: [Weaviate Podcast in-person](https://www.youtube.com/watch?v=CDung1LnLbY), and you can find 6-7 other remote podcasts on YouTube from a few different perspectives/audiences.
 - **Tracing in DSPy** with Arize Phoenix: [Tutorial for tracing your prompts and the steps of your DSPy programs](https://colab.research.google.com/github/Arize-ai/phoenix/blob/main/tutorials/tracing/dspy_tracing_tutorial.ipynb)
 - [DSPy: Not Your Average Prompt Engineering](https://jina.ai/news/dspy-not-your-average-prompt-engineering), why it's crucial for future prompt engineering, and yet why it is challenging for prompt engineers to learn.
@@ -142,6 +148,7 @@ You can find other examples tweeted by [@lateinteraction](https://twitter.com/la
 - [DSPy on BIG-Bench Hard Example, by Chris Levy](https://drchrislevy.github.io/posts/dspy/dspy.html)
 - [Using Ollama with DSPy for Mistral (quantized) by @jrknox1977](https://gist.github.com/jrknox1977/78c17e492b5a75ee5bbaf9673aee4641)
 - [Using DSPy, "The Unreasonable Effectiveness of Eccentric Automatic Prompts" (paper) by VMware's Rick Battle & Teja Gollapudi, and interview at TheRegister](https://www.theregister.com/2024/02/22/prompt_engineering_ai_models/)
+- [Optimizing Performance of Open Source LM for Text-to-SQL using DSPy and vLLM, by Juan Ovalle](https://github.com/jjovalle99/DSPy-Text2SQL)
 - Typed DSPy (contributed by [@normal-computing](https://github.com/normal-computing))
 - [Using DSPy to train Gpt 3.5 on HumanEval by Thomas Ahle](https://github.com/stanfordnlp/dspy/blob/main/examples/functional/functional.ipynb)
 - [Building a chess playing agent using DSPy by Franck SN](https://medium.com/thoughts-on-machine-learning/building-a-chess-playing-agent-using-dspy-9b87c868f71e)
@@ -414,7 +421,7 @@ See [CONTRIBUTING.md](CONTRIBUTING.md) for a quickstart guide to contributing to
 
 **DSPy** is led by **Omar Khattab** at Stanford NLP with **Chris Potts** and **Matei Zaharia**.
 
-Key contributors and team members include **Arnav Singhvi**, **Krista Opsahl-Ong**, **Michael Ryan**, **Karel D'Oosterlinck**, **Shangyin Tan**, **Manish Shetty**, **Paridhi Maheshwari**, **Keshav Santhanam**, **Sri Vardhamanan**, **Eric Zhang**, **Hanna Moazam**, **Thomas Joshi**, **Saiful Haq**, **Ashutosh Sharma**, and **Herumb Shandilya**.
+Key contributors and team members include **Arnav Singhvi**, **Krista Opsahl-Ong**, **Michael Ryan**, **Cyrus Nouroozi**, **Kyle Caverly**, **Amir Mehr**, **Karel D'Oosterlinck**, **Shangyin Tan**, **Manish Shetty**, **Herumb Shandilya**, **Paridhi Maheshwari**, **Keshav Santhanam**, **Sri Vardhamanan**, **Eric Zhang**, **Hanna Moazam**, **Thomas Joshi**, **Saiful Haq**, and **Ashutosh Sharma**.
 
 **DSPy** includes important contributions from **Rick Battle** and **Igor Kotenkov**. It reflects discussions with **Peter Zhong**, **Haoze He**, **Lisa Li**, **David Hall**, **Ashwin Paranjape**, **Heather Miller**, **Chris Manning**, **Percy Liang**, and many others.
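For readers following either install path above, a minimal smoke test might look like the sketch below. It is not part of this commit; the model name and the assumption that an OpenAI API key is available in `OPENAI_API_KEY` are illustrative.

```python
import dspy

# Minimal check after `pip install dspy-ai` (assumes OPENAI_API_KEY is set).
turbo = dspy.OpenAI(model="gpt-3.5-turbo")
dspy.settings.configure(lm=turbo)

# A one-line signature: take a question, produce an answer.
qa = dspy.Predict("question -> answer")
print(qa(question="What is 2 + 2?").answer)
```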

docs/api/retrieval_model_clients/FaissRM.md

Lines changed: 2 additions & 2 deletions
@@ -40,7 +40,7 @@ The **FaissRM** module provides a retriever that uses an in-memory Faiss vector
 
 ```python
 import dspy
-from dspy.retrieve import FaissRM
+from dspy.retrieve.faiss_rm import FaissRM
 
 document_chunks = [
     "The superbowl this year was played between the San Francisco 49ers and the Kanasas City Chiefs",
@@ -59,4 +59,4 @@ frm = FaissRM(document_chunks)
 turbo = dspy.OpenAI(model="gpt-3.5-turbo")
 dspy.settings.configure(lm=turbo, rm=frm)
 print(frm(["I am in the mood for Chinese food"]))
-```
+```

docs/docs/building-blocks/1-language_models.md

Lines changed: 4 additions & 4 deletions
@@ -145,25 +145,25 @@ You need to host these models on your own GPU(s). Below, we include pointers for
 1. `dspy.HFClientTGI`: for HuggingFace models through the Text Generation Inference (TGI) system. [Tutorial: How do I install and launch the TGI server?](https://dspy-docs.vercel.app/docs/deep-dive/language_model_clients/local_models/HFClientTGI)
 
 ```python
-tgi_llama2 = dspy.HFClientTGI(model="meta-llama/Llama-2-7b-hf", port=8080, url="http://localhost")
+tgi_mistral = dspy.HFClientTGI(model="mistralai/Mistral-7B-Instruct-v0.2", port=8080, url="http://localhost")
 ```
 
 2. `dspy.HFClientVLLM`: for HuggingFace models through vLLM. [Tutorial: How do I install and launch the vLLM server?](https://dspy-docs.vercel.app/docs/deep-dive/language_model_clients/local_models/HFClientVLLM)
 
 ```python
-vllm_llama2 = dspy.HFClientVLLM(model="meta-llama/Llama-2-7b-hf", port=8080, url="http://localhost")
+vllm_mistral = dspy.HFClientVLLM(model="mistralai/Mistral-7B-Instruct-v0.2", port=8080, url="http://localhost")
 ```
 
 3. `dspy.HFModel` (experimental) [Tutorial: How do I initialize models using HFModel](https://dspy-docs.vercel.app/api/local_language_model_clients/HFModel)
 
 ```python
-llama = dspy.HFModel(model = 'meta-llama/Llama-2-7b-hf')
+mistral = dspy.HFModel(model = 'mistralai/Mistral-7B-Instruct-v0.2')
 ```
 
 4. `dspy.Ollama` (experimental) for open source models through [Ollama](https://ollama.com). [Tutorial: How do I install and use Ollama on a local computer?](https://dspy-docs.vercel.app/api/local_language_model_clients/Ollama)\n",
 
 ```python
-mistral_ollama = dspy.OllamaLocal(model='mistral')
+ollama_mistral = dspy.OllamaLocal(model='mistral')
 ```
 
 5. `dspy.ChatModuleClient` (experimental): [How do I install and use MLC?](https://dspy-docs.vercel.app/api/local_language_model_clients/MLC)
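Whichever local client is used, wiring it into DSPy follows the same pattern. The sketch below is not part of this diff; it uses the `OllamaLocal` client renamed above and assumes an Ollama server is already running locally with the `mistral` model pulled.

```python
import dspy

# Assumes `ollama serve` is running and `ollama pull mistral` has been done.
ollama_mistral = dspy.OllamaLocal(model='mistral')
dspy.settings.configure(lm=ollama_mistral)

# Any DSPy module now routes its calls through the configured local LM.
cot = dspy.ChainOfThought("question -> answer")
print(cot(question="Why is the sky blue?").answer)
```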

docs/docs/cheatsheet.md

Lines changed: 1 addition & 1 deletion
@@ -177,7 +177,7 @@ print(f"Question: {question}")
 print(f"Final Predicted Answer (after ReAct process): {result.answer}")
 ```
 
-### dspy.Retreive
+### dspy.Retrieve
 
 ```python
 colbertv2_wiki17_abstracts = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')
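For context around the corrected `dspy.Retrieve` heading, typical usage of that retriever with the ColBERTv2 endpoint shown in the cheatsheet looks roughly like this sketch (the query text and `k` value are illustrative, not from the commit):

```python
import dspy

# Configure the hosted ColBERTv2 Wikipedia 2017 abstracts index as the default RM.
colbertv2_wiki17_abstracts = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')
dspy.settings.configure(rm=colbertv2_wiki17_abstracts)

# dspy.Retrieve pulls the top-k passages from the configured RM.
retrieve = dspy.Retrieve(k=3)
top_passages = retrieve("When was the first FIFA World Cup held?").passages
for passage in top_passages:
    print(passage)
```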

docs/docs/deep-dive/retrieval_models_clients/Azure.mdx

Lines changed: 65 additions & 5 deletions
@@ -22,32 +22,59 @@ The constructor initializes an instance of the `AzureAISearchRM` class and sets
 - `search_api_key` (str): The API key for accessing the Azure AI Search service.
 - `search_index_name` (str): The name of the search index in the Azure AI Search service.
 - `field_text` (str): The name of the field containing text content in the search index. This field will be mapped to the "content" field in the dsp framework.
+- `field_vector` (Optional[str]): The name of the field containing vector content in the search index.
 - `k` (int, optional): The default number of top passages to retrieve. Defaults to 3.
+- `azure_openai_client` (Optional[openai.AzureOpenAI]): An instance of the AzureOpenAI client. Either openai_client or embedding_func must be provided. Defaults to None.
+- `openai_embed_model` (Optional[str]): The name of the OpenAI embedding model. Defaults to "text-embedding-ada-002".
+- `embedding_func` (Optional[Callable]): A function for generating embeddings. Either openai_client or embedding_func must be provided. Defaults to None.
 - `semantic_ranker` (bool, optional): Whether to use semantic ranking. Defaults to False.
 - `filter` (str, optional): Additional filter query. Defaults to None.
 - `query_language` (str, optional): The language of the query. Defaults to "en-Us".
 - `query_speller` (str, optional): The speller mode. Defaults to "lexicon".
 - `use_semantic_captions` (bool, optional): Whether to use semantic captions. Defaults to False.
 - `query_type` (Optional[QueryType], optional): The type of query. Defaults to QueryType.FULL.
 - `semantic_configuration_name` (str, optional): The name of the semantic configuration. Defaults to None.
+- `is_vector_search` (Optional[bool]): Whether to enable vector search. Defaults to False.
+- `is_hybrid_search` (Optional[bool]): Whether to enable hybrid search. Defaults to False.
+- `is_fulltext_search` (Optional[bool]): Whether to enable fulltext search. Defaults to True.
+- `vector_filter_mode` (Optional[VectorFilterMode]): The vector filter mode. Defaults to None.
 
-Available Query Types:
 
-SIMPLE
+**Available Query Types:**
+
+- SIMPLE
 """Uses the simple query syntax for searches. Search text is interpreted using a simple query
 #: language that allows for symbols such as +, * and "". Queries are evaluated across all
 #: searchable fields by default, unless the searchFields parameter is specified."""
-FULL
+- FULL
 """Uses the full Lucene query syntax for searches. Search text is interpreted using the Lucene
 #: query language which allows field-specific and weighted searches, as well as other advanced
 #: features."""
-SEMANTIC
+- SEMANTIC
 """Best suited for queries expressed in natural language as opposed to keywords. Improves
 #: precision of search results by re-ranking the top search results using a ranking model trained
 #: on the Web corpus.""
 
 More Details: https://learn.microsoft.com/en-us/azure/search/search-query-overview
 
+**Available Vector Filter Mode:**
+
+- POST_FILTER = "postFilter"
+"""The filter will be applied after the candidate set of vector results is returned. Depending on
+#: the filter selectivity, this can result in fewer results than requested by the parameter 'k'."""
+
+- PRE_FILTER = "preFilter"
+"""The filter will be applied before the search query."""
+
+More Details: https://learn.microsoft.com/en-us/azure/search/vector-search-filters
+
+**Note**
+
+- The `AzureAISearchRM` client allows you to perform Vector search, Hybrid search, or Full text search.
+- By default, the `AzureAISearchRM` client uses the Azure OpenAI Client for generating embeddings. If you want to use something else, you can provide your custom embedding_func, but either the openai_client or embedding_func must be provided.
+- If you need to enable semantic search, either with vector, hybrid, or full text search, then set the `semantic_ranker` flag to True.
+- If `semantic_ranker` is True, always set the `query_type` to QueryType.SEMANTIC and always provide the `semantic_configuration_name`.
+
 Example of the AzureAISearchRM constructor:
 
 ```python
@@ -56,14 +83,22 @@ AzureAISearchRM(
     search_api_key: str,
     search_index_name: str,
     field_text: str,
+    field_vector: Optional[str] = None,
     k: int = 3,
+    azure_openai_client: Optional[openai.AzureOpenAI] = None,
+    openai_embed_model: Optional[str] = "text-embedding-ada-002",
+    embedding_func: Optional[Callable] = None,
     semantic_ranker: bool = False,
     filter: str = None,
     query_language: str = "en-Us",
     query_speller: str = "lexicon",
     use_semantic_captions: bool = False,
     query_type: Optional[QueryType] = QueryType.FULL,
-    semantic_configuration_name: str = None
+    semantic_configuration_name: str = None,
+    is_vector_search: Optional[bool] = False,
+    is_hybrid_search: Optional[bool] = False,
+    is_fulltext_search: Optional[bool] = True,
+    vector_filter_mode: Optional[VectorFilterMode.PRE_FILTER] = None
 )
 ```
 
@@ -128,6 +163,31 @@ for result in retrieval_response:
     print("Text:", result.long_text, "\n")
 ```
 
+3. Example of Semantic Hybrid Search.
+
+```python
+from dspy.retrieve.azureaisearch_rm import AzureAISearchRM
+
+azure_search = AzureAISearchRM(
+    search_service_name="search_service_name",
+    search_api_key="search_api_key",
+    search_index_name="search_index_name",
+    field_text="field_text",
+    field_vector="field_vector",
+    k=3,
+    azure_openai_client="azure_openai_client",
+    openai_embed_model="text-embedding-ada-002"
+    semantic_ranker=True,
+    query_type=QueryType.SEMANTIC,
+    semantic_configuration_name="semantic_configuration_name",
+    is_hybrid_search=True,
+)
+
+retrieval_response = azure_search("What is Thermodynamics", k=3)
+for result in retrieval_response:
+    print("Text:", result.long_text, "\n")
+```
+
 ***
 
 <AuthorDetails name="Prajapati Harishkumar Kishorkumar"/>
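The new flags also allow a pure vector search, which the added examples do not cover. A hedged sketch follows, not part of the commit: the `AzureOpenAI` client construction and all endpoint, key, and field names are placeholders, and the parameter usage is inferred from the argument list documented above.

```python
from openai import AzureOpenAI  # openai>=1.x client, matching the `azure_openai_client` parameter above
from dspy.retrieve.azureaisearch_rm import AzureAISearchRM

# Placeholder endpoint and credentials; substitute your own Azure OpenAI resource values.
azure_openai_client = AzureOpenAI(
    api_key="azure_openai_api_key",
    api_version="2023-05-15",
    azure_endpoint="https://your-resource.openai.azure.com",
)

azure_search = AzureAISearchRM(
    search_service_name="search_service_name",
    search_api_key="search_api_key",
    search_index_name="search_index_name",
    field_text="field_text",
    field_vector="field_vector",
    k=3,
    azure_openai_client=azure_openai_client,
    openai_embed_model="text-embedding-ada-002",
    is_vector_search=True,  # vector-only retrieval, no keyword or semantic re-ranking component
)

retrieval_response = azure_search("What is Thermodynamics", k=3)
for result in retrieval_response:
    print("Text:", result.long_text, "\n")
```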

docs/docs/deep-dive/teleprompter/signature-optimizer.mdx

Lines changed: 1 addition & 1 deletion
@@ -82,7 +82,7 @@ Now we have the baseline pipeline ready to use, so let's try using the `COPRO` t
 Let's start by importing and initializing our teleprompter, for the metric we'll be using the same `validate_context_and_answer` imported and used above:
 
 ```python
-from dspy.teleprompt import COPRP
+from dspy.teleprompt import COPRO
 
 teleprompter = COPRO(
     metric=validate_context_and_answer,

docs/docs/faqs.md

Lines changed: 1 addition & 1 deletion
@@ -70,7 +70,7 @@ Open source libraries such as [RAGautouille](https://github.com/bclavie/ragatoui
 
 - **How do I turn off the cache? How do I export the cache?**
 
-You can turn off the cache by setting the [`cache_turn_on` flag to `False`](https://github.com/stanfordnlp/dspy/blob/9d8a40c477b9dd6dcdc007647b5b9ddad2b5657a/dsp/modules/cache_utils.py#L10).
+You can turn off the cache by setting the [`DSP_CACHEBOOL`](https://github.com/stanfordnlp/dspy/blob/main/dsp/modules/cache_utils.py#L9) environment variable to `False`, which disables the `cache_turn_on` flag.
 
 Your local cache will be saved to the global env directory `os.environ["DSP_NOTEBOOK_CACHEDIR"]` which you can usually set to `os.path.join(repo_path, 'cache')` and export this cache from here.
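Since `cache_utils.py` reads these environment variables when the library is imported, they have to be set beforehand. A small sketch of the workflow described in the updated answer, with an illustrative cache path:

```python
import os

# Disable the LM-call cache; set this before importing dspy/dsp,
# because cache_utils.py reads DSP_CACHEBOOL at import time.
os.environ["DSP_CACHEBOOL"] = "False"

# Optional: point the notebook cache at a directory you can export later.
os.environ["DSP_NOTEBOOK_CACHEDIR"] = os.path.join(os.getcwd(), "cache")

import dspy  # imported only after the environment variables are set
```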

docs/docs/quick-start/minimal-example.mdx

Lines changed: 11 additions & 3 deletions
@@ -8,7 +8,7 @@ import AuthorDetails from '@site/src/components/AuthorDetails';
 
 In this post, we walk you through a minimal working example using the DSPy library.
 
-We make use of the GSM8K dataset and the OpenAI GPT-3.5-turbo model to simulate prompting tasks within DSPy.
+We make use of the [GSM8K dataset](https://huggingface.co/datasets/gsm8k) and the OpenAI GPT-3.5-turbo model to simulate prompting tasks within DSPy.
 
 ## Setup
 
@@ -27,9 +27,17 @@ gsm8k = GSM8K()
 gsm8k_trainset, gsm8k_devset = gsm8k.train[:10], gsm8k.dev[:10]
 ```
 
+Let's take a look at what `gsm8k_trainset` and `gsm8k_devset` are:
+
+```python
+print(gsm8k_trainset)
+```
+
+The `gsm8k_trainset` and `gsm8k_devset` datasets contain a list of Examples with each example having `question` and `answer` field. We'll use these datasets to train and evaluate our model.
+
 ## Define the Module
 
-With our environment set up, let's define a custom program that utilizes the `ChainOfThought` module to perform step-by-step reasoning to generate answers:
+With our environment set up, let's define a custom program that utilizes the [`ChainOfThought`](/api/modules/ChainOfThought) module to perform step-by-step reasoning to generate answers:
 
 ```python
 class CoT(dspy.Module):
@@ -43,7 +51,7 @@ class CoT(dspy.Module):
 
 ## Compile and Evaluate the Model
 
-With our simple program in place, let's move on to optimizing it using the `BootstrapFewShotWithRandomSearch` teleprompter:
+With our simple program in place, let's move on to optimizing it using the [`BootstrapFewShot`](/api/optimizers/BootstrapFewShot) teleprompter:
 
 ```python
 from dspy.teleprompt import BootstrapFewShot
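The compile-and-evaluate step that this `BootstrapFewShot` import leads into looks roughly like the sketch below. It is not part of the diff; the hyperparameters are illustrative, `gsm8k_metric` comes from `dspy.datasets.gsm8k`, and `CoT`, `gsm8k_trainset`, and `gsm8k_devset` are the objects defined earlier in the tutorial.

```python
from dspy.datasets.gsm8k import gsm8k_metric
from dspy.evaluate import Evaluate
from dspy.teleprompt import BootstrapFewShot

# Optimize the CoT program by bootstrapping few-shot demonstrations from the trainset.
config = dict(max_bootstrapped_demos=4, max_labeled_demos=4)
teleprompter = BootstrapFewShot(metric=gsm8k_metric, **config)
optimized_cot = teleprompter.compile(CoT(), trainset=gsm8k_trainset)

# Evaluate the optimized program on the dev split.
evaluate = Evaluate(devset=gsm8k_devset, metric=gsm8k_metric, num_threads=4, display_progress=True)
evaluate(optimized_cot)
```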

docs/docs/tutorials/other_tutorial.md

Lines changed: 1 addition & 1 deletion
@@ -22,6 +22,6 @@ sidebar_position: 99999
 
 - [DSPy talk at ScaleByTheBay Nov 2023](https://www.youtube.com/watch?v=Dt3H2ninoeY).
 - [DSPy webinar with MLOps Learners](https://www.youtube.com/watch?v=im7bCLW2aM4), a bit longer with Q&A.
-- Hands-on Overviews of DSPy by the community: [DSPy Explained! by Connor Shorten](https://www.youtube.com/watch?v=41EfOY0Ldkc), [DSPy explained by code_your_own_ai](https://www.youtube.com/watch?v=ycfnKPxBMck)
+- Hands-on Overviews of DSPy by the community: [DSPy Explained! by Connor Shorten](https://www.youtube.com/watch?v=41EfOY0Ldkc), [DSPy explained by code_your_own_ai](https://www.youtube.com/watch?v=ycfnKPxBMck), [DSPy Crash Course by AI Bites](https://youtu.be/5-zgASQKkKQ?si=3gnmVouT5_rpk_nu)
 - Interviews: [Weaviate Podcast in-person](https://www.youtube.com/watch?v=CDung1LnLbY), and you can find 6-7 other remote podcasts on YouTube from a few different perspectives/audiences.
 - **Tracing in DSPy** with Arize Phoenix: [Tutorial for tracing your prompts and the steps of your DSPy programs](https://colab.research.google.com/github/Arize-ai/phoenix/blob/main/tutorials/tracing/dspy_tracing_tutorial.ipynb)

docs/docs/tutorials/rag.md

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ RAG ensures LLMs can dynamically utilize real-time knowledge even if not origina
 
 ## Configuring LM and RM
 
-We'll start by setting up the language model (LM) and retrieval model (RM), which **DSPy** supports through multiple [LM](https://dspy-docs.vercel.app/docs/category/language-model-clients) and [RM](https://dspy-docs.vercel.app/docs/category/retrieval-model-clients) APIs and [local models hosting](https://dspy-docs.vercel.app/docs/category/local-language-model-clients).
+We'll start by setting up the language model (LM) and retrieval model (RM), which **DSPy** supports through multiple [LM](https://dspy-docs.vercel.app/docs/category/language-model-clients) and [RM](https://dspy-docs.vercel.app/docs/category/retrieval-model-clients) APIs and [local models hosting](https://dspy-docs.vercel.app/docs/deep-dive/language_model_clients/local_models/HFClientTGI).
 
 In this notebook, we'll work with GPT-3.5 (`gpt-3.5-turbo`) and the `ColBERTv2` retriever (a free server hosting a Wikipedia 2017 "abstracts" search index containing the first paragraph of each article from this [2017 dump](https://hotpotqa.github.io/wiki-readme.html)). We configure the LM and RM within DSPy, allowing DSPy to internally call the respective module when needed for generation or retrieval.
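The LM/RM configuration described in that paragraph typically amounts to just a few lines; a sketch, not part of the commit, assuming an OpenAI API key is already configured in the environment:

```python
import dspy

# GPT-3.5 as the LM and the hosted ColBERTv2 Wikipedia 2017 abstracts index as the RM.
turbo = dspy.OpenAI(model='gpt-3.5-turbo')
colbertv2_wiki17_abstracts = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')

# Register both so DSPy modules can call them implicitly during generation and retrieval.
dspy.settings.configure(lm=turbo, rm=colbertv2_wiki17_abstracts)
```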
