Commit 7b1e49a

Merge pull request #815 from HARISHKUMAR1112001/add-vector-support-in-azure-ai-search
feat(dspy): add vector, hybrid and fulltext search support in azure ai search module
2 parents 40b3f49 + 4067d9a commit 7b1e49a

2 files changed, +313 -57 lines changed

docs/docs/deep-dive/retrieval_models_clients/Azure.mdx

Lines changed: 65 additions & 5 deletions
@@ -22,32 +22,59 @@ The constructor initializes an instance of the `AzureAISearchRM` class and sets
 - `search_api_key` (str): The API key for accessing the Azure AI Search service.
 - `search_index_name` (str): The name of the search index in the Azure AI Search service.
 - `field_text` (str): The name of the field containing text content in the search index. This field will be mapped to the "content" field in the dsp framework.
+- `field_vector` (Optional[str]): The name of the field containing vector content in the search index.
 - `k` (int, optional): The default number of top passages to retrieve. Defaults to 3.
+- `azure_openai_client` (Optional[openai.AzureOpenAI]): An instance of the AzureOpenAI client. Either openai_client or embedding_func must be provided. Defaults to None.
+- `openai_embed_model` (Optional[str]): The name of the OpenAI embedding model. Defaults to "text-embedding-ada-002".
+- `embedding_func` (Optional[Callable]): A function for generating embeddings. Either openai_client or embedding_func must be provided. Defaults to None.
 - `semantic_ranker` (bool, optional): Whether to use semantic ranking. Defaults to False.
 - `filter` (str, optional): Additional filter query. Defaults to None.
 - `query_language` (str, optional): The language of the query. Defaults to "en-Us".
 - `query_speller` (str, optional): The speller mode. Defaults to "lexicon".
 - `use_semantic_captions` (bool, optional): Whether to use semantic captions. Defaults to False.
 - `query_type` (Optional[QueryType], optional): The type of query. Defaults to QueryType.FULL.
 - `semantic_configuration_name` (str, optional): The name of the semantic configuration. Defaults to None.
+- `is_vector_search` (Optional[bool]): Whether to enable vector search. Defaults to False.
+- `is_hybrid_search` (Optional[bool]): Whether to enable hybrid search. Defaults to False.
+- `is_fulltext_search` (Optional[bool]): Whether to enable fulltext search. Defaults to True.
+- `vector_filter_mode` (Optional[VectorFilterMode]): The vector filter mode. Defaults to None.

-Available Query Types:

-SIMPLE
+**Available Query Types:**
+
+- SIMPLE
 """Uses the simple query syntax for searches. Search text is interpreted using a simple query
 #: language that allows for symbols such as +, * and "". Queries are evaluated across all
 #: searchable fields by default, unless the searchFields parameter is specified."""
-FULL
+- FULL
 """Uses the full Lucene query syntax for searches. Search text is interpreted using the Lucene
 #: query language which allows field-specific and weighted searches, as well as other advanced
 #: features."""
-SEMANTIC
+- SEMANTIC
 """Best suited for queries expressed in natural language as opposed to keywords. Improves
 #: precision of search results by re-ranking the top search results using a ranking model trained
 #: on the Web corpus.""

 More Details: https://learn.microsoft.com/en-us/azure/search/search-query-overview

+**Available Vector Filter Mode:**
+
+- POST_FILTER = "postFilter"
+"""The filter will be applied after the candidate set of vector results is returned. Depending on
+#: the filter selectivity, this can result in fewer results than requested by the parameter 'k'."""
+
+- PRE_FILTER = "preFilter"
+"""The filter will be applied before the search query."""
+
+More Details: https://learn.microsoft.com/en-us/azure/search/vector-search-filters
+
+**Note**
+
+- The `AzureAISearchRM` client allows you to perform Vector search, Hybrid search, or Full text search.
+- By default, the `AzureAISearchRM` client uses the Azure OpenAI Client for generating embeddings. If you want to use something else, you can provide your custom embedding_func, but either the openai_client or embedding_func must be provided.
+- If you need to enable semantic search, either with vector, hybrid, or full text search, then set the `semantic_ranker` flag to True.
+- If `semantic_ranker` is True, always set the `query_type` to QueryType.SEMANTIC and always provide the `semantic_configuration_name`.
+
 Example of the AzureAISearchRM constructor:

 ```python
@@ -56,14 +83,22 @@ AzureAISearchRM(
 search_api_key: str,
 search_index_name: str,
 field_text: str,
+field_vector: Optional[str] = None,
 k: int = 3,
+azure_openai_client: Optional[openai.AzureOpenAI] = None,
+openai_embed_model: Optional[str] = "text-embedding-ada-002",
+embedding_func: Optional[Callable] = None,
 semantic_ranker: bool = False,
 filter: str = None,
 query_language: str = "en-Us",
 query_speller: str = "lexicon",
 use_semantic_captions: bool = False,
 query_type: Optional[QueryType] = QueryType.FULL,
-semantic_configuration_name: str = None
+semantic_configuration_name: str = None,
+is_vector_search: Optional[bool] = False,
+is_hybrid_search: Optional[bool] = False,
+is_fulltext_search: Optional[bool] = True,
+vector_filter_mode: Optional[VectorFilterMode.PRE_FILTER] = None
 )
 ```
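
Reviewer sketch (not part of the commit): the Note added in the first hunk states two rules about how the new constructor arguments combine. A minimal way to express them as checks is shown here. The helper `check_search_config` and the stand-in `QueryType` enum are hypothetical and exist only for illustration; the sketch also assumes the embedding requirement applies only when vector or hybrid search is enabled, which the Note does not state explicitly.

```python
from enum import Enum
from typing import Callable, List, Optional


class QueryType(str, Enum):
    """Stand-in for the Azure SDK's QueryType, so the sketch has no dependencies."""

    SIMPLE = "simple"
    FULL = "full"
    SEMANTIC = "semantic"


def check_search_config(
    *,
    is_vector_search: bool = False,
    is_hybrid_search: bool = False,
    azure_openai_client: Optional[object] = None,
    embedding_func: Optional[Callable] = None,
    semantic_ranker: bool = False,
    query_type: QueryType = QueryType.FULL,
    semantic_configuration_name: Optional[str] = None,
) -> List[str]:
    """Return a list of rule violations; an empty list means the Note's rules hold."""
    problems: List[str] = []

    # Rule 1 (scope is an assumption): embeddings need a client or a custom function.
    if is_vector_search or is_hybrid_search:
        if azure_openai_client is None and embedding_func is None:
            problems.append("provide azure_openai_client or embedding_func")

    # Rule 2: semantic ranking needs the SEMANTIC query type and a configuration name.
    if semantic_ranker:
        if query_type is not QueryType.SEMANTIC:
            problems.append("set query_type to QueryType.SEMANTIC")
        if not semantic_configuration_name:
            problems.append("provide semantic_configuration_name")

    return problems
```

With the defaults from the constructor signature (full text search, no semantic ranker) the helper reports no problems; enabling `semantic_ranker` alone reports two.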

@@ -128,6 +163,31 @@ for result in retrieval_response:
 print("Text:", result.long_text, "\n")
 ```

+3. Example of Semantic Hybrid Search.
+
+```python
+from dspy.retrieve.azureaisearch_rm import AzureAISearchRM
+
+azure_search = AzureAISearchRM(
+search_service_name="search_service_name",
+search_api_key="search_api_key",
+search_index_name="search_index_name",
+field_text="field_text",
+field_vector="field_vector",
+k=3,
+azure_openai_client="azure_openai_client",
+openai_embed_model="text-embedding-ada-002"
+semantic_ranker=True,
+query_type=QueryType.SEMANTIC,
+semantic_configuration_name="semantic_configuration_name",
+is_hybrid_search=True,
+)
+
+retrieval_response = azure_search("What is Thermodynamics", k=3)
+for result in retrieval_response:
+print("Text:", result.long_text, "\n")
+```
+
 ***

 <AuthorDetails name="Prajapati Harishkumar Kishorkumar"/>
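
Reviewer sketch (not part of the commit): the Semantic Hybrid Search example added above would not run as written. It omits the comma after the `openai_embed_model` argument, uses `QueryType` without importing it, and passes the string `"azure_openai_client"` where the parameter list calls for an `openai.AzureOpenAI` instance. A possible corrected form is sketched here. The environment variable names, the field names `content` and `content_vector`, and the configuration name `default` are placeholders invented for this sketch, and the helper functions are hypothetical; only the `AzureAISearchRM` arguments themselves come from the diff.

```python
import os

# Placeholder setting names, chosen for this sketch only.
REQUIRED_ENV = (
    "AZURE_SEARCH_SERVICE_NAME",
    "AZURE_SEARCH_API_KEY",
    "AZURE_SEARCH_INDEX_NAME",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_API_VERSION",
)


def missing_settings(env=os.environ):
    """Return the names from REQUIRED_ENV that are unset or empty in env."""
    return [name for name in REQUIRED_ENV if not env.get(name)]


def build_semantic_hybrid_retriever(env=os.environ):
    """Build an AzureAISearchRM configured for semantic hybrid search."""
    missing = missing_settings(env)
    if missing:
        raise RuntimeError("missing settings: " + ", ".join(missing))

    # Imported lazily so this module loads even without the optional packages.
    from azure.search.documents.models import QueryType
    from openai import AzureOpenAI
    from dspy.retrieve.azureaisearch_rm import AzureAISearchRM

    client = AzureOpenAI(
        azure_endpoint=env["AZURE_OPENAI_ENDPOINT"],
        api_key=env["AZURE_OPENAI_API_KEY"],
        api_version=env["AZURE_OPENAI_API_VERSION"],
    )
    return AzureAISearchRM(
        search_service_name=env["AZURE_SEARCH_SERVICE_NAME"],
        search_api_key=env["AZURE_SEARCH_API_KEY"],
        search_index_name=env["AZURE_SEARCH_INDEX_NAME"],
        field_text="content",            # placeholder text field name
        field_vector="content_vector",   # placeholder vector field name
        k=3,
        azure_openai_client=client,      # a client object, not a string
        openai_embed_model="text-embedding-ada-002",  # comma restored here
        semantic_ranker=True,
        query_type=QueryType.SEMANTIC,
        semantic_configuration_name="default",  # placeholder configuration name
        is_hybrid_search=True,
    )
```

Once the settings are in place, the returned retriever would be called the same way as in the diff, for example `build_semantic_hybrid_retriever()("What is Thermodynamics", k=3)`.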
