Commit 97726fb

Add vectordb deployment doc (#476)
Authored by Ziqun Ye
2 parents fa938cd + 83c036f, commit 97726fb

3 files changed: +223 -1 lines changed

ads/llm/serializers/retrieval_qa.py

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ def load(config: dict, **kwargs):
3030
os.environ.get("OCI_OPENSEARCH_PASSWORD", None),
3131
),
3232
verify_certs=True
33-
if os.environ.get("OCI_OPENSEARCH_VERIFY_CERTS", None).lower() == "true"
33+
if os.environ.get("OCI_OPENSEARCH_VERIFY_CERTS", None) == "True"
3434
else False,
3535
ca_certs=os.environ.get("OCI_OPENSEARCH_CA_CERTS", None),
3636
)
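
The hunk above changes how ``OCI_OPENSEARCH_VERIFY_CERTS`` is interpreted. The following is a minimal sketch of the two expressions, using plain dictionaries in place of ``os.environ``. The function names ``old_verify_certs`` and ``new_verify_certs`` are hypothetical and exist only for this illustration. The old expression calls ``.lower()`` on the result of ``get``, which raises ``AttributeError`` when the variable is unset; the new one is an exact, case-sensitive comparison.

```python
def old_verify_certs(env):
    # Previous check: case-insensitive, but it fails when the variable is
    # unset because None has no .lower() method.
    return True if env.get("OCI_OPENSEARCH_VERIFY_CERTS", None).lower() == "true" else False


def new_verify_certs(env):
    # Updated check: only the exact string "True" enables verification;
    # an unset variable simply yields False.
    return True if env.get("OCI_OPENSEARCH_VERIFY_CERTS", None) == "True" else False


print(new_verify_certs({"OCI_OPENSEARCH_VERIFY_CERTS": "True"}))  # True
print(new_verify_certs({"OCI_OPENSEARCH_VERIFY_CERTS": "true"}))  # False
print(new_verify_certs({}))                                       # False

try:
    old_verify_certs({})
except AttributeError as exc:
    print("old check on unset variable:", type(exc).__name__)
```

One consequence is that lowercase values such as ``"true"`` no longer enable certificate verification, which is why the documentation added in this commit asks for the capitalized form.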

docs/source/user_guide/large_language_model/index.rst

Lines changed: 28 additions & 0 deletions
@@ -4,9 +4,37 @@
44
Large Language Model
55
####################
66

7+
Oracle Cloud Infrastructure (OCI) provides fully managed infrastructure to work with Large Language Models (LLMs).
8+
9+
Train and Deploy LLM
10+
********************
11+
You can train LLMs at scale across multiple nodes and multiple GPUs using `Data Science Jobs (Jobs) <https://docs.oracle.com/en-us/iaas/data-science/using/jobs-about.htm>`_, and deploy them with `Data Science Model Deployment (Model Deployments) <https://docs.oracle.com/en-us/iaas/data-science/using/model-dep-about.htm>`_. The following blog posts show examples of training and deploying Llama2 models:
12+
13+
* `Multi-GPU multinode fine-tuning Llama2 on OCI Data Science <https://blogs.oracle.com/ai-and-datascience/post/multi-gpu-multi-node-finetuning-llama2-oci>`_
14+
* `Deploy Llama 2 in OCI Data Science <https://blogs.oracle.com/ai-and-datascience/post/llama2-oci-data-science-cloud-platform>`_
15+
* `Quantize and deploy Llama 2 70B on cost-effective NVIDIA A10 Tensor Core GPUs in OCI Data Science <https://blogs.oracle.com/ai-and-datascience/post/quantize-deploy-llama2-70b-costeffective-a10s-oci>`_
16+
17+
18+
Integration with LangChain
19+
**************************
20+
ADS is designed to work with LangChain, enabling developers to incorporate various LangChain components and models deployed on OCI seamlessly into their applications. Additionally, ADS can package LangChain applications and deploy them as REST API endpoints using OCI Data Science Model Deployment.
21+
22+
23+
.. admonition:: Installation
24+
:class: note
25+
26+
Install ADS and other dependencies for LLM integrations.
27+
28+
.. code-block:: bash
29+
30+
$ python3 -m pip install "oracle-ads[llm]"
31+
32+
733
834
.. toctree::
935
:hidden:
1036
:maxdepth: 2
1137

38+
training_llm
1239
deploy_langchain_application
40+
retrieval
Lines changed: 194 additions & 0 deletions
@@ -0,0 +1,194 @@
1+
.. _vector_store:
2+
3+
########################
4+
Vector Store integration
5+
########################
6+
7+
.. versionadded:: 2.9.1
8+
9+
The current version of LangChain does not support serialization of vector stores. This is a problem when you want to use the Data Science Model Deployment service to deploy a LangChain application that has a vector store as one of its components. To solve this problem, ADS adds serialization support for the following vector stores:
10+
11+
- ``OpenSearchVectorSearch``
12+
- ``FAISS``
13+
14+
OpenSearchVectorSearch Serialization
15+
------------------------------------
16+
17+
LangChain does not support serialization of ``OpenSearchVectorSearch`` out of the box, but ADS provides a way to serialize it. To serialize ``OpenSearchVectorSearch``, pass the credentials in through environment variables. The following constructor arguments are read from the corresponding environment variables:
18+
19+
- http_auth: (``OCI_OPENSEARCH_USERNAME``, ``OCI_OPENSEARCH_PASSWORD``)
20+
- verify_certs: ``OCI_OPENSEARCH_VERIFY_CERTS``
21+
- ca_certs: ``OCI_OPENSEARCH_CA_CERTS``
22+
23+
The following code snippet shows how to use ``OpenSearchVectorSearch`` with environment variables:
24+
25+
.. code-block:: python3
26+
27+
import os

os.environ['OCI_OPENSEARCH_USERNAME'] = "username"
28+
os.environ['OCI_OPENSEARCH_PASSWORD'] = "password"
29+
os.environ['OCI_OPENSEARCH_VERIFY_CERTS'] = "False"
30+
31+
INDEX_NAME = "your_index_name"
32+
opensearch_vector_search = OpenSearchVectorSearch(
33+
"https://localhost:9200",
34+
embedding_function=oci_embedings,
35+
index_name=INDEX_NAME,
36+
engine="lucene",
37+
http_auth=(os.environ["OCI_OPENSEARCH_USERNAME"], os.environ["OCI_OPENSEARCH_PASSWORD"]),
38+
verify_certs=os.environ["OCI_OPENSEARCH_VERIFY_CERTS"] == "True",  # env values are strings; a bare "False" would be truthy
39+
)
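
On the loading side, the mapping between constructor arguments and environment variables listed above can be pictured as follows. This is a minimal sketch modeled on the ``load`` function in ``ads/llm/serializers/retrieval_qa.py`` from this commit, not the ADS implementation itself; ``opensearch_kwargs_from_env`` is a hypothetical helper name, and the optional ``env`` argument is added only so the example can run without touching the real environment.

```python
import os


def opensearch_kwargs_from_env(env=None):
    """Collect OpenSearch connection arguments from environment variables."""
    env = os.environ if env is None else env
    return {
        "http_auth": (
            env.get("OCI_OPENSEARCH_USERNAME"),
            env.get("OCI_OPENSEARCH_PASSWORD"),
        ),
        # Environment values are always strings, so compare explicitly
        # instead of relying on truthiness.
        "verify_certs": env.get("OCI_OPENSEARCH_VERIFY_CERTS") == "True",
        "ca_certs": env.get("OCI_OPENSEARCH_CA_CERTS"),
    }


example_env = {
    "OCI_OPENSEARCH_USERNAME": "username",
    "OCI_OPENSEARCH_PASSWORD": "password",
    "OCI_OPENSEARCH_VERIFY_CERTS": "False",
}
print(opensearch_kwargs_from_env(example_env))
```

Variables that are not set come back as ``None``, so ``ca_certs`` is simply absent when no CA bundle is configured.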
40+
41+
.. admonition:: Deployment
42+
:class: note
43+
44+
When you deploy, pass the same environment variables to the deployment as well:
45+
46+
.. code-block:: python3
47+
48+
model.deploy(deployment_log_group_id="ocid1.loggroup.####",
49+
deployment_access_log_id="ocid1.log.####",
50+
deployment_predict_log_id="ocid1.log.####",
51+
environment_variables={"OCI_OPENSEARCH_USERNAME": "<oci_opensearch_username>",
52+
"OCI_OPENSEARCH_PASSWORD": "<oci_opensearch_password>",
53+
"OCI_OPENSEARCH_VERIFY_CERTS": "<oci_opensearch_verify_certs>"})
54+
55+
OpenSearchVectorSearch Deployment
56+
---------------------------------
57+
58+
Here is an example code snippet for OpenSearchVectorSearch deployment:
59+
60+
.. code-block:: python3
61+
62+
from langchain.vectorstores import OpenSearchVectorSearch
63+
from ads.llm import GenerativeAIEmbeddings, GenerativeAI
64+
import ads
65+
66+
ads.set_auth("resource_principal")
67+
68+
oci_embedings = GenerativeAIEmbeddings(
69+
compartment_id="ocid1.compartment.####",
70+
client_kwargs=dict(service_endpoint="https://generativeai.aiservice.us-chicago-1.oci.oraclecloud.com") # this can be omitted after Generative AI service is GA.
71+
)
72+
73+
oci_llm = GenerativeAI(
74+
compartment_id="ocid1.compartment.####",
75+
client_kwargs=dict(service_endpoint="https://generativeai.aiservice.us-chicago-1.oci.oraclecloud.com") # this can be omitted after Generative AI service is GA.
76+
)
77+
78+
import os
79+
os.environ['OCI_OPENSEARCH_USERNAME'] = "username"
80+
os.environ['OCI_OPENSEARCH_PASSWORD'] = "password"
81+
os.environ['OCI_OPENSEARCH_VERIFY_CERTS'] = "True" # must be exactly "True" (capitalized); the loader compares against this string.
82+
os.environ['OCI_OPENSEARCH_CA_CERTS'] = "path/to/oci_opensearch_ca.pem"
83+
84+
INDEX_NAME = "your_index_name"
85+
opensearch_vector_search = OpenSearchVectorSearch(
86+
"https://localhost:9200", # your endpoint
87+
embedding_function=oci_embedings,
88+
index_name=INDEX_NAME,
89+
engine="lucene",
90+
http_auth=(os.environ["OCI_OPENSEARCH_USERNAME"], os.environ["OCI_OPENSEARCH_PASSWORD"]),
91+
verify_certs=os.environ["OCI_OPENSEARCH_VERIFY_CERTS"] == "True",  # convert the string to a bool
92+
ca_certs=os.environ["OCI_OPENSEARCH_CA_CERTS"],
93+
)
94+
from langchain.chains import RetrievalQA
95+
retriever = opensearch_vector_search.as_retriever(search_kwargs={"vector_field": "embeds",
96+
"text_field": "text",
97+
"k": 3,
98+
"size": 3},
99+
max_tokens_limit=1000)
100+
qa = RetrievalQA.from_chain_type(
101+
llm=oci_llm,
102+
chain_type="stuff",
103+
retriever=retriever,
104+
chain_type_kwargs={
105+
"verbose": True
106+
}
107+
)
108+
from ads.llm.deploy import ChainDeployment
109+
model = ChainDeployment(qa)
110+
model.prepare(force_overwrite=True,
111+
inference_conda_env="your_conda_pack",
112+
)
113+
114+
model.save()
115+
res = model.verify("your prompt")
116+
model.deploy(deployment_log_group_id="ocid1.loggroup.####",
117+
deployment_access_log_id="ocid1.log.####",
118+
deployment_predict_log_id="ocid1.log.####",
119+
environment_variables={"OCI_OPENSEARCH_USERNAME":"<oci_opensearch_username>",
120+
"OCI_OPENSEARCH_PASSWORD": "<oci_opensearch_password>",
121+
"OCI_OPENSEARCH_VERIFY_CERTS": "<oci_opensearch_verify_certs>",
122+
"OCI_OPENSEARCH_CA_CERTS": "<oci_opensearch_ca_certs>"},)
123+
124+
model.predict("your prompt")
125+
126+
127+
FAISS Serialization
128+
-------------------
129+
130+
If your documents are not too large and you don't have an OCI OpenSearch cluster, you can use ``FAISS`` as your in-memory vector store, which can also do similarity search very efficiently. ``FAISS`` needs no extra configuration for serialization: you can use it and deploy it as is.
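
As a rough picture of what an in-memory similarity search computes, the sketch below ranks stored vectors against a query by brute force in plain Python. It is not the FAISS API, and FAISS uses optimized index structures rather than a Python loop. Cosine similarity is chosen here purely for illustration and says nothing about the metric a given FAISS index is configured with; ``cosine_similarity`` and ``top_k`` are hypothetical names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, vectors, k=3):
    """Return the indices of the k stored vectors most similar to the query."""
    ranked = sorted(
        range(len(vectors)),
        key=lambda i: cosine_similarity(query, vectors[i]),
        reverse=True,
    )
    return ranked[:k]


store = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
print(top_k([1.0, 0.0], store, k=2))  # [0, 2]
```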
131+
132+
133+
FAISS Deployment
134+
----------------
135+
136+
Here is an example code snippet for FAISS deployment:
137+
138+
.. code-block:: python3
139+
140+
import ads
141+
from ads.llm import GenerativeAIEmbeddings, GenerativeAI
142+
from langchain.document_loaders import TextLoader
143+
from langchain.text_splitter import CharacterTextSplitter
144+
from langchain.vectorstores import FAISS
145+
from langchain.chains import RetrievalQA
146+
147+
ads.set_auth("resource_principal")
148+
oci_embedings = GenerativeAIEmbeddings(
149+
compartment_id="ocid1.compartment.####",
150+
client_kwargs=dict(service_endpoint="https://generativeai.aiservice.us-chicago-1.oci.oraclecloud.com") # this can be omitted after Generative AI service is GA.
151+
)
152+
153+
oci_llm = GenerativeAI(
154+
compartment_id="ocid1.compartment.####",
155+
client_kwargs=dict(service_endpoint="https://generativeai.aiservice.us-chicago-1.oci.oraclecloud.com") # this can be omitted after Generative AI service is GA.
156+
)
157+
158+
loader = TextLoader("your.txt")
159+
documents = loader.load()
160+
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=50)
161+
docs = text_splitter.split_documents(documents)
162+
163+
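batch_size = 16  # embed the chunks in batches of 16, as in the original loop
164+
embeddings = []
165+
for start in range(0, len(docs), batch_size):  # stepping by batch_size avoids an empty trailing batch
166+
    subdocs = [item.page_content for item in docs[start: start + batch_size]]
167+
    embeddings.extend(oci_embedings.embed_documents(subdocs))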
168+
169+
texts = [item.page_content for item in docs]
170+
text_embedding_pairs = list(zip(texts, embeddings))
171+
db = FAISS.from_embeddings(text_embedding_pairs, oci_embedings)
172+
173+
retriever = db.as_retriever()
174+
qa = RetrievalQA.from_chain_type(
175+
llm=oci_llm,
176+
chain_type="stuff",
177+
retriever=retriever,
178+
chain_type_kwargs={
179+
"verbose": True
180+
}
181+
)
182+
183+
from ads.llm.deploy import ChainDeployment
model = ChainDeployment(qa)
184+
model.prepare(force_overwrite=True,
185+
inference_conda_env="your_conda_pack",
186+
)
187+
188+
model.save()
189+
res = model.verify("your prompt")
190+
model.deploy(deployment_log_group_id="ocid1.loggroup.####",
191+
deployment_access_log_id="ocid1.log.####",
192+
deployment_predict_log_id="ocid1.log.####")
193+
194+
model.predict("your prompt")
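
The batching loop in the FAISS example above can also be factored into a small reusable helper. The sketch below is one possible shape, not part of ADS or LangChain: ``batched`` and ``embed_in_batches`` are hypothetical names, the default of 16 simply mirrors the batch size used in the example, and ``fake_embed`` is a stand-in so the snippet runs without a Generative AI endpoint.

```python
def batched(items, batch_size):
    """Yield consecutive slices of ``items`` with at most ``batch_size`` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_in_batches(texts, embed_documents, batch_size=16):
    """Call ``embed_documents`` once per batch and concatenate the results."""
    embeddings = []
    for batch in batched(texts, batch_size):
        embeddings.extend(embed_documents(batch))
    return embeddings


def fake_embed(batch):
    # Stand-in embedder for demonstration only: one 1-d vector per text.
    return [[float(len(text))] for text in batch]


texts = [f"chunk {i}" for i in range(40)]
vectors = embed_in_batches(texts, fake_embed)
print(len(vectors))  # 40
```

With the objects from the example, a call such as ``embed_in_batches([d.page_content for d in docs], oci_embedings.embed_documents)`` would produce the list that is zipped with the texts before calling ``FAISS.from_embeddings``.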
