.. Commit b0dd257: Merge infrastructure and runtime page

Infrastructure and Runtime
**************************

This page describes the **Infrastructure** and **Runtime** configurations that define a Data Science Job.

.. include:: ../jobs/components/toc_local.rst

Example
=======

The following example configures the infrastructure and runtime to run a Python script.

.. include:: ../jobs/tabs/quick_start_job.rst
15+
Infrastructure
==============

The Data Science Job infrastructure is defined by a :py:class:`~ads.jobs.DataScienceJob` instance.
For example:

.. code-block:: python3

    from ads.jobs import DataScienceJob

    infrastructure = (
        DataScienceJob()
        .with_compartment_id("<compartment_ocid>")
        .with_project_id("<project_ocid>")
        .with_subnet_id("<subnet_ocid>")
        .with_shape_name("VM.Standard.E3.Flex")
        # Shape config details are applicable only for the flexible shapes.
        .with_shape_config_details(memory_in_gbs=16, ocpus=1)
        # The minimum block storage size is 50 (GB).
        .with_block_storage_size(50)
        .with_log_group_id("<log_group_ocid>")
        .with_log_id("<log_ocid>")
    )

When creating a :py:class:`~ads.jobs.DataScienceJob` instance, the following configurations are required:

* Compartment ID
* Project ID
* Compute Shape

The following configurations are optional:

* Block Storage Size, defaults to 50 (GB)
* Log Group ID
* Log ID

For more details about the mandatory and optional parameters, see :py:class:`~ads.jobs.DataScienceJob`.
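The same infrastructure can also be expressed in YAML. The following is a sketch assuming the camel-cased field names of the ADS YAML schema (``compartmentId``, ``shapeConfigDetails``, and so on); see :py:class:`~ads.jobs.DataScienceJob` for the authoritative schema:

.. code-block:: yaml

    kind: infrastructure
    type: dataScienceJob
    spec:
      compartmentId: <compartment_ocid>
      projectId: <project_ocid>
      subnetId: <subnet_ocid>
      shapeName: VM.Standard.E3.Flex
      shapeConfigDetails:
        memoryInGBs: 16
        ocpus: 1
      blockStorageSize: 50
      logGroupId: <log_group_ocid>
      logId: <log_ocid>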
Using Configurations from Notebook
----------------------------------

If you are creating a job from an OCI Data Science
`Notebook Session <https://docs.oracle.com/en-us/iaas/data-science/using/manage-notebook-sessions.htm>`_,
the same infrastructure configurations from the notebook session will be used as defaults.
You can initialize the :py:class:`~ads.jobs.DataScienceJob`
with the logging configurations and override the other options as needed. For example:

.. code-block:: python3

    from ads.jobs import DataScienceJob

    infrastructure = (
        DataScienceJob()
        .with_log_group_id("<log_group_ocid>")
        .with_log_id("<log_ocid>")
        # Use a GPU shape for the job,
        # regardless of the shape used by the notebook session.
        .with_shape_name("VM.GPU2.1")
        # The compartment ID, project ID, subnet ID, and block storage size
        # will be the same as the ones set in the notebook session.
    )
Compute Shapes
--------------

The :py:class:`~ads.jobs.DataScienceJob` class provides two static methods to obtain the supported compute shapes:

* You can get a list of currently supported compute shapes by calling
  :py:meth:`~ads.jobs.DataScienceJob.instance_shapes`.
* You can get a list of shapes that are available for fast launch by calling
  :py:meth:`~ads.jobs.DataScienceJob.fast_launch_shapes`.

Specifying a fast launch shape allows your job to start as quickly as possible.
Networking
----------

Data Science Jobs offer two types of networking: default networking (managed egress) and custom networking.
Default networking allows job runs to access the public internet through a NAT gateway and OCI services through
a service gateway, both of which are configured automatically. Custom networking requires you to specify a subnet ID.
You can control the network access through the subnet and security lists.

If you specify a subnet ID, your job will be configured to use custom networking.
Otherwise, default networking is used. Note that when you are in a Data Science Notebook Session,
the same networking configuration is used by default.
You can specify the networking type manually by calling :py:meth:`~ads.jobs.DataScienceJob.with_job_infrastructure_type()`.
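For example, default networking (managed egress) can be requested explicitly in YAML. This is a sketch assuming the ``jobInfrastructureType`` field accepts the value ``ME_STANDALONE`` for managed egress; when a ``subnetId`` is present, custom networking is used instead:

.. code-block:: yaml

    kind: infrastructure
    type: dataScienceJob
    spec:
      compartmentId: <compartment_ocid>
      projectId: <project_ocid>
      shapeName: VM.Standard.E3.Flex
      # Default networking (managed egress); no subnetId is specified.
      jobInfrastructureType: ME_STANDALONE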
Logging
-------

Logging is not required to create a job.
However, it is highly recommended that you enable logging for debugging and monitoring.

In the preceding example, both the log OCID and the corresponding log group OCID are specified
with the :py:class:`~ads.jobs.DataScienceJob` instance.
If your administrator configured the permission for you to search for logging resources,
you can skip specifying the log group OCID because ADS can automatically retrieve it.

If you specify only the log group OCID and no log OCID,
a new Log resource is automatically created within the log group to store the logs.
See also `ADS Logging <../logging/logging.html>`_.
Runtime
=======

The *runtime* of a job defines the source code of your workload, environment variables, CLI arguments,
and other configurations for the environment in which the workload runs.

Depending on the source code, ADS provides different types of *runtime* for defining a data science job,
including:

.. include:: ../jobs/components/runtime_types.rst
Environment Variables
---------------------

You can set environment variables for a runtime by calling
:py:meth:`~ads.jobs.PythonRuntime.with_environment_variable()`.
Environment variables enclosed by ``${...}`` will be substituted. For example:

.. include:: ../jobs/tabs/runtime_envs.rst

.. code-block:: python3

    for k, v in runtime.environment_variables.items():
        print(f"{k}: {v}")

will show the following environment variables for the runtime:

.. code-block:: text

    HOST: 10.0.0.1
    PORT: 443
    URL: http://10.0.0.1:443/path/
    ESCAPED_URL: http://${HOST}:${PORT}/path/
    MISSING_VAR: This is ${UNDEFINED}
    VAR_WITH_DOLLAR: $10
    DOUBLE_DOLLAR: $10

Note that:

* You can use ``$$`` to escape the substitution.
* An undefined variable enclosed by ``${...}`` is left unchanged.
* Double dollar signs ``$$`` are substituted by a single dollar sign ``$``.
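The substitution rules above match the semantics of Python's ``string.Template`` with ``safe_substitute``, which you can use to preview how a value will be rendered. This is an illustrative sketch of the documented rules, not the ADS implementation:

```python
from string import Template

# Variables assumed to be defined on the runtime.
defined = {"HOST": "10.0.0.1", "PORT": "443"}

# ${...} references to defined variables are substituted.
url = Template("http://${HOST}:${PORT}/path/").safe_substitute(defined)
print(url)  # http://10.0.0.1:443/path/

# $$ escapes substitution: $${HOST} renders as the literal ${HOST}.
escaped = Template("http://$${HOST}:$${PORT}/path/").safe_substitute(defined)
print(escaped)  # http://${HOST}:${PORT}/path/

# An undefined variable is left unchanged rather than raising an error.
missing = Template("This is ${UNDEFINED}").safe_substitute(defined)
print(missing)  # This is ${UNDEFINED}
```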
See also:
`Service Provided Environment Variables <https://docs.oracle.com/en-us/iaas/data-science/using/jobs-env-vars.htm>`_

.. _runtime_args:

Command Line Arguments
----------------------

The command line arguments for running your script or function can be configured by calling
:py:meth:`~ads.jobs.PythonRuntime.with_argument()`. For example:

.. tabs::

   .. code-tab:: python
      :caption: Python

      from ads.jobs import PythonRuntime

      runtime = (
          PythonRuntime()
          .with_source("oci://bucket_name@namespace/path/to/script.py")
          .with_argument(
              "arg1", "arg2",
              key1="val1",
              key2="val2"
          )
      )

   .. code-tab:: yaml
      :caption: YAML

      kind: runtime
      type: python
      spec:
        scriptPathURI: oci://bucket_name@namespace/path/to/script.py
        args:
        - arg1
        - arg2
        - --key1
        - val1
        - --key2
        - val2

will configure the job to call your script as:

.. code-block:: bash

    python script.py arg1 arg2 --key1 val1 --key2 val2

You can call :py:meth:`~ads.jobs.PythonRuntime.with_argument()` multiple times to set the arguments
in your desired order. You can check ``runtime.args`` to see the added arguments.
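The mapping from positional and keyword arguments to command line tokens can be sketched in plain Python. The ``build_args`` helper below is hypothetical, written only to illustrate the flattening shown in the YAML tab; it is not part of the ADS API:

```python
def build_args(*args, **kwargs):
    """Flatten positional and keyword arguments into CLI tokens:
    positional args stay in order, each key=val becomes --key val."""
    tokens = list(args)
    for key, val in kwargs.items():
        tokens.extend([f"--{key}", val])
    return tokens

tokens = build_args("arg1", "arg2", key1="val1", key2="val2")
print(" ".join(["python", "script.py"] + tokens))
# python script.py arg1 arg2 --key1 val1 --key2 val2
```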
Here are a few more examples:

.. include:: ../jobs/tabs/runtime_args.rst
Conda Environment
-----------------

Except for :py:class:`~ads.jobs.ContainerRuntime`,
all the other runtime options allow you to configure a
`Conda Environment <https://docs.oracle.com/en-us/iaas/data-science/using/conda_understand_environments.htm>`_
for your workload. You can use the slug name to specify a
`conda environment provided by the data science service
<https://docs.oracle.com/en-us/iaas/data-science/using/conda_viewing.htm#conda-dsenvironments>`_.
For example, to use the TensorFlow conda environment:

.. include:: ../jobs/tabs/runtime_service_conda.rst

You can also use a custom conda environment published to OCI Object Storage
by passing the ``uri`` to :py:meth:`~ads.jobs.PythonRuntime.with_custom_conda`,
for example:

.. include:: ../jobs/tabs/runtime_custom_conda.rst

By default, ADS will try to determine the region based on the authenticated API key or resource principal.
If your custom conda environment is stored in a different region,
you can specify the ``region`` when calling :py:meth:`~ads.jobs.PythonRuntime.with_custom_conda`.

For more details on custom conda environments, see
`Publishing a Conda Environment to an Object Storage Bucket in Your Tenancy
<https://docs.oracle.com/en-us/iaas/data-science/using/conda_publishs_object.htm>`__.
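In YAML, a published conda environment can be expressed roughly as follows. This is a sketch assuming the ``conda`` block of the runtime spec takes ``type``, ``uri``, and ``region`` keys; all values are placeholders:

.. code-block:: yaml

    kind: runtime
    type: python
    spec:
      scriptPathURI: oci://bucket_name@namespace/path/to/script.py
      conda:
        type: published
        uri: oci://bucket_name@namespace/path/to/conda_pack
        region: <region_identifier>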
Override Configurations
-----------------------

When you call :py:meth:`ads.jobs.Job.run`, a new job run is started with the configuration defined in the **job**.
You may want to override the configuration with custom variables. For example,
you can customize the job run display name, override command line arguments, specify additional environment variables,
and add freeform tags:

.. code-block:: python3

    job_run = job.run(
        name="<my_job_run_name>",
        args="new_arg --new_key new_val",
        env_var={"new_env": "new_val"},
        freeform_tags={"new_tag": "new_tag_val"}
    )
