Commit 093c986

Update data_science_job.rst and run_python.rst
1 parent aad957e commit 093c986

4 files changed: +340 −341 lines


docs/source/user_guide/jobs/data_science_job.rst

Lines changed: 29 additions & 99 deletions
@@ -6,109 +6,20 @@ Quick Start
 Before creating a job, ensure that you have policies configured for Data Science resources.
 See also: :doc:`policies` and `About Data Science Policies <https://docs.oracle.com/en-us/iaas/data-science/using/policies.htm>`_.
 
-In ADS, a job is defined by :doc:`infrastructure` and :doc:`runtime`.
-The Data Science Job infrastructure is configured through a :py:class:`~ads.jobs.DataScienceJob` instance.
-The runtime can be an instance of :py:class:`~ads.jobs.PythonRuntime`,
-:py:class:`~ads.jobs.GitPythonRuntime`,
-:py:class:`~ads.jobs.NotebookRuntime`,
-:py:class:`~ads.jobs.ScriptRuntime`, or
-:py:class:`~ads.jobs.ContainerRuntime`.
-
+.. include:: ../jobs/components/toc_local.rst
 
 Create and Run a Job
 ====================
 
+In ADS, a job is defined by :doc:`infrastructure` and :doc:`runtime`.
+The Data Science Job infrastructure is configured through a :py:class:`~ads.jobs.DataScienceJob` instance.
+The runtime can be an instance of:
+
+.. include:: ../jobs/components/runtime_types.rst
+
 Here is an example to define and run a Python :py:class:`~ads.jobs.Job`:
 
-.. tabs::
-
-  .. code-tab:: python
-    :caption: Python
-
-    from ads.jobs import Job, DataScienceJob, PythonRuntime
-
-    job = (
-        Job(name="My Job")
-        .with_infrastructure(
-            DataScienceJob()
-            .with_log_group_id("<log_group_ocid>")
-            .with_log_id("<log_ocid>")
-            # The following infrastructure configurations are optional
-            # if you are in an OCI data science notebook session.
-            # The configurations of the notebook session will be used as defaults.
-            .with_compartment_id("<compartment_ocid>")
-            .with_project_id("<project_ocid>")
-            # For default networking, there is no need to specify a subnet ID.
-            .with_subnet_id("<subnet_ocid>")
-            .with_shape_name("VM.Standard.E3.Flex")
-            # Shape config details are applicable only to the flexible shapes.
-            .with_shape_config_details(memory_in_gbs=16, ocpus=1)
-            .with_block_storage_size(50)
-        )
-        .with_runtime(
-            PythonRuntime()
-            # Specify the service conda environment by slug name.
-            .with_service_conda("pytorch19_p37_cpu_v1")
-            # The job artifact can be a single Python script, a directory, or a zip file.
-            .with_source("local/path/to/code_dir")
-            # Set the working directory.
-            # When using a directory as the source, the default working dir is the parent of code_dir.
-            # The working dir should be a relative path beginning from the source directory (code_dir).
-            .with_working_dir("code_dir")
-            # The entrypoint is applicable only when the source is a directory or zip file.
-            # The entrypoint should be a path relative to the working dir.
-            # Here my_script.py is a file in the code_dir/my_package directory.
-            .with_entrypoint("my_package/my_script.py")
-            # Add an additional Python path, relative to the working dir (code_dir/other_packages).
-            .with_python_path("other_packages")
-            # Copy files in "code_dir/output" to object storage after the job finishes.
-            .with_output("output", "oci://bucket_name@namespace/path/to/dir")
-        )
-    )
-
-    # Create the job on OCI Data Science
-    job.create()
-    # Start a job run
-    run = job.run()
-    # Stream the job run outputs
-    run.watch()
-
-  .. code-tab:: yaml
-    :caption: YAML
-
-    kind: job
-    spec:
-      name: "My Job"
-      infrastructure:
-        kind: infrastructure
-        type: dataScienceJob
-        spec:
-          blockStorageSize: 50
-          compartmentId: <compartment_ocid>
-          jobInfrastructureType: STANDALONE
-          jobType: DEFAULT
-          logGroupId: <log_group_ocid>
-          logId: <log_ocid>
-          projectId: <project_ocid>
-          shapeConfigDetails:
-            memoryInGBs: 16
-            ocpus: 1
-          shapeName: VM.Standard.E3.Flex
-          subnetId: <subnet_ocid>
-      runtime:
-        kind: runtime
-        type: python
-        spec:
-          conda:
-            slug: pytorch19_p37_cpu_v1
-            type: service
-          entrypoint: my_package/my_script.py
-          outputDir: output
-          outputUri: oci://bucket_name@namespace/path/to/dir
-          pythonPath:
-            - other_packages
-          scriptPathURI: local/path/to/code_dir
-          workingDir: code_dir
+.. include:: ../jobs/tabs/python_runtime.rst
 
 For more details, see the :doc:`infrastructure` and :doc:`runtime` configurations.

@@ -140,7 +51,7 @@ Here is an example of the logs:
 YAML
 ====
 
-A job can also be defined using YAML, as shown in the "YAML" tab.
+A job can be defined using YAML, as shown in the "YAML" tab.
 Here are some examples to load/save the YAML job configurations:
 
 .. code-block:: python
@@ -160,7 +71,7 @@ Here are some examples to load/save the YAML job configurations:
     infrastructure:
       kind: infrastructure
     ...
-"""")
+""")
 
 The ``uri`` can be a local file path or a remote location supported by
 `fsspec <https://filesystem-spec.readthedocs.io/en/latest/>`_, including OCI object storage.
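For reference, the code block above parses a YAML string into a job. A minimal sketch of the kind of spec it accepts, trimmed from the full example earlier in this diff (OCIDs are placeholders, not a complete configuration):

```yaml
kind: job
spec:
  name: "My Job"
  infrastructure:
    kind: infrastructure
    type: dataScienceJob
    spec:
      compartmentId: <compartment_ocid>
      projectId: <project_ocid>
  runtime:
    kind: runtime
    type: python
    spec:
      scriptPathURI: local/path/to/code_dir
```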
@@ -173,6 +84,8 @@ With the YAML file, you can create and run the job with ADS CLI:
 
 For more details on ``ads opctl``, see :doc:`../cli/opctl/_template/jobs`.
 
+The job infrastructure, runtime and job run also support YAML serialization/deserialization.
+
 
 Loading Existing Job or Job Run
 ===============================
@@ -226,3 +139,20 @@ You can also cancel a job run:
 .. code-block:: python
 
     run.cancel()
+
+
+Variable Substitution
+=====================
+
+When defining a job or starting a job run,
+you can use environment variable substitution for the job name and for the ``output_uri`` argument of
+the :py:meth:`~ads.jobs.PythonRuntime.with_output` method.
+
+For example, the following job specifies its name based on the environment variable ``DATASET_NAME``,
+and its ``output_uri`` based on the environment variable ``JOB_RUN_OCID``:
+
+.. include:: ../jobs/tabs/name_substitution.rst
+
+Note that ``JOB_RUN_OCID`` is an environment variable provided by the service after the job run is created.
+It is available for the ``output_uri`` but cannot be used in the job name.
+See also :ref:`Saving Outputs <runtime_outputs>`.
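As an illustration of the substitution idea only (using Python's standard library, not the ADS implementation — confirm the exact syntax against the included ``name_substitution.rst`` tab), shell-style ``$VAR`` references resolve from the environment, and variables that are not yet set, such as ``JOB_RUN_OCID`` before the run exists, are left untouched:

```python
import os
from string import Template

# Illustration only: shell-style $VAR expansion, similar in spirit to the
# substitution described above. DATASET_NAME is a hypothetical variable.
os.environ["DATASET_NAME"] = "sales"

# The job name resolves immediately from the environment.
job_name = Template("job-for-$DATASET_NAME").substitute(os.environ)

# JOB_RUN_OCID is not set yet, so safe_substitute leaves it in place;
# the service fills it in once the job run is created.
output_uri = Template(
    "oci://bucket_name@namespace/path/to/dir/$JOB_RUN_OCID"
).safe_substitute({})

print(job_name)    # job-for-sales
print(output_uri)  # oci://bucket_name@namespace/path/to/dir/$JOB_RUN_OCID
```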
