Commit 413a3a9

committed
Update run_notebook.rst
1 parent 626d1b5 commit 413a3a9

2 files changed

+121
-137
lines changed

Lines changed: 36 additions & 137 deletions
Original file line numberDiff line numberDiff line change
@@ -1,154 +1,53 @@
1-
.. _job_run_a_notebook:
2-
31
Run a Notebook
42
**************
53

6-
In some cases, you may want to run an existing JupyterLab notebook as a job. You can do this using the ``NotebookRuntime()`` object.
7-
8-
The next example shows you how to run the `TensorFlow 2 quick start for beginner <https://github.com/tensorflow/docs/blob/master/site/en/tutorials/quickstart/beginner.ipynb>`__ notebook from the internet and save the results to OCI Object Storage. The notebook path points to the raw file link from GitHub. To run the following example, ensure that you have internet access to retrieve the notebook:
9-
10-
Python
11-
======
12-
13-
.. code-block:: python3
14-
15-
from ads.jobs import Job, DataScienceJob, NotebookRuntime
16-
17-
job = (
18-
Job()
19-
.with_infrastructure(
20-
DataScienceJob()
21-
.with_log_group_id("<log_group_ocid>")
22-
.with_log_id("<log_ocid>")
23-
# The following infrastructure configurations are optional
24-
# if you are in an OCI data science notebook session.
25-
# The configurations of the notebook session will be used as defaults
26-
.with_compartment_id("<compartment_ocid>")
27-
.with_project_id("<project_ocid>")
28-
.with_subnet_id("<subnet_ocid>")
29-
.with_shape_name("VM.Standard.E3.Flex")
30-
.with_shape_config_details(memory_in_gbs=16, ocpus=1) # Applicable only for the flexible shapes
31-
.with_block_storage_size(50)
32-
)
33-
.with_runtime(
34-
NotebookRuntime()
35-
.with_notebook(
36-
path="https://raw.githubusercontent.com/tensorflow/docs/master/site/en/tutorials/customization/basics.ipynb",
37-
encoding='utf-8'
38-
)
39-
.with_service_conda("tensorflow28_p38_cpu_v1")
40-
.with_environment_variable(GREETINGS="Welcome to OCI Data Science")
41-
.with_output("oci://bucket_name@namespace/path/to/dir")
42-
)
43-
)
44-
45-
job.create()
46-
run = job.run().watch()
47-
48-
After the notebook finishes running, the notebook with results is saved to ``oci://bucket_name@namespace/path/to/dir``. You can download the output by calling the ``download()`` method.
4+
The :py:class:`~ads.jobs.NotebookRuntime` allows you to run a single Jupyter notebook as a job.
495

50-
.. code-block:: python3
6+
If your notebook needs extra dependencies, such as custom modules or data files, you can use
7+
:py:class:`~ads.jobs.PythonRuntime` or :py:class:`~ads.jobs.GitPythonRuntime` and set your notebook as the entrypoint.
518

52-
run.download("/path/to/local/dir")
9+
See also:
5310

54-
The ``NotebookRuntime`` also allows you to use exclusion tags, which lets you exclude cells from a job run. For example, you could use these tags to do exploratory data analysis, and then train and evaluate your model in a notebook. Then you could use that same notebook to only build future models that are trained on a different dataset. So the job run only has to execute the cells that are related to training the model, and not the exploratory data analysis or model evaluation.
11+
* :doc:`run_python`
12+
* :doc:`run_git`
5513

56-
You tag the cells in the notebook, and then specify the tags using the ``.with_exclude_tag()`` method. Cells with any matching tags are excluded from the job run. For example, if you tagged cells with ``ignore`` and ``remove``, you can pass in a list of the two tags to the method and those cells are excluded from the code that is executed as part of the job run. To tag cells in a notebook, see `Adding tags using notebook interfaces <https://jupyterbook.org/content/metadata.html#adding-tags-using-notebook-interfaces>`__.
14+
TensorFlow Example
15+
==================
5716

58-
.. code-block:: python3
17+
The following example shows you how to run the
18+
`TensorFlow 2 quickstart for beginners
19+
<https://github.com/tensorflow/docs/blob/master/site/en/tutorials/quickstart/beginner.ipynb>`_
20+
notebook from the internet and save the results to OCI Object Storage.
21+
The notebook path points to the raw file link from GitHub.
22+
To run the example, ensure that you have internet access to retrieve the notebook:
5923

60-
job.with_runtime(
61-
NotebookRuntime()
62-
.with_notebook("path/to/notebook")
63-
.with_exclude_tag(["ignore", "remove"])
64-
)
24+
.. include:: ../jobs/tabs/notebook_runtime.rst
6525

66-
YAML
67-
====
26+
Working Directory
27+
=================
6828

69-
You could use the following YAML to create the job:
29+
An empty directory in the job run will be created as the working directory for running the notebook.
30+
All relative paths used in the notebook will be based on the working directory.
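
The working-directory behavior described above can be imitated locally. The following is a minimal sketch using only the Python standard library; ``run_in_empty_workdir`` and ``notebook_like_task`` are hypothetical names used for illustration and are not part of ADS, and the real job run may set up its directory differently.

```python
import os
import tempfile
from pathlib import Path


def run_in_empty_workdir(task):
    """Run task() inside a fresh, empty working directory.

    Hypothetical helper for illustration only; it is not part of ADS.
    Returns the names of the files that task() created.
    """
    original_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as workdir:
        os.chdir(workdir)
        try:
            task()
            # Relative paths used by the task resolve against workdir.
            return sorted(p.name for p in Path(workdir).iterdir())
        finally:
            # Leave the directory before it is cleaned up.
            os.chdir(original_cwd)


def notebook_like_task():
    # A relative path, written the way a notebook cell might write it.
    Path("results.csv").write_text("accuracy,0.97\n")


created = run_in_empty_workdir(notebook_like_task)
print(created)
```

Because the directory starts empty, every file found in it afterwards was produced by the run itself.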
7031

71-
.. code-block:: yaml
32+
Download the Outputs
33+
====================
7234

73-
kind: job
74-
spec:
75-
infrastructure:
76-
kind: infrastructure
77-
type: dataScienceJob
78-
spec:
79-
jobInfrastructureType: STANDALONE
80-
jobType: DEFAULT
81-
logGroupId: <log_group_id>
82-
logId: <log.id>
83-
runtime:
84-
kind: runtime
85-
type: notebook
86-
spec:
87-
notebookPathURI: /path/to/notebook
88-
conda:
89-
slug: tensorflow28_p38_cpu_v1
90-
type: service
35+
If you specify the output location using :py:meth:`~ads.jobs.NotebookRuntime.with_output`,
36+
all files in the working directory, including the notebook with outputs,
37+
will be saved to the output location (``oci://bucket_name@namespace/path/to/dir``) after the job finishes running.
38+
You can download the output by calling the ``download()`` method of the job run.
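
The output location follows the ``oci://<bucket>@<namespace>/<path>`` pattern. As a rough illustration of how such a URI breaks down, here is a small standard-library sketch; ``parse_oci_uri`` is a hypothetical helper written for this example, not an ADS function.

```python
from urllib.parse import urlparse


def parse_oci_uri(uri):
    """Split an oci://bucket@namespace/prefix URI into its parts.

    Hypothetical helper for illustration only; it is not part of ADS.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "oci" or "@" not in parsed.netloc:
        raise ValueError(f"Not an OCI Object Storage URI: {uri}")
    bucket, namespace = parsed.netloc.split("@", 1)
    return {
        "bucket": bucket,
        "namespace": namespace,
        "prefix": parsed.path.lstrip("/"),
    }


location = parse_oci_uri("oci://bucket_name@namespace/path/to/dir")
print(location)
```

Checking the URI this way before creating a job can catch a swapped bucket and namespace early.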
9139

92-
**NotebookRuntime Schema**
40+
Exclude Cells
41+
=============
9342

94-
.. code-block:: yaml
43+
The :py:class:`~ads.jobs.NotebookRuntime` also allows you to specify tags to exclude cells from being processed
44+
in a job run using the :py:meth:`~ads.jobs.NotebookRuntime.with_exclude_tag` method.
45+
For example, you could do exploratory data analysis and visualization in a notebook,
46+
and you may want to exclude the visualization when running the notebook in a job.
9547

96-
kind:
97-
required: true
98-
type: string
99-
allowed:
100-
- runtime
101-
type:
102-
required: true
103-
type: string
104-
allowed:
105-
- notebook
106-
spec:
107-
required: true
108-
type: dict
109-
schema:
110-
excludeTags:
111-
required: false
112-
type: list
113-
notebookPathURI:
114-
required: false
115-
type: string
116-
notebookEncoding:
117-
required: false
118-
type: string
119-
outputUri:
120-
required: false
121-
type: string
122-
args:
123-
nullable: true
124-
required: false
125-
type: list
126-
schema:
127-
type: string
128-
conda:
129-
nullable: false
130-
required: false
131-
type: dict
132-
schema:
133-
slug:
134-
required: true
135-
type: string
136-
type:
137-
required: true
138-
type: string
139-
allowed:
140-
- service
141-
env:
142-
nullable: true
143-
required: false
144-
type: list
145-
schema:
146-
type: dict
147-
schema:
148-
name:
149-
type: string
150-
value:
151-
type:
152-
- number
153-
- string
48+
To tag cells in a notebook, see
49+
`Adding tags using notebook interfaces <https://jupyterbook.org/content/metadata.html#adding-tags-using-notebook-interfaces>`__.
15450

51+
The :py:meth:`~ads.jobs.NotebookRuntime.with_exclude_tag` method takes a list of tags as its argument.
52+
Cells with any matching tags are excluded from the job run.
53+
In the above example, cells tagged with ``ignore`` or ``remove`` are excluded.
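
To make the matching rule concrete, the sketch below filters cells in a notebook's JSON structure, where tags live under each cell's ``metadata.tags``. It only illustrates "any matching tag excludes the cell" and is not how ADS implements it; ``exclude_tagged_cells`` is a hypothetical helper, and the sample notebook is made up.

```python
def exclude_tagged_cells(notebook, exclude_tags):
    """Return a copy of the notebook dict without cells carrying any excluded tag.

    Hypothetical helper for illustration only; it is not part of ADS.
    """
    excluded = set(exclude_tags)
    kept = [
        cell
        for cell in notebook.get("cells", [])
        # A cell is kept only if none of its tags are in the excluded set.
        if excluded.isdisjoint(cell.get("metadata", {}).get("tags", []))
    ]
    return {**notebook, "cells": kept}


# A made-up notebook with three code cells, two of them tagged.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "code", "metadata": {"tags": ["ignore"]}, "source": "df.describe()"},
        {"cell_type": "code", "metadata": {}, "source": "model.fit(x, y)"},
        {"cell_type": "code", "metadata": {"tags": ["plot", "remove"]}, "source": "df.plot()"},
    ],
}

filtered = exclude_tagged_cells(notebook, ["ignore", "remove"])
print([cell["source"] for cell in filtered["cells"]])
```

Note that the third cell is dropped even though only one of its two tags is in the exclusion list.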
Lines changed: 85 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,85 @@
1+
.. tabs::
2+
3+
.. code-tab:: python
4+
:caption: Python
5+
6+
from ads.jobs import Job, DataScienceJob, NotebookRuntime
7+
8+
job = (
9+
Job(name="My Job")
10+
.with_infrastructure(
11+
DataScienceJob()
12+
.with_log_group_id("<log_group_ocid>")
13+
.with_log_id("<log_ocid>")
14+
# The following infrastructure configurations are optional
15+
# if you are in an OCI data science notebook session.
16+
# The configurations of the notebook session will be used as defaults.
17+
.with_compartment_id("<compartment_ocid>")
18+
.with_project_id("<project_ocid>")
19+
# For default networking, no need to specify subnet ID
20+
.with_subnet_id("<subnet_ocid>")
21+
.with_shape_name("VM.Standard.E3.Flex")
22+
# Shape config details are applicable only for the flexible shapes.
23+
.with_shape_config_details(memory_in_gbs=16, ocpus=1)
24+
.with_block_storage_size(50)
25+
)
26+
.with_runtime(
27+
NotebookRuntime()
28+
.with_notebook(
29+
path="https://raw.githubusercontent.com/tensorflow/docs/master/site/en/tutorials/customization/basics.ipynb",
30+
encoding='utf-8'
31+
)
32+
.with_service_conda("tensorflow28_p38_cpu_v1")
33+
.with_environment_variable(GREETINGS="Welcome to OCI Data Science")
34+
.with_exclude_tag(["ignore", "remove"])
35+
.with_output("oci://bucket_name@namespace/path/to/dir")
36+
)
37+
)
38+
39+
# Create the job on OCI Data Science
40+
job.create()
41+
# Start a job run
42+
run = job.run()
43+
# Stream the job run outputs
44+
run.watch()
45+
# Download the notebook and other outputs to a local directory
46+
run.download("/path/to/local/dir")
47+
48+
.. code-tab:: yaml
49+
:caption: YAML
50+
51+
kind: job
52+
spec:
53+
name: "My Job"
54+
infrastructure:
55+
kind: infrastructure
56+
type: dataScienceJob
57+
spec:
58+
blockStorageSize: 50
59+
compartmentId: <compartment_ocid>
60+
jobInfrastructureType: STANDALONE
61+
jobType: DEFAULT
62+
logGroupId: <log_group_ocid>
63+
logId: <log_ocid>
64+
projectId: <project_ocid>
65+
shapeConfigDetails:
66+
memoryInGBs: 16
67+
ocpus: 1
68+
shapeName: VM.Standard.E3.Flex
69+
subnetId: <subnet_ocid>
70+
runtime:
71+
kind: runtime
72+
type: notebook
73+
spec:
74+
conda:
75+
slug: tensorflow28_p38_cpu_v1
76+
type: service
77+
env:
78+
- name: GREETINGS
79+
value: Welcome to OCI Data Science
80+
excludeTags:
81+
- ignore
82+
- remove
83+
notebookEncoding: utf-8
84+
notebookPathURI: https://raw.githubusercontent.com/tensorflow/docs/master/site/en/tutorials/customization/basics.ipynb
85+
outputUri: oci://bucket_name@namespace/path/to/dir
