3838
3939
4040class FrameworkProfile:
41- """Configuration for the collection of framework metrics in the profiler.
41+ """
42+ Sets up the profiling configuration for framework metrics.
43+
44+ Validates user inputs and fills in default values if no input is provided.
45+ There are three main profiling options to choose from:
46+ :class:`~sagemaker.debugger.metrics_config.DetailedProfilingConfig`,
47+ :class:`~sagemaker.debugger.metrics_config.DataloaderProfilingConfig`, and
48+ :class:`~sagemaker.debugger.metrics_config.PythonProfilingConfig`.
49+
50+ The following list shows the available scenarios for configuring the profiling options.
51+
52+ 1. No profiling configuration, step range, or time range is specified.
53+ SageMaker Debugger activates framework profiling based on the default settings
54+ of each profiling option.
55+
56+ .. code-block:: python
57+
58+ from sagemaker.debugger import ProfilerConfig, FrameworkProfile
59+
60+ profiler_config=ProfilerConfig(
61+ framework_profile_params=FrameworkProfile()
62+ )
63+
64+ 2. A target step or time range is specified for
65+ this :class:`~sagemaker.debugger.metrics_config.FrameworkProfile` class.
66+ The requested target step or time range setting propagates to all of
67+ the framework profiling options.
68+ For example, if you configure this class as follows, all of the profiling options
69+ profile the 6th step:
70+
71+ .. code-block:: python
72+
73+ from sagemaker.debugger import ProfilerConfig, FrameworkProfile
74+
75+ profiler_config=ProfilerConfig(
76+ framework_profile_params=FrameworkProfile(start_step=6, num_steps=1)
77+ )
78+
79+ 3. Individual profiling configurations are specified through
80+ the ``*_profiling_config`` parameters.
81+ SageMaker Debugger profiles framework metrics only for the specified profiling configurations.
82+ For example, if the :class:`~sagemaker.debugger.metrics_config.DetailedProfilingConfig` class
83+ is configured but not the other profiling options, Debugger only profiles based on the settings
84+ specified for the
85+ :class:`~sagemaker.debugger.metrics_config.DetailedProfilingConfig` class.
86+ The following example shows a profiling configuration that performs
87+ detailed profiling at step 10, data loader profiling at steps 9 and 10,
88+ and Python profiling at step 12.
89+
90+ .. code-block:: python
91+
92+ from sagemaker.debugger import ProfilerConfig, FrameworkProfile, DetailedProfilingConfig, DataloaderProfilingConfig, PythonProfilingConfig
93+
94+ profiler_config=ProfilerConfig(
95+ framework_profile_params=FrameworkProfile(
96+ detailed_profiling_config=DetailedProfilingConfig(start_step=10, num_steps=1),
97+ dataloader_profiling_config=DataloaderProfilingConfig(start_step=9, num_steps=2),
98+ python_profiling_config=PythonProfilingConfig(start_step=12, num_steps=1),
99+ )
100+ )
101+
102+ If the individual profiling configurations are specified in addition to
103+ the step or time range,
104+ SageMaker Debugger prioritizes the individual profiling configurations and ignores
105+ the step or time range. For example, in the following code,
106+ the ``start_step=1`` and ``num_steps=10`` settings are ignored.
107+
108+ .. code-block:: python
109+
110+ from sagemaker.debugger import ProfilerConfig, FrameworkProfile, DetailedProfilingConfig, DataloaderProfilingConfig, PythonProfilingConfig
111+
112+ profiler_config=ProfilerConfig(
113+ framework_profile_params=FrameworkProfile(
114+ start_step=1,
115+ num_steps=10,
116+ detailed_profiling_config=DetailedProfilingConfig(start_step=10, num_steps=1),
117+ dataloader_profiling_config=DataloaderProfilingConfig(start_step=9, num_steps=2),
118+ python_profiling_config=PythonProfilingConfig(start_step=12, num_steps=1)
119+ )
120+ )
42121
43- Validates user input and fills in default values wherever necessary.
44122 """
45123
46124 def __init__(
@@ -59,41 +137,34 @@ def __init__(
59137 start_unix_time=None,
60138 duration=None,
61139 ):
62- """Set up the profiling configuration for framework metrics based on user input.
63-
64- There are three main options for the user to choose from.
65- 1. No custom metrics configs or step range or time range specified. Default profiling is
66- done for each set of framework metrics.
67- 2. Custom metrics configs are specified. Do profiling for the metrics whose configs are
68- specified and no profiling for the rest of the metrics.
69- 3. Custom step range or time range is specified. Profiling for all of the metrics will
70- occur with the provided step/time range. Configs with additional parameters beyond
71- step/time range will use defaults for those additional parameters.
72-
73- If custom metrics configs are specified in addition to step or time range being specified,
74- then we ignore the step/time range and default to using custom metrics configs.
140+ """Initialize the FrameworkProfile class object.
75141
76142 Args:
77- local_path (str): The path where profiler events have to be saved.
78- file_max_size (int): Max size a trace file can be, before being rotated.
79- file_close_interval (float): Interval in seconds from the last close, before being
80- rotated.
81- file_open_fail_threshold (int): Number of times to attempt to open a trace fail before
82- marking the writer as unhealthy.
83143 detailed_profiling_config (DetailedProfilingConfig): The configuration for detailed
84- profiling done by the framework.
85- dataloader_profiling_config (DataloaderProfilingConfig): The configuration for metrics
86- collected in the data loader.
144+ profiling. Configure it using the
145+ :class:`~sagemaker.debugger.metrics_config.DetailedProfilingConfig` class.
146+ Pass ``DetailedProfilingConfig()`` to use the default configuration.
147+ dataloader_profiling_config (DataloaderProfilingConfig): The configuration for
148+ dataloader metrics profiling. Configure it using the
149+ :class:`~sagemaker.debugger.metrics_config.DataloaderProfilingConfig` class.
150+ Pass ``DataloaderProfilingConfig()`` to use the default configuration.
87151 python_profiling_config (PythonProfilingConfig): The configuration for stats
88152 collected by the Python profiler (cProfile or Pyinstrument).
89- horovod_profiling_config (HorovodProfilingConfig): The configuration for metrics
90- collected by horovod when using horovod for distributed training.
91- smdataparallel_profiling_config (SMDataParallelProfilingConfig): The configuration for
92- metrics collected by SageMaker Distributed training.
153+ Configure it using the
154+ :class:`~sagemaker.debugger.metrics_config.PythonProfilingConfig` class.
155+ Pass ``PythonProfilingConfig()`` to use the default configuration.
93156 start_step (int): The step at which to start profiling.
94157 num_steps (int): The number of steps to profile.
95- start_unix_time (int): The UNIX time at which to start profiling.
96- duration (float): The duration in seconds to profile for.
158+ start_unix_time (int): The Unix time at which to start profiling.
159+ duration (float): The duration in seconds to profile.
160+
161+ .. tip::
162+ Available profiling range parameter pairs are
163+ (**start_step** and **num_steps**) and (**start_unix_time** and **duration**).
164+ The two parameter pairs are mutually exclusive, and this class validates
165+ that at most one of the two pairs is used. If both pairs are specified, a
166+ conflict error occurs.
167+
97168 """
98169 self.profiling_parameters = {}
99170 self._use_default_metrics_configs = False
@@ -132,6 +203,7 @@ def _process_trace_file_parameters(
132203 rotated.
133204 file_open_fail_threshold (int): Number of times to attempt to open a trace file before
134205 marking the writer as unhealthy.
206+
135207 """
136208 assert isinstance(local_path, str), ErrorMessages.INVALID_LOCAL_PATH.value
137209 assert (
@@ -152,13 +224,17 @@ def _process_trace_file_parameters(
152224 def _process_metrics_configs(self, *metrics_configs):
153225 """Helper function to validate and set the provided metrics_configs.
154226
155- In this case, the user specifies configs for the metrics they want profiled.
156- Profiling does not occur for metrics if configs are not specified for them.
227+ In this case,
228+ the user specifies configurations for the metrics they want to profile.
229+ Profiling does not occur
230+ for metrics if the configurations are not specified for them.
157231
158232 Args:
159233 metrics_configs: The list of metrics configs specified by the user.
234+
160235 Returns:
161- bool: Whether custom metrics configs will be used for profiling.
236+ bool: Indicates whether custom metrics configs will be used for profiling.
237+
162238 """
163239 metrics_configs = [config for config in metrics_configs if config is not None]
164240 if len(metrics_configs) == 0:
@@ -173,16 +249,19 @@ def _process_metrics_configs(self, *metrics_configs):
173249 def _process_range_fields(self, start_step, num_steps, start_unix_time, duration):
174250 """Helper function to validate and set the provided range fields.
175251
176- Profiling will occur for all of the metrics using these fields as the specified
177- range and default parameters for the rest of the config fields (if necessary).
252+ Profiling occurs
253+ for all of the metrics using these fields as the specified range and default parameters
254+ for the rest of the configuration fields (if necessary).
178255
179256 Args:
180257 start_step (int): The step at which to start profiling.
181258 num_steps (int): The number of steps to profile.
182259 start_unix_time (int): The UNIX time at which to start profiling.
183- duration (float): The duration in seconds to profile for.
260+ duration (float): The duration in seconds to profile.
261+
184262 Returns:
185- bool: Whether custom step or time range will be used for profiling.
263+ bool: Indicates whether a custom step or time range will be used for profiling.
264+
186265 """
187266 if start_step is num_steps is start_unix_time is duration is None:
188267 return False
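The precedence rules described in the new class docstring (individual profiling configurations win over a step or time range, and the two range pairs are mutually exclusive) can be sketched as follows. This is a minimal standalone illustration, not the SDK's implementation: the function names `has_custom_range` and `resolve_scenario`, the returned scenario labels, and the use of `ValueError` are all assumptions made for this example.

```python
def has_custom_range(start_step=None, num_steps=None, start_unix_time=None, duration=None):
    """Return True if a step or time range was given (hypothetical helper).

    Raises ValueError when fields from both pairs are mixed, mirroring the
    "mutually exclusive" rule in the docstring. The exception type is an
    assumption for this sketch.
    """
    step_given = start_step is not None or num_steps is not None
    time_given = start_unix_time is not None or duration is not None
    if step_given and time_given:
        raise ValueError("Specify either a step range or a time range, not both.")
    return step_given or time_given


def resolve_scenario(metrics_configs=(), **range_fields):
    """Pick which of the three documented scenarios applies (hypothetical helper)."""
    # Scenario 3: individual configs take priority; any range is ignored.
    if any(config is not None for config in metrics_configs):
        return "custom_configs"
    # Scenario 2: one range is propagated to every profiling option.
    if has_custom_range(**range_fields):
        return "custom_range"
    # Scenario 1: nothing specified, so defaults are used.
    return "defaults"
```

For instance, `resolve_scenario([object()], start_step=1, num_steps=10)` selects the custom configurations branch, which matches the docstring's statement that `start_step=1` and `num_steps=10` are ignored in that case.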