 # handle SupervisionEvents from remote actor failures
 pass

 Remote actor endpoints can also utilize Python native breakpoints, enabling interactive debugging sessions.
-For a complete deep-dive into Monarch debuggers, `refer to the documentation <https://meta-pytorch.org/monarch/generated/examples/debugging.html>`_.
+For a complete deep-dive into Monarch debuggers, please `refer to the documentation <https://meta-pytorch.org/monarch/generated/examples/debugging.html>`_.

 .. code-block:: python

@@ -266,7 +272,7 @@ and some of the training hyperparameters.
 gpus_per_node: int=8

 TorchTitan uses a JobConfig object to control all aspects of training.
-Here we create a function that builds this configuration from our RunParams.
+Here we create a function that parses this configuration from our RunParams.

 .. code-block:: python

@@ -338,14 +344,13 @@ This is where Monarch's power becomes most apparent.
 try:
     # 1. Create a SLURM job with N nodes
     # This leverages Monarch to reserve a persistent machine allocation