@@ -149,6 +149,17 @@ defmodule EXLA do
149149 To increase the stack size of dirty IO threads from 40 kilowords to
150150 128 kilowords. In a release, you can set this flag in your `vm.args`.
151151
152+ ## Distribution
153+
154+ EXLA allows its tensors to be sent across nodes, as long as the parent
155+ node (which effectively holds the tensor) keeps a reference to the
156+ tensor while it is read by any other node it was sent to.
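
  As a minimal sketch, assuming two connected nodes with EXLA available on both; the node names and the `:reader` process below are hypothetical:

      # On node :"a@host": build an EXLA-backed tensor and send it to a
      # process registered as :reader on :"b@host". This process must keep
      # `tensor` referenced until the remote node is done reading it.
      tensor = Nx.iota({3}, backend: EXLA.Backend)
      send({:reader, :"b@host"}, {:tensor, tensor})

      # On node :"b@host", inside the :reader process: reading the tensor
      # fetches its data from the parent node that holds it.
      receive do
        {:tensor, t} -> t |> Nx.sum() |> IO.inspect()
      end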
157+
158+ The result of `EXLA.compile/3` can also be shared across nodes.
159+ On invocation, the underlying executable is automatically serialized
160+ and sent to other nodes, without requiring a full recompilation,
161+ as long as the same conditions as above apply.
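
  As a sketch, assuming two connected nodes running the same code (the node name and the compiled function below are illustrative):

      # On the parent node: compile once and keep `fun` referenced while
      # other nodes use it.
      fun = EXLA.compile(&Nx.multiply(&1, 2), [Nx.template({3}, :s64)])

      # Hand the compiled function to the other node. Invoking it there
      # ships the serialized executable instead of recompiling.
      Node.spawn_link(:"b@host", fn ->
        fun.(Nx.iota({3}, type: :s64)) |> IO.inspect()
      end)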
162+
152163 ## Docker considerations
153164
154165 EXLA should run fine on Docker with one important consideration:
@@ -274,11 +285,11 @@ defmodule EXLA do
274285 [2, 4, 6]
275286 >
276287
277- Results are allocated on the `EXLA.Backend`. Note that the
278- `EXLA.Backend` is asynchronous: operations on its tensors
279- *may* return immediately, before the tensor data is available.
280- The backend will then block only when trying to read the data
281- or when passing it to another operation.
288+ The returned function can be sent across nodes, as long as the parent
289+ node (which effectively holds the function) keeps a reference to the
290+ function while it is invoked by any other node it was sent to. On
291+ invocation, the underlying executable is automatically serialized
292+ and sent to other nodes, without requiring a full recompilation.
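
  For illustration, a hedged sketch with hypothetical node names, assuming the nodes are connected and run the same code:

      fun = EXLA.compile(&Nx.add/2, [Nx.template({3}, :s64), Nx.template({3}, :s64)])

      # `fun` can be invoked from another node, as long as this node keeps
      # it referenced. The executable is serialized over, not recompiled.
      :erpc.call(:"b@host", fn ->
        fun.(Nx.iota({3}, type: :s64), Nx.iota({3}, type: :s64))
      end)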
282293
283294 See `jit/2` for supported options.
284295 """