@jpsamaroo
Member

No description provided.

jpsamaroo and others added 12 commits November 11, 2025 11:50
This commit fixes two major issues in Datadeps that caused incorrect
results with views and ChunkView.

First, when presented with arguments that alias (such as an `Array` and
a view of that array), `generate_slot` would `move` each value onto the
destination processor separately, without considering how the values
alias each other. This could break the aliasing that they had on their
originating processor: a write through the moved view would no longer be
visible in the moved parent array. As a result, algorithms that passed
views together with their underlying arrays as arguments could get
incorrect results in a distributed setting, because data was not updated
correctly during copies.
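
The failure mode can be reproduced on a single machine with plain Julia
arrays. The sketch below is only an illustration: it uses `copy` and
`deepcopy` as stand-ins for an independent `move` of each argument, and
does not call any Dagger code.

```julia
A = zeros(4)              # parent array on the originating processor
v = view(A, 1:2)          # aliases A[1:2]

# Naive "move" (stand-in): each argument is copied on its own.
A_dest = copy(A)
v_naive = deepcopy(v)     # receives a private copy of the parent
v_naive .= 1.0            # this write never reaches A_dest

# Alias-preserving "move": rebuild the view over the already-moved parent.
v_dest = view(A_dest, parentindices(v)...)
v_dest .= 1.0             # this write is visible through A_dest
```

Here `parent(v_naive)` is a different array from `A_dest`, while
`parent(v_dest)` is `A_dest` itself, which is the relationship the
originating processor had.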

This commit adds helpers (specifically `aliased_object!`) which allow
objects like views and `ChunkView` to declare the underlying parent
array as an object that may need to be tracked separately from the
surrounding structure; this helper keeps track of other such declared
objects that have been allocated on the destination processor, and
replaces the source object with the destination object during `move`.
By default, all arguments are now provided directly to `aliased_object!`
to perform this replacement, but this can be customized by overloading
`move_rewrap` (which `SubArray` and `ChunkView` now overload).
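
A minimal sketch of this idea, assuming an identity-keyed cache of
already-moved objects. The names `MoveCache`, `moved_object!`, and
`rewrap` are hypothetical and chosen for illustration; they are not the
signatures of `aliased_object!` or `move_rewrap` in Dagger, and `copy`
stands in for the real `move`.

```julia
# Hypothetical cache: maps each source object to its destination copy.
struct MoveCache
    moved::IdDict{Any,Any}
end
MoveCache() = MoveCache(IdDict{Any,Any}())

# Return the destination copy of `src`, creating it only on first use.
function moved_object!(copier, cache::MoveCache, src)
    return get!(() -> copier(src), cache.moved, src)
end

# Default: the argument itself is the tracked object.
rewrap(cache::MoveCache, x) = moved_object!(copy, cache, x)

# Views: track the parent, then rebuild the view around the moved parent.
function rewrap(cache::MoveCache, v::SubArray)
    dest_parent = moved_object!(copy, cache, parent(v))
    return view(dest_parent, parentindices(v)...)
end

cache = MoveCache()
A = collect(1.0:6.0)
v = view(A, 2:3)
A_dest = rewrap(cache, A)   # moves the parent once
v_dest = rewrap(cache, v)   # reuses the moved parent
v_dest .= 0.0               # visible in A_dest, not in A
```

Because both calls go through the same cache, the moved view and the
moved array share storage regardless of the order in which the
arguments are processed.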

Second, even with objects now properly aliasing on remote processors,
Datadeps did not have a clear way to copy only the changed portions of
an argument. For example, when only a view of an array is updated on a
remote processor, and the next task will then need the full parent array
on the same remote processor, how does Datadeps copy over only the
portions of the parent array that aren't yet up-to-date on the remote?
The answer is that it didn't; it would do a full copy of the parent
array to the remote, which would then overwrite the changes made through
the view.

This commit overhauls the copying machinery to properly calculate this
difference (termed the "remainder"), based on the target ainfo and all
previously-updated ainfos, and schedules a "remainder copy" to copy only
the exact bytes that are not yet updated on the remote. Additionally, it
may schedule copies from multiple other remote processors to the
"target" remote processor as necessary, in case portions of an aliased
object exist on multiple distinct processors. This machinery is driven
by a new interval tree implementation, which allows efficient
calculation of differences between sets of memory spans, and uses
`unsafe_copyto!` to handle arbitrary data.
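
The remainder calculation can be sketched as follows. This is a
simplified illustration under stated assumptions: it works on element
index ranges of one `Vector` rather than on byte-level memory spans, it
replaces the interval tree with a linear scan over sorted ranges, and
the function name `remainder_ranges` is hypothetical.

```julia
# Subtract the already-updated ranges from the target range.
function remainder_ranges(target::UnitRange{Int}, updated::Vector{UnitRange{Int}})
    rest = UnitRange{Int}[]
    cursor = first(target)
    for r in sort(updated; by=first)
        lo = max(first(r), first(target))
        hi = min(last(r), last(target))
        lo > hi && continue                  # no overlap with the target
        lo > cursor && push!(rest, cursor:lo-1)
        cursor = max(cursor, hi + 1)
    end
    cursor <= last(target) && push!(rest, cursor:last(target))
    return rest
end

src = collect(1.0:10.0)   # parent array as held by the source processor
dest = zeros(10)          # stale parent array on the remote
dest[3:5] .= -1.0         # region already updated on the remote via a view

# Copy only the regions that are not yet up-to-date on the remote.
for r in remainder_ranges(1:10, [3:5])
    unsafe_copyto!(dest, first(r), src, first(r), length(r))
end
```

After the loop, `dest[3:5]` still holds the values written through the
view, and every other element matches `src`. A full `copyto!(dest, src)`
would instead have overwritten elements 3 through 5.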