-
-
Notifications
You must be signed in to change notification settings - Fork 4.3k
Contiguous access #21984
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Contiguous access #21984
Conversation
|
This pr just enables slices from tables to be returned directly when applicable, it doesn't implement any batches and it doesn't ensure any specific (other than rust's) alignment (yet these slices may be used to apply simd).
This pr doesn't deal with any alignments but (as of my understanding) you can always take sub-slices which would meet your alignment requirements. And just referring to the issue #21861, even without any specific alignment the code gets vectorized.
No, the returned slices do not have any specific (other than rust's) alignment requirements. |
|
The solution looks promising to solve issue #21861. If you want to use SIMD instructions explicitly, alignment is something you usually have to manage yourself (with an aligned allocator or a peeled prologue). Auto-vectorization won’t “update” the alignment for you – it just uses whatever alignment it can prove and otherwise emits unaligned loads. From that perspective, a contiguous slice is already sufficient; fully aligned SIMD is a separate concern on top of that. |
hymm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a full review, but onboard with the general approach in this pr. Overall this is fairly straightforward. I imagine we'll eventually want to have some simd aligned storage, but in the meantime users can probably align their components manually.
|
You added a new example but didn't add metadata for it. Please update the root Cargo.toml file. |
|
It looks like your PR has been selected for a highlight in the next release blog post, but you didn't provide a release note. Please review the instructions for writing release notes, then expand or revise the content in the release notes directory to showcase your changes. |
hymm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just a review of the changes to ThinSlicePtr. I'll try to review more later.
| /// | ||
| /// # Safety | ||
| /// | ||
| /// `len` must be less or equal to the length of the slice. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| /// `len` must be less or equal to the length of the slice. | |
| /// `len` must be less than or equal to the length of the slice. |
| #[cfg(debug_assertions)] | ||
| assert!(len <= self.len, "tried to create an out-of-bounds slice"); | ||
|
|
||
| // SAFETY: The caller guarantees `len` is not greater than the length of the slice |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this safety comment you should also be asserting why all of these bullets points are met for from_raw_parts. The caller is only covering for the last bullet point. https://doc.rust-lang.org/std/slice/fn.from_raw_parts.html#safety
data must be non-null, valid for reads for len * size_of::() many bytes, and it must be properly aligned.
data must point to len consecutive properly initialized values of type T.
The memory referenced by the returned slice must not be mutated for the duration of lifetime 'a, except inside an UnsafeCell.
The total size len * size_of::() of the slice must be no larger than isize::MAX, and adding that size to data must not “wrap around” the address space. See the safety documentation of pointer::offset.
| } | ||
|
|
||
| /// Casts the slice to another type | ||
| pub fn cast<U>(&self) -> ThinSlicePtr<'a, U> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure this needs to be unsafe. This has a safety requirement that T and U have the same layout and valid bit representations.
| /// # Safety | ||
| /// | ||
| /// - There must not be any aliases to the slice | ||
| /// - `len` must be less or equal to the length of the slice |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this sound? we're basically converting a &[UnsafeCell<T>] to a &mut [T]. Feels like that shouldn't be sound.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have any tests that mutate the change ticks? I would expect miri to catch this if we do.
Objective
Enables accessing slices from tables directly via Queries.
Fixes: #21861
Solution
One new trait:
ContiguousQueryDataallows to fetch all values from tables all at once (an implementation for&Treturns a slice of components in the set table, for&mut Treturns a mutable slice of components in the set table as well as a struct with methods to set update ticks (to match thefetchimplementation))A method
as_contiguous_iterinQueryItermaking possible to iterate using these traits.Macro
QueryDatawas updated to support contiguous items whencontiguous(target)attribute is added (a target can beall,mutableandimmutable, refer to thecustom_query_paramexample)Testing
sparse_set_contiguous_querytest verifies that you can't usenext_contiguouswith sparse set componentstest_contiguous_query_datatest verifies that returned values are validbase_contiguousbenchmark (file is namediter_simple_contiguous.rs)base_no_detectionbenchmark (file is namediter_simple_no_detection.rs)base_no_detection_contiguousbenchmark (file is namediter_simple_no_detection_contiguous.rs)base_contiguous_avx2benchmark (file is namediter_simple_contiguous_avx2.rs)Showcase
Examples
contiguous_query,custom_query_paramExample
Benchmarks
Code for
basebenchmark:Iterating over 10000 entities from one table and increasing a 3-dimensional vector from component
Positionby a 3-dimensional vector from componentVelocitybypass_change_detection()methodUsing contiguous 'iterator' makes the program a little bit faster and it can be further vectorized to make it even faster
Things to think about
offsetparameter inContiguousQueryData