There are two "capacity"s for any successful allocation: the requested capacity and the returned capacity. The Allocator trait explicitly allows these to not be the same. However, it appears that RawVec assumes they are the same.
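To make the distinction concrete, here is a minimal sketch of how the two capacities can diverge. It does not use the real `Allocator` trait (which is unstable); `returned_bytes` and its 16-byte size classes are hypothetical, standing in for any allocator that hands back a larger block than was asked for.

```rust
use std::mem;

/// Hypothetical size-class rounding, standing in for an allocator that
/// returns a block larger than the one requested.
fn returned_bytes(requested_bytes: usize) -> usize {
    // Assumption for illustration only: blocks come in multiples of 16 bytes.
    const CLASS: usize = 16;
    (requested_bytes + CLASS - 1) / CLASS * CLASS
}

/// Number of `T` elements that fit in the block actually returned.
fn returned_capacity<T>(requested_capacity: usize) -> usize {
    let elem = mem::size_of::<T>();
    if elem == 0 {
        // Zero-sized types never allocate; treat the capacity as unbounded.
        return usize::MAX;
    }
    // Overflow checks are omitted to keep the sketch short.
    returned_bytes(requested_capacity * elem) / elem
}
```

Under these assumptions, requesting room for 3 `u32` values (12 bytes) yields a 16-byte block, so the returned capacity is 4 while the requested capacity is 3.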
Everywhere RawVec requests an allocation, it then sets its internal cap field to the requested capacity, not the returned capacity. The canonical example of this is here, when a new RawVec is constructed with a given capacity (though the same behaviour also applies when growing an allocation). The comment even states:
// Allocators currently return a NonNull<[u8]> whose length
// matches the size requested. If that ever changes, the capacity
// here should change to ptr.len() / mem::size_of::<T>().
which directly contradicts the docs for the Allocator trait.
In principle, we could design RawVec to only store and report the requested capacity and that could be an entirely sound design. However, I worry that the assumption embodied by this comment has wormed its way throughout the standard library and probably the ecosystem. For example, if breaking apart a Vec-based structure to send for FFI, the docs until recently strongly suggested that Vec::capacity would return the allocated capacity (which is only true if the allocator always returns the requested capacity). In #99790 the docs are currently being updated to reflect that this isn't the case.
Another example of where this is problematic is Vec::into_raw_parts, which clearly assumes that RawVec::cap is the allocator-returned capacity. I worry that this assumption is implicitly present in a lot of code.
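For reference, the kind of round trip at stake looks roughly like the sketch below, written against stable APIs only. The helper names `into_parts` and `from_parts` are hypothetical. The value handed back to `Vec::from_raw_parts` is whatever `capacity()` reported, so the code is only as correct as the meaning attached to that number.

```rust
use std::mem::ManuallyDrop;

/// Split a `Vec` into raw parts, as one might before handing it across FFI.
/// (Hypothetical helper; it mirrors the usual `ManuallyDrop` pattern.)
fn into_parts(v: Vec<u32>) -> (*mut u32, usize, usize) {
    // Prevent the Vec's destructor from freeing the buffer.
    let mut v = ManuallyDrop::new(v);
    (v.as_mut_ptr(), v.len(), v.capacity())
}

/// Rebuild the `Vec` from its parts.
///
/// # Safety
/// The parts must come unchanged from `into_parts`, and must not be reused.
unsafe fn from_parts(ptr: *mut u32, len: usize, cap: usize) -> Vec<u32> {
    // `cap` is whatever `capacity()` reported earlier.
    unsafe { Vec::from_raw_parts(ptr, len, cap) }
}
```

With the global allocator this round trip works today; the open question is what the `cap` value means when a custom allocator returns more than was requested.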
I therefore suggest:
- Fix RawVec's allocation methods to store the capacity returned from the allocator, unfortunately requiring an extra division to do so.
- Review the documentation of RawVec, the structures that use it, and their associated methods to ensure it is clear that capacity() may return more than what was requested.
- Looking forward: discuss whether we should guarantee that capacity() returns exactly the allocator-returned capacity, and for which data structures.
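A rough sketch of what the first suggestion could look like follows. It is not the actual RawVec code: `RoundingAlloc` and `MiniRawVec` are hypothetical, the allocator merely imitates the shape of `Allocator::allocate` on stable Rust, and zero-sized types and empty vectors are skipped for brevity.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::mem;
use std::ptr::NonNull;

/// Hypothetical allocator that over-allocates: every request is rounded up
/// to a multiple of 64 bytes and the whole block is reported back.
struct RoundingAlloc;

impl RoundingAlloc {
    const CLASS: usize = 64;

    fn rounded(layout: Layout) -> Layout {
        let size = (layout.size() + Self::CLASS - 1) / Self::CLASS * Self::CLASS;
        Layout::from_size_align(size, layout.align()).expect("layout overflow")
    }

    /// Imitates `Allocator::allocate`: the returned slice may be longer
    /// than `layout.size()`.
    fn allocate(&self, layout: Layout) -> Option<NonNull<[u8]>> {
        let real = Self::rounded(layout);
        if real.size() == 0 {
            return None; // zero-sized requests are out of scope here
        }
        // SAFETY: `real` has a non-zero size.
        let ptr = NonNull::new(unsafe { alloc(real) })?;
        Some(NonNull::slice_from_raw_parts(ptr, real.size()))
    }

    /// # Safety
    /// `ptr` must come from `allocate` called with the same `layout`.
    unsafe fn deallocate(&self, ptr: NonNull<u8>, layout: Layout) {
        unsafe { dealloc(ptr.as_ptr(), Self::rounded(layout)) }
    }
}

/// Hypothetical miniature of RawVec that records the *returned* capacity.
struct MiniRawVec<T> {
    ptr: NonNull<T>,
    cap: usize,
    // Sketch simplification: keep the requested layout for deallocation.
    requested: Layout,
}

impl<T> MiniRawVec<T> {
    fn with_capacity(requested_cap: usize) -> Self {
        assert!(
            mem::size_of::<T>() > 0 && requested_cap > 0,
            "sketch skips zero-sized types and empty vectors"
        );
        let requested = Layout::array::<T>(requested_cap).expect("capacity overflow");
        let block = RoundingAlloc.allocate(requested).expect("allocation failed");
        // The extra division: bytes returned divided by the element size.
        let cap = block.len() / mem::size_of::<T>();
        Self { ptr: block.cast::<T>(), cap, requested }
    }

    fn capacity(&self) -> usize {
        self.cap
    }
}

impl<T> Drop for MiniRawVec<T> {
    fn drop(&mut self) {
        // SAFETY: the block came from `allocate` with `self.requested`.
        unsafe { RoundingAlloc.deallocate(self.ptr.cast::<u8>(), self.requested) }
    }
}
```

With this toy allocator, `MiniRawVec::<u32>::with_capacity(10)` requests 40 bytes, receives 64, and so reports a capacity of 16 rather than 10.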
The alternative that I see is to instead commit to RawVec::cap being the requested capacity. In that case, it would be good to review where the opposite may have been assumed elsewhere and what might need to be fixed. Happy to hear views on whether people think this would be a better path.