-
Notifications
You must be signed in to change notification settings - Fork 1.6k
RFC: No (opsem) Magic Boxes #3712
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from 4 commits
43eb657
266f159
f5bcb71
3dcbc86
f3ad832
d8fe520
d66fd0c
4f4fa44
751493c
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,115 @@ | ||||||
| # RFC: No (opsem) Magic Boxes | ||||||
|
|
||||||
| - Feature Name: `box_yesalias` | ||||||
| - Start Date: (fill me in with today's date, YYYY-MM-DD) | ||||||
| - RFC PR: [rust-lang/rfcs#3712](https://github.com/rust-lang/rfcs/pull/3712) | ||||||
| - Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) | ||||||
|
|
||||||
| # Summary | ||||||
| [summary]: #summary | ||||||
|
|
||||||
| Currently, the operational semantics of the type [`alloc::boxed::Box<T>`](https://doc.rust-lang.org/beta/alloc/boxed/struct.Box.html) is in dispute, but the compiler adds llvm `noalias` to it. To support it, the current operational semantics models have the type use a special form of the `Unique` (Stacked Borrows) or `Active` (Tree Borrows) tag, which has aliasing implications, validity implications, and also presents some unique complications in the model and in improvements to the type (e.g. Custom Allocators). We propose that, for the purposes of the runtime semantics of Rust, `Box` is treated as no more special than a user-defined smart pointer you can write today[^1]. In particular, it is given similar behaviour on a typed copy to a raw pointer. | ||||||
|
|
||||||
| [^1]: We maintain some trivial validity invariants (such as alignment and address space limits) that a user cannot define, but these invariants only depend upon the value of the `Box` itself, rather than on memory pointed to by the `Box`. | ||||||
|
|
||||||
| # Motivation | ||||||
| [motivation]: #motivation | ||||||
|
|
||||||
| The current behaviour of [`alloc::boxed::Box<T>`] can be suprising, both to unsafe code, and to people working on the language or the compiler. In many respects, `Box<T>` is treated incredibly specially by `rustc` and by Rust, leading to ICEs or unsoundness arising from reasonable changes, such as the introduction of per-container `Allocator`s. | ||||||
|
|
||||||
| In the past, the operational semantics team has considered many ad-hoc solutions to the problem, while maintaining special cases in the aliasing model (such as Weak Protectors) that only exist for `Box<T>`. | ||||||
| For example, moving a `ManuallyDrop<Box<T>>` after calling `Drop` is immediate undefined behaviour (due to the `Box` no longer being dereferenceable) - <https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/Moving.20.60ManuallyDrop.3CBox.3C_.3E.3E.60>, and the Active tag requirements for a `Box<T>` are unsound when combined with custom allocators <https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/Is.20Box.20a.20language.20type.20or.20a.20library.20type.3F>. This wastes procedural time reviewing the proposals, and complicates the language by introducing special case rules that would otherwise be unnecessary. | ||||||
|
|
||||||
| Any `unsafe` code that may want to temporarily maintain aliased `Box<T>`s for various reasons (such as low-level copy operations), or may want to use something like `ManuallyDrop<Box<T>>`, is put into an interesting position: While they can use a user-defined smart pointer, this requires both care on the part of the smart pointer implementor, but also affects the ergonomics and expressiveness of that code, as `Box<T>` has many special language features that surface at the syntax level, which cannot be replicated today by a user-defined type. | ||||||
|
|
||||||
| # Guide-level explanation | ||||||
| [guide-level-explanation]: #guide-level-explanation | ||||||
|
|
||||||
| [`alloc::boxed::Box<T>`] is defined with a number of magic features that interact directly with the language. To limit the impact, both on users and on the language itself, we remove the language level requirement that `Box` be dereferenceable and unique (like applies to `&mut T`). | ||||||
| This does not affect library level requirements of uniqueness however - it may remain undefined behaviour to access a `Box` that is being aliased (particularily where the operations produce multiple aliasing mutable references from different `Box`es) or to use `Box::from_raw` to construct multiple aliasing boxes. | ||||||
|
|
||||||
| We also do not remove any other language level magic from `Box`, such as the ability to do both partial and complete moves from a `Box`. | ||||||
|
|
||||||
| # Reference-level explanation | ||||||
| [reference-level-explanation]: #reference-level-explanation | ||||||
|
|
||||||
| For the remainder of this section, let `WellFormed<T>` designate a type for *exposition purposes only* defined as follows: | ||||||
| ```rust | ||||||
| #[repr(transparent)] | ||||||
| struct WellFormed<T: ?Sized>(core::ptr::NonNull<T>); | ||||||
| ``` | ||||||
|
Comment on lines
+40
to
+47
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Are there any differences semantically between what is proposed for That is, since we already accepted RFC 3336, if there is no daylight between these, then it should say that normatively, and it perhaps could even lean into that for defining the semantics. cc @RalfJung
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We haven't spelled out whether
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Restating Ralf's answer to this, However, this also spells out that validity invariant in full, which we cannot rely on existing types yet to do. |
||||||
|
|
||||||
| (Note that we do not define this type in the public standard library interface, though an implementation of the standard library could define the type locally) | ||||||
|
|
||||||
| The following are not valid values of type `WellFormed<T>`, and a typed copy that would produce such a value is undefined behaviour: | ||||||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The Reference has been adjusted a while ago to state validity invariants positively, i..e by listing what must be true, instead of listing what must not be false. IMO that's more understandable, and the RFC should be updated to also do that. |
||||||
| * Any value that is invalid for a raw pointer to `T` (e.g. a value read from uninitialized memory, or a fat pointer with invalid metadata) | ||||||
| * A null pointer (even when `T` is a Zero-sized type) | ||||||
| * A pointer with an address that is not well-aligned for `T` (or in the case of a DST, the `align_of_val_raw` of the value), or | ||||||
| * A pointer with an address that offsetting that address (as though by `.wrapping_byte_offset`) by `size_of_val_raw` bytes would wrap arround the address space | ||||||
|
|
||||||
| The [`alloc::boxed::Box<T>`] type shall be laid out as though a `repr(transparent)` struct containing a field of type `WellFormed<T>`. The behaviour of doing a typed copy as type [`alloc::boxed::Box<T>`] shall be the same as though a typed copy of the struct `#[repr(transparent)] struct Box<T>(WellFormed<T>);`. | ||||||
|
||||||
|
|
||||||
| [`alloc::boxed::Box<T>`] shall have a niche of the all zeroes bit pattern, which is used for the `None` value of [`core::option::Option<Box<T>>`] (and similar types). Additional invalid values may be used as niches, but no guarantees are made about those niches. | ||||||
|
|
||||||
| When the unstable feature [`allocator_api`] is in use, the type [`alloc::boxed::Box<T,A>`] (where `A` is not the type `Global`, `alloc::boxed::Box<T,Global>` is the same type as [`alloc::boxed::Box<T>`]) is laid out as a struct containing a field of type `WellFormed<T>`, and a field of type `A`, and a typed copy as [`alloc::boxed::Box<T,A>`] is the same as a typed copy of that struct. The order and offsets of these fields is not specified, even for an `A` of size 0 and alignment 1. | ||||||
|
|
||||||
| A value of type [`alloc::boxed::Box<T,A>`] is invalid if either field is invalid. | ||||||
|
|
||||||
| # Drawbacks | ||||||
| [drawbacks]: #drawbacks | ||||||
|
|
||||||
| This prohibits the compiler from directly optimizing usage of `Box<T>` via the llvm `noalias` attribute or other similar optimization attributes. | ||||||
| This precludes optimizations made both today, and in the future (both in advancements made on the `rustc` compiler, including via llvm, and on other alternative compilers). | ||||||
|
|
||||||
| However, past performance benchmarks have shown little to no performance is obtained using current optimizations from attaching `noalias` to `Box` in either position. Many future performance gains that are precluded by this RFC can likely be restored by more granular emission of `noalias` and other optimization attributes for shared and mutable references | ||||||
|
|
||||||
| # Rationale and alternatives | ||||||
| [rationale-and-alternatives]: #rationale-and-alternatives | ||||||
|
|
||||||
| - Alternative 1: Status Quo | ||||||
| - While the easiest alternative is to do nothing and maintain the status quo, as mentioned this has suprisingly implications both for the operational semantics of Rust | ||||||
| - Alternative 2: Introduce a new type `AlisableBox<T>` which has the same interface as `Box<T>` but lacks the opsem implications that `Box<T>` has. | ||||||
|
||||||
| - Alternative 2: Introduce a new type `AlisableBox<T>` which has the same interface as `Box<T>` but lacks the opsem implications that `Box<T>` has. | |
| - Alternative 2: Introduce a new type `AliasableBox<T>` which has the same interface as `Box<T>` but lacks the opsem implications that `Box<T>` has. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably we should mention somewhere, in favor of adding a separate type like AliasableBox, that if we did that, we could then align other operations on that type to match in a way we can't do with Box. E.g., making a reference from an AliasableBox would be unsafe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually the status quo, since rust-lang/rust#122018
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This proposed RFC kind of handwaves over what the opsem complications are that are motivating this. And for such a consequential RFC, the motivation section is rather short overall.
It'd be helpful for this motivating background context to be inlined and explored more deeply here, and for it to go into what the status of the various solutions to these problems might be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E.g., it'd be good to include the substance of @RalfJung's comment here somewhere: