-
Notifications
You must be signed in to change notification settings - Fork 72
Description
Proposal
Context and motivation
Contracts have recently been introduced to the language, and are already successfully being used to specify the behaviour of select standard library functions (see rust-lang/rust#136578 and rust-lang/rust#147148). The intent is that these contracts can then be used both for verification and runtime testing purposes, although work to facilitate the former is still in progress.
While many Rust functions can be already specified today, for others the current specification language (Rust) is not expressive enough to capture their behaviour. This is in particular true of unsafe Rust code that manipulates memory directly via raw pointers, e.g. RawVec.
Safe rust has ownership reasoning built in, however, unsafe Rust leaves ownership properties to the programmer to ensure they hold in order for their program to be safe (i.e. not cause UB). By introducing separation logic pointer primitives, we can explicitly state the ownership requirements and guarantees of a function using unsafe, e.g. dealing with raw pointers.
This proposal is structured into the following sections:
- intro to
ownedandblockprimitives - intro to Fulminate-style separation logic assertions
- more detailed semantics of
ownedandblock - translating
ownedandblockto Rust verification tools - details of the execution model and interface of
ownedandblock
Basic ownership assertions: owned and block
To add basic separation logic support (for specifying pointer and memory manipulating unsafe Rust code), we propose introducing two primitive Fulminate-like assertions: owned and block.
ownedasserts exclusive ownership of memory pointed at by some raw pointer. It returns the pointed-to value of typeTfor further use in the contract.blockassert the necessary (but not sufficient) permission to free an allocated block of memory starting at a given memory address and of a certain size.
These assertions would be translatable into "points-to" and "block" in separation logic verification tools. The semantics of the two primitive pointer assertions will be discussed further in the following section of this proposal.
The rationale for opting for a Fulminate-inspired approach is that Fulminate was designed to enable concrete execution of separation logic specifications (originally for C code), which aligns with the intent of Rust contracts to be executable for testing purposes.
Fulminate
For a full understanding of the Fulminate approach, we recommend reading the paper by Banerjee et al.. The most important points from that paper relevant to this proposal are reiterated here:
- Fulminate acknowledges the benefits of having specifications suitable for both runtime testing and verification: "e.g. for quickly discovering some code and specification errors before embarking on proof, or for runtime testing of
remaining proof obligations". - Fulminate shows that inductively defined pointer data structure predicates from separation logic can be translated to an executable format, which lends itself nicely to expressing these predicates just as regular Rust code. This also means predicates are more likely to be readable by developers.
- Fulminate concretely defines and implements the executable semantics of CN specifications, including keeping track of ownership at runtime via ghost state (§2.3). We can learn from this approach when implementing runtime ownership tracking in Rust. In fact, Johannes Hostert has already implemented a prototype runtime ownership tracker in Miri. Implementing this in Miri seems natural as Miri already tracks extra state for its UB detection. As such, implementing runtime ownership tracking in the rust compiler would seem to be duplicated effort, and is therefore not part of this proposal.
- Fulminate claims their specification language is "a syntactic analogue of the input/output mode checks used in earlier systems such as VeriFast and Gillian". This implies a straightforward path of translating Fulminate-like specifications for tools such as VeriFast and Gillian, aligning with the goal of Rust contracts being compatible with existing verification tools (both VeriFast and Gillian support Rust).
Proposal differences from Fulminate
Our proposal differs from the original Fulminate approach in that whereas Fulminate uses condition lists in the requires and ensures closes, with an arbitrary number of ;-separated boolean constraints, we are sticking to vanilla Rust code, where only the final boolean expression is checked. In our proposal, only owned or block (or user-defined functions calling them) have additional effects and do not need to be in the final expression. It is also possible to use assert! as usual to fail a contract early (and can be called from e.g. a user-defined function).
E.g., in Fulminate
#[requires(b1; b2)]would translate to:
#[requires(b1 && b2)]in Rust contracts (where b1 and b2 are expressions of type bool).
Whereas Fulminate adds user-defined pure functions and resource-asserting predicates for use in the spec language, we use regular Rust functions.
Fulminate has built-in predicate for iterated separating conjunction, whereas in Rust we can simply use a for loop and call owned inside each iteration.
A future proposal may restrict the Rust specification language (e.g. in ways inspired by Verus), in which case the distinction between pure and resourceful predicates may be useful, but this is to be discussed in the future.
Examples
The first few examples are adapted from the fulminate paper to Rust syntax. The point of the examples is not necessarily to show idiomatic Rust code (e.g. safe Rust is sufficient to express zeroing an integer), but to show how separation logic contracts translate to Fulminate-like Rust contracts, and to be as close as possible to the Fulminate paper. We also give an example of using owned to specify the contract for std::ptr::write in the Rust stdlib.
The first function f asserts exclusive ownership of a memory location holding a pointer to an i32, of which we also have exclusive ownership. owned(p) asserts ownership of the inner pointer and reads its value into the contract variable r1. We then assert ownership of the value pointed to by r1, but discard its value as it's irrelevant to the specification of f. There is no pure-logical assertion required for the precondition so we return true at the end of requires. I.e. in order to call f, the calling context must have exclusive ownership of the value pointed to by p, as well as the i32 pointed-to indirectly. f is said to consume ownership of those resources in the precondition.
In the ensures clause, we assert that we are giving back the ownership of the pointer p to the calling context, as well as the inner i32. We also assert that the value pointed to by p did not change (r2 == r1), and that the i32 has been set to 0 (v2 == 0). I.e. the resources consumed by f in the precondition are logically returned to the caller, albeit in a modified state.
#[requires(
let r1 = owned(p);
owned(r1);
true
)]
#[ensures(move |_| {
let r2 = owned(p);
let v2 = owned(r2);
r2 == r1 && v2 == 0
})]
fn f(p: *mut *mut i32) {
unsafe {
let q = *p;
*q = 0;
}
}The corresponding Separation logic triple for f would be:
f(p)
Another feature of separation logic, is the ability to define inductively defined predicates to describe pointer data structures. Fulminate shows how they can be defined in an executable way, and as such, they can also be defined in Rust, using owned and block as primitives.
Below is a snippet from Johannes Hostert's Vec example, defining a logical/mathematical List (i.e. one in which we only care about its contained values, not about its internal structure). We use List to define an integer_list predicate taken from the Fulminate paper -- used to specify a raw-pointer based linked list data structure holding integers. integer_list is defined recursively, asserting ownership of each node in the linked list.
/// A simple linked list for use in specifications.
/// Ignore the `Box`, this is supposed to represent a functional/mathematical value
#[derive(PartialEq, Eq, Clone, Debug)]
pub enum ListB<T> {
Nil,
Cons(T, List<T>),
}
pub type List<T> = Box<ListB<T>>;
impl<T> ListB<T> {
/// Appends two list, returning the result.
pub fn append(mut self: Box<Self>, other: Box<Self>) -> Box<Self> {
let mut me = &mut self;
loop {
if matches!(&mut **me, ListB::Nil) {
*me = other;
break;
} else {
match &mut **me {
ListB::Nil => unreachable!(),
ListB::Cons(_, list_b) => me = list_b,
}
}
}
self
}
pub fn nil() -> List<T> {
Box::new(Self::Nil)
}
pub fn cons(hd: T, tl: List<T>) -> List<T> {
Box::new(Self::Cons(hd, tl))
}
}
struct Node {
x: i32,
next: *mut Node,
}
fn integer_list(p: *const Node) -> List<i32> {
if p.is_null() {
return ListB::nil();
} else {
let node = owned(p);
let tl = integer_list(node.next);
return ListB::cons(node.x, tl);
}
}We move onto an example for something we'd actually want to specify in the Rust standard library: ptr::write. In order to be able to write to dst, we need exclusive ownership of that location, but the memory does not necessarily need to be initialised. As such, we assert owned::<MaybeUninit<T>> on dst. Asserting owned::<T> could cause UB if the value dst points to was uninitialised (thus failing the precondition check when executed by Miri), and is too strong of a requirement for ptr::write.
In the postcondition, we now definitely have ownership of some valid T at dst, as those are the semantics of the write function, and we return that ownership to the calling context.
#[requires(owned::<MaybeUninit<T>>(dst as *const _); true)]
#[ensures(|_| { let v = owned::<T>(dst); v == src })]
pub const unsafe fn write<T>(dst: *mut T, src: T) {
// SAFETY: the caller must guarantee that `dst` is valid for writes.
// `dst` cannot overlap `src` because the caller has mutable access
// to `dst` while `src` is owned by this function.
unsafe {
intrinsics::write_via_move(dst, src)
}
}We would also like to specify that the value now pointed to by dst is equal to src. However, we can't currently do this because T does not necessarily implement PartialEq. Even with this limitation, one can at the very least assert ownership of some valid value of type T by the dst pointer when the function returns.
In the future, we could use snapshot equality from Prusti to express structural equality of two values of the same type in contracts, but adding this is out of scope of this proposal.
Alternatively, we may want to have specialised contracts based on trait bounds in the future to get around this issue.
Read-only permissions
How do we express read-only permissions, e.g. for ptr::read? This is an open question and out of scope of this proposal.
Semantics of owned and block
The biggest shift from existing contracts to also supporting fulminate-style contracts is that owned and block are not boolean predicates, and rather they assert ownership as an effect. Separating conjunction is analogous to fulminate's approach; since owned assertions are essentially computational effects, their separation is already built in, so simply executing the owned function asserts ownership. I.e. there is no need to return anything other than a pure (boolean) assertion from a contract. However, a failed ownership check should propagate and fail the contract. There are a few possibilities of concretely implementing this, which will be discussed in the "execution model" section.
It is worth noting that Fulminate supports a restricted fragment of separation logic, so it is likely that we won't be able to express all contracts that we would like using the proposed approach. However, by adding support for owned and block, we will be able to more easily identify how to bridge the gap towards supporting all the separation logic feature we realistically need.
Johannes Hostert has shown the approach is expressive enough to specify what it means to be a vector using separation logic, which makes the direction seem promising. As we write more contracts using owned and block, the limitation of the approach should become more apparent.
Assertions as resource passing
owned and block represent resources. Asserting them in the precondition indicates consumption of the resource by the current function, and asserting them in the postcondition indicates giving back the resources to the calling context.
For allocations, ownership of a new resource may be returned, and for deallocation, ownership of a resource may be consumed without return. Consuming a resource without return and deallocation indicates a resource (e.g. memory) leak, at least on the specification level.
owned
owned<T> is a typed points-to assertion from separation logic. It asserts that we have exclusive ownership of the pointed at memory, and returns the value of type T at that memory address. Because owned reads the pointed to memory, the memory must hold a valid value for the type T. (See uninitialised memory below as a special case of this), otherwise the assertion should fail.
Calling this with a zero-sized T is a memory no-op and the assertion will always succeed as long as T is inhabited.
If T is uninhabited, then the assertion should always fail, as there is no valid value that can be read.
Uninitialised memory
For (potentially) uninitialised memory owned::<MaybeUninit<T>> should be used. We leverage the semantics of MaybeUninit to assert ownership of a memory that we are not allowed to read (as reading it could cause UB).
block
block semantically represents a single heap allocation, with the pointer argument pointing to the start of the allocation. For arrays, a single
allocation may be capable of holding multiple contiguous values of type T, as such, block accepts both a type T and a count (in units of T) parameter.
For complete ownership, we would assert owned for every element in the array (initialised or not) individually, and additionally assert block once for the entire allocation. This allows us to keep track of the size of the allocation, as in general, we may at a given point only have ownership of a slice of the allocated memory.
Unlike owned, there is no requirement for the values in the block to be initialised. block does not read any memory, and does not return anything; it is just an assertion.
A count of 0 indicates no allocation. Assuming this is useful to avoid special cases, e.g. in empty vectors.
Because block represents a heap allocation, it also expresses the necessary permission to free that allocation. A heap allocation operation logically returns a block and (iterated) owned assertion in its postcondition, and a free operation logically consumes a block and (iterated) owned assertion in its precondition. This is true because the unit of free is the allocation. This gives nice intuitive symmetry on the semantics of block. Note in particular, that it is not sufficient to have just block or owned to free memory, rather we must have owned across the entire block, plus the block assertion.
Just like owned, the block permission is exclusive, i.e. cannot be freely duplicated. This should be fairly unsurprising, as it prevents double-frees on a specification level.
Pointers to stack variables would have owned but not block, as described by Bart Jacob's in his tutorial for Verifying purely unsafe Rust programs with VeriFast, as stack memory is not something explicitly passed to free.
Mixing raw pointers and references.
Functions that do not deal with raw pointers do not need to use owned or block in their specifications. However, many functions will transitively operate on raw points, i.e. function calls, and those require the correct ownership permissions in their preconditions, and as such must come from somewhere.
Functions that take &mut T references are guaranteed to have exclusive ownership of T upon function entry and exit, as such, we can assume an implicit owned(p as *const T) both in the precondition and postcondition, and this does not need to be manually specified by the contract author (although it may and is sound to do so). These implicit ownership resources can be used when calling functions that have explicit owned assertions in their requires and ensures clauses. Such functions must give back ownership upon returning, otherwise the calling function would not be able to keep its promise of returning the mutable borrow.
Take this simple (and again, unidiomatic) example:
#[requires(let init = owned(x); init < i32::MAX)]
#[ensures(move |_| { let v = owned(x); v == init + 1 })]
unsafe fn increment_raw(x: *mut i32) {
unsafe {
*x += 1;
}
}
#[requires(*x < i32::MAX)]
fn increment(x: &mut i32) {
// SAFETY: x is a reference, so we have exclusive ownership
unsafe { increment_raw(x as *mut _) };
}Here increment_raw requires the programmer to call to pass a pointer with ownership of an i32 so that the value can be updated in place. increment on the other hand, statically guarantees that the reference passed in as an argument has exclusive ownership of the i32. Ownership checking tools can use this to statically verify or dynamically check that the call to increment_raw is sound.
Note that there is currently no way to specify a safe implementation of increment without resorting to unsafe within the contract itself. This limitation should be addressed at some point in the future.
owned and block in Rust verification tools
To justify the addition of owned and block to the Rust's contract specification language, we describe how it can be directly used by Rust verification tools.
Only tools that support unsafe Rust are covered, as other tools do not need to concern themselves with pointer predicates.
Kani
In Kani, we have multiple predicates for reasoning about pointers.
They are weaker (logically) than the owned and block assertions, however, one can still translate stronger predicates into weaker ones. I.e. if you prove a contract with a weaker precondition, then you have proven a contract with a stronger precondition, as Hoare logic supports precondition strengthening.
-
owned->can_dereferencecan_dereferencerequires read and write permissions to a memory location. Having exclusive ownership of some location implies that we can read and write to it, so we can translate soundlyownedpredicates intocan_dereference. -
block->same_allocationThe correspondence here is a bit more tricky to reason about. Given we have a
block(ptr, size), then if pointersp1andp2are betweenptrandptr.offset(size), then they are definitely within the same allocation, asblockrepresents an entire allocation. Again, havingblockis stronger than just asserting thatp1andp2belong in the same allocation, as it also asserts the ownership (the right to free) of the block. -
A write-only permission (e.g. for uninitialised memory) would be required to implement the
can_writepredicate. This can be achieved by anowned::<MaybeUninit<T>>predicate call.
Gillian-Rust
-
owned-> typed points-to -
block-> (currently missing feature)
VeriFast
-
owned->points_to -
block->
alloc_blockchunkspred alloc_block(ptr: *u8; layout: Layout);
alloc_blockuses untyped pointers (*u8), and accepts an explicit layout parameter. The layout can be computed from the type ofblockand number of elements.Layout::from_size_align_(capacity * std::mem::size_of::<T>(), std::mem::align_of::<T>())
-
owned::<MaybeUninit<T>>->points_to_<T>
RefinedRust
owned-> ??? (I would appreciate the help from the authors of Refined Rust on identifying the correspondence. In general how useful/comptible wouldownedandblockbe with RefinedRust?)block->freeable
Execution model of owned and block
We propose that instead of implementing ownership tracking within Rust directly, we expose a way for interpreters (most notably Miri) to use the owned and block predicates. As such, during normal (contract) runtime execution, owned and block will be no-ops, and to avoid unsafely reading potentially uninitialised memory, they will simply return Error::CheckUnsupported. The contract author can then return early, most likely opting to skip the contract check (raise no pre/postcondition failure).
Currently, contracts permit arbitrary Rust code, including code with effects (such as calling assert! which can panic), as such adding support for an ownership effect is trivial (e.g. an ownership violation could panic at Miri runtime).
mod ownership {
type Result<T> = std::result::Result<T, Error>;
enum Error {
/// For when executing under a runtime that does not support runtime-ownership
/// tracking, such as the rustc compiler. If this error is returned to the
/// contract runtime, then the contract check will be marked as skipped
/// (i.e. will not cause a failure).
CheckUnsupported,
}
}When writing contracts, if the user propagates a CheckUnsupported we don't want the contract to fail, just because the runtime can't reasonably check its validity.
In order to make this feature ergonomic, we'd need to add a way for users to return early from contracts (currently the return keyword in requires returns form the entire function!), and ideally allow them to return a Result, so that the let v = owned(p)? pattern can be used. This does mean that the return type of requires and ensures clauses would have to be Result<bool>, so I'm open to suggestions on how to handle this to not make writing pure contracts cumbersome. We could:
- have a
try_requiresandtry_ensuresversion ofrequires/ensuresthat expectsResult<bool>instead ofboolas its result type. - do some preprocessing magic to wrap the final boolean expression in an
Ok. This is fine, asowned/blockassertions must come as statements preceding the final expression, so in practice the final boolean expression would always be wrapped inOk.
The alternative approach would be to have owned return the value pointed to value under rustc execution mode, however, this is likely undesirable, as it could lead to Undefined Behaviour from within a contract (e.g. if uninitialised memory is read). It could be argued that UB would likely happen anyway if we call a function without holding the proper ownership permissions. However, I would argue that owned sounds like an assertion of ownership, so that should be its primary semantics. And if we can't assert ownership, we should give up. Reading the value is secondary, and can always be done manually if desired, e.g. by overwriting the behaviour as follows:
fn owned_or_read<T>(ptr: *const T) -> T {
match owned(ptr) {
Ok(v) => v
Err(Error::CheckUnsupported) => unsafe { std::ptr::read(ptr) }
}
}As such, returning a Result gives the user more flexibility as to whether they want to continue executing a contract when ownership checks are unavailable or not.
ManuallyDrop
owned should return a ManuallyDrop<T> instead of plain T, as we probably don't want to call drop on values manipulated via a contract. This saves the contract user from having to manually wrap the values return from owned in ManuallyDrop::new.
Const functions
In order to specify functions marked as const (e.g. ptr::read as seen early), owned should also be marked as const.
There is no clear use case of having block marked as const, as heap allocations are not const-supported operations.
Function signatures
Putting the above together, we finally present the complete function signatures
for owned and block.
pub const fn owned<T>(ptr: *const T) -> Result<ManuallyDrop<T>>;
pub fn block<T>(ptr: *const T, count: usize) -> Result<()>;Miri implementation
Johannes Hostert has implemented a prototype ownership checker for contracts in Miri. It would expose Miri owned and block primitives to call, as well as precond and postcond marker functions, that need to be called upon entering requires and ensures respectively.
Contract lowering could automatically insert calls to precond and postcond, and the Miri owned and block primitives would be called when executed under Miri. We can check for Miri via cfg(miri), giving us the following possible implementation for owned:
pub const fn owned<T>(ptr: *const T) -> Result<ManuallyDrop<T>> {
#[cfg(miri)]
Ok(ManuallyDrop::new(miri_ownership::owned(ptr)))
#[cfg(not(miri))]
Err(Error::CheckUnsupported)
}Implementing it this way also means we can introduce owned and block to the specification language before the Miri ownership checking feature is ready for use.
owned and block outside of contracts
Since we don't differentiate between contract and non-contract code, potentially any function can be called within a contract. As such, it is fine to use owned and block from any Rust function.
However, we must decide what happens when a function is invoked that calls out to owned outside of a contract invocation context. Is this even detectable? If so, do we want to panic as to not encourage "regular" Rust code to call out to functions only meant for use in contracts?
panic vs Error for ownership violation
For simplicity, in the initial version of owned and block, the runtime ownership checker should panic if an ownership violation is detected.
If in the future we find a use case for requiring the ownership checker to return a Result, it can always be added in later. Especially since we are already using a Result for the case of unsupported runtime ownership checks, it will be as simple as adding an Error enum variant.
Conclusion
In summary, the concrete steps needed to implement this MCP:
- define
ownedandblockfunctions in thecore::contracts::ownershipnamespace. - adjust
requiresandensuresto handlecontracts::ownership::Errorand
permit early returns. - wire up
precond,postcond,ownedandblockfunctions from Miri once the
functionality is exposed.
Mentors or Reviewers
If you have a reviewer or mentor in mind for this work, mention them here. You can put your own name here if you are planning to mentor the work.
Process
The main points of the Major Change Process are as follows:
- File an issue describing the proposal.
- A compiler team member who is knowledgeable in the area can second by writing
@rustbot secondor kickoff a team FCP with@rfcbot fcp $RESOLUTION.- Refer to Proposals, Approvals and Stabilization docs for when a second is sufficient, or when a full team FCP is required.
- Once an MCP is seconded, the Final Comment Period begins.
- Final Comment Period lasts for 10 days after all outstanding concerns are solved.
- Outstanding concerns will block the Final Comment Period from finishing. Once all concerns are resolved, the 10 day countdown is restarted.
- If no concerns are raised after 10 days since the resolution of the last outstanding concern, the MCP is considered approved.
You can read more about Major Change Proposals on forge.