@@ -27,6 +27,7 @@ These docs are long. Search for the section you are interested in.
27 27 - Formal model
28 28 - Borrowing and loans
29 29 - Moves and initialization
30+ - Drop flags and structural fragments
30 31 - Future work
31 32
32 33 # Overview
@@ -1019,6 +1020,175 @@ walk back over, identify all uses, assignments, and captures, and
1019 1020 check that they are legal given the set of dataflow bits we have
1020 1021 computed for that program point.
1021 1022
1023+ # Drop flags and structural fragments
1024+
1025+ In addition to the job of enforcing memory safety, the borrow checker
1026+ code is also responsible for identifying the *structural fragments* of
1027+ data in the function, to support out-of-band dynamic drop flags
1028+ allocated on the stack. (For background, see [RFC PR #320].)
1029+
1030+ [RFC PR #320]: https://github.com/rust-lang/rfcs/pull/320
1031+
1032+ Semantically, each piece of data that has a destructor may need a
1033+ boolean flag to indicate whether or not its destructor has been run
1034+ yet. However, in many cases there is no need to actually maintain such
1035+ a flag: It can be apparent from the code itself that a given path is
1036+ always initialized (or always deinitialized) when control reaches the
1037+ end of its owner's scope, and thus we can unconditionally emit (or
1038+ not) the destructor invocation for that path.
1039+
1040+ A simple example of this is the following:
1041+
1042+ ```rust
1043+ struct D { p: int }
1044+ impl D { fn new(x: int) -> D { ... } }
1045+ impl Drop for D { ... }
1046+
1047+ fn foo(a: D, b: D, t: || -> bool) {
1048+     let c: D;
1049+     let d: D;
1050+     if t() { c = b; }
1051+ }
1052+ ```
1053+
1054+ At the end of the body of `foo`, the compiler knows that `a` is
1055+ initialized, which introduces a drop obligation (running the
1056+ destructor of `D`) at the end of `a`'s scope that is run unconditionally.
1057+ Likewise, the compiler knows that `d` is not initialized, and thus it
1058+ leaves out the drop code for `d`.
1059+
1060+ The compiler cannot statically know the drop-state of `b` or `c` at
1061+ the end of their scope, since that depends on the value of
1062+ `t`. Therefore, we need to insert boolean flags to track whether we
1063+ need to drop `b` and `c`.
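
As a rough illustration in present-day Rust syntax (the excerpt above predates Rust 1.0), the sketch below logs destructor calls so that the effect of the conditional move can be observed. The `name` and `log` fields, the `Log` alias, and the `run` helper are assumptions introduced for this example; they are not part of the original code.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Shared log of destructor calls; hypothetical, for observation only.
type Log = Rc<RefCell<Vec<&'static str>>>;

struct D {
    name: &'static str,
    log: Log,
}

impl Drop for D {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

#[allow(unused_variables, unused_assignments)]
fn foo(a: D, b: D, t: impl Fn() -> bool) {
    let c: D;
    let d: D; // never initialized, so no drop code is needed for it
    if t() {
        c = b; // `b` is moved into `c` only on this path
    }
    // At scope exit, whether `b` or `c` owns the value is known only at run time.
}

// Runs `foo` and returns the sorted names of the values whose destructors ran.
fn run(flag: bool) -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    foo(
        D { name: "a", log: log.clone() },
        D { name: "b", log: log.clone() },
        move || flag,
    );
    let mut names = log.borrow().clone();
    names.sort();
    names
}
```

Whichever branch is taken, each value's destructor runs exactly once; the compiler has to track at run time which of `b` and `c` currently owns the value.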
1064+
1065+ However, the matter is not as simple as just mapping local variables
1066+ to their corresponding drop flags when necessary. In particular, in
1067+ addition to being able to move data out of local variables, Rust
1068+ allows one to move values in and out of structured data.
1069+
1070+ Consider the following:
1071+
1072+ ```rust
1073+ struct S { x: D, y: D, z: D }
1074+
1075+ fn foo(a: S, mut b: S, t: || -> bool) {
1076+     let mut c: S;
1077+     let d: S;
1078+     let e: S = a.clone();
1079+     if t() {
1080+         c = b;
1081+         b.x = e.y;
1082+     }
1083+     if t() { c.y = D::new(4); }
1084+ }
1085+ ```
1086+
1087+ As before, the drop obligations of `a` and `d` can be statically
1088+ determined, and again the state of `b` and `c` depends on dynamic
1089+ state. But additionally, the dynamic drop obligations introduced by
1090+ `b` and `c` are not just per-local boolean flags. For example, if the
1091+ first call to `t` returns `false` and the second call `true`, then at
1092+ the end of their scope, `b` will be completely initialized, but only
1093+ `c.y` in `c` will be initialized. If both calls to `t` return `true`,
1094+ then at the end of their scope, `c` will be completely initialized,
1095+ but only `b.x` will be initialized in `b`, and only `e.x` and `e.z`
1096+ will be initialized in `e`.
1097+
1098+ Note that we need to cover the `z` field in each case in some way,
1099+ since it may (or may not) need to be dropped, even though `z` is never
1100+ directly mentioned in the body of the `foo` function. We call a path
1101+ like `b.z` a *fragment sibling* of `b.x`, since the field `z` comes
1102+ from the same structure `S` that declared the field `x` in `b.x`.
1103+
1104+ In general we need to maintain boolean flags that match the
1105+ `S`-structure of both `b` and `c`. In addition, we need to consult
1106+ such a flag when doing an assignment (such as `c.y = D::new(4);`
1107+ above), in order to know whether or not there is a previous value that
1108+ needs to be dropped before we do the assignment.
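
One way to picture the per-field flags is to model each field as an `Option`, where `Some`/`None` plays the role of the boolean. This is only an analogy written in present-day Rust, not a description of how the compiler lays out its flags; `FlaggedS`, `assign_y`, `demo`, and the `drops` counter are names invented for this sketch.

```rust
use std::cell::Cell;
use std::rc::Rc;

struct D {
    drops: Rc<Cell<usize>>, // hypothetical counter of destructor runs
}

impl Drop for D {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Each `Option` stands in for one field plus its drop flag:
// `Some` means "initialized, must be dropped"; `None` means "nothing to drop".
#[allow(dead_code)]
struct FlaggedS {
    x: Option<D>,
    y: Option<D>,
    z: Option<D>,
}

impl FlaggedS {
    // Models `c.y = value;`: if the flag says an old value exists,
    // it is dropped before the new value is stored.
    fn assign_y(&mut self, value: D) {
        if let Some(old) = self.y.take() {
            drop(old);
        }
        self.y = Some(value);
    }
}

// Returns (destructor runs right after the assignment, runs after scope exit).
fn demo(y_initialized: bool) -> (usize, usize) {
    let drops = Rc::new(Cell::new(0));
    let new_d = || D { drops: drops.clone() };
    let after_assign;
    {
        let mut c = FlaggedS {
            x: None,
            y: if y_initialized { Some(new_d()) } else { None },
            z: None,
        };
        c.assign_y(new_d());
        after_assign = drops.get();
    } // only the fields that are `Some` are dropped here
    (after_assign, drops.get())
}
```

When `y` was already initialized, the assignment drops the old value first; otherwise the only destructor run is the one at scope exit.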
1109+
1110+ So for any given function, we need to determine what flags are needed
1111+ to track its drop obligations. Our strategy for determining the set of
1112+ flags is to represent the fragmentation of the structure explicitly:
1113+ by starting initially from the paths that are explicitly mentioned in
1114+ moves and assignments (such as `b.x` and `c.y` above), and then
1115+ traversing the structure of the path's type to identify leftover
1116+ *unmoved fragments*: assigning into `c.y` means that `c.x` and `c.z`
1117+ are leftover unmoved fragments. Each fragment represents a drop
1118+ obligation that may need to be tracked. Paths that are only moved or
1119+ assigned in their entirety (like `a` and `d`) are treated as a single
1120+ drop obligation.
1121+
1122+ The fragment construction process works by piggy-backing on the
1123+ existing `move_data` module. We already have callbacks that visit each
1124+ direct move and assignment; these form the basis for the sets of
1125+ `moved_leaf_paths` and `assigned_leaf_paths`. From these leaves, we can
1126+ walk up their parent chain to identify all of their parent paths.
1127+ We need to identify the parents because of cases like the following:
1128+
1129+ ```rust
1130+ struct Pair<X,Y>{ x: X, y: Y }
1131+ fn foo(dd_d_d: Pair<Pair<Pair<D, D>, D>, D>) {
1132+     other_function(dd_d_d.x.y);
1133+ }
1134+ ```
1135+
1136+ In this code, the move of the path `dd_d_d.x.y` leaves behind not only
1137+ the fragment drop-obligation `dd_d_d.x.x` but also `dd_d_d.y`.
1138+
1139+ Once we have identified the directly-referenced leaves and their
1140+ parents, we compute the left-over fragments, in the function
1141+ `fragments::add_fragment_siblings`. As of this writing this works by
1142+ looking at each directly-moved or assigned path P, and blindly
1143+ gathering all sibling fields of P (as well as siblings for the parents
1144+ of P, etc). After accumulating all such siblings, we filter out the
1145+ entries added as siblings of P that turned out to be
1146+ directly-referenced paths (or parents of directly referenced paths)
1147+ themselves, thus leaving the never-referenced "left-overs" as the only
1148+ thing left from the gathering step.
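
The gather-then-filter steps described above can be sketched as follows. This is a toy model in present-day Rust and not the actual `fragments::add_fragment_siblings` code: paths are plain lists of names, and `Path`, `leftover_fragments`, `fields_of`, `pair_fields`, and `render` are hypothetical names introduced here. The `fields_of` callback is assumed to return the field names of the type found at a given path.

```rust
use std::collections::BTreeSet;

// A path such as `dd_d_d.x.y` is modeled as ["dd_d_d", "x", "y"].
type Path = Vec<&'static str>;

fn leftover_fragments(
    leaves: &[Path],
    fields_of: impl Fn(&[&'static str]) -> Vec<&'static str>,
) -> BTreeSet<Path> {
    // Step 1: the directly referenced leaves plus all of their parents.
    let mut referenced: BTreeSet<Path> = BTreeSet::new();
    for leaf in leaves {
        for len in 1..=leaf.len() {
            referenced.insert(leaf[..len].to_vec());
        }
    }

    // Step 2: blindly gather every field of each referenced path's parent.
    let mut gathered: BTreeSet<Path> = BTreeSet::new();
    for path in &referenced {
        if path.len() < 2 {
            continue; // a bare local variable has no siblings
        }
        let parent = &path[..path.len() - 1];
        for field in fields_of(parent) {
            let mut sibling = parent.to_vec();
            sibling.push(field);
            gathered.insert(sibling);
        }
    }

    // Step 3: filter out anything that was referenced itself.
    gathered.difference(&referenced).cloned().collect()
}

// Field names for the `Pair` example: every parent queried there is a `Pair`.
fn pair_fields(_parent: &[&'static str]) -> Vec<&'static str> {
    vec!["x", "y"]
}

// Renders each path in dotted form, in sorted order.
fn render(paths: &BTreeSet<Path>) -> Vec<String> {
    paths.iter().map(|p| p.join(".")).collect()
}
```

For the `Pair` example, calling `leftover_fragments(&[vec!["dd_d_d", "x", "y"]], pair_fields)` yields the two left-over paths `dd_d_d.x.x` and `dd_d_d.y`.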
1149+
1150+ ## Array structural fragments
1151+
1152+ A special case of the structural fragments discussed above is
1153+ the elements of an array that has been passed by value, such as
1154+ the following:
1155+
1156+ ```rust
1157+ fn foo(a: [D, ..10], i: uint) -> D {
1158+     a[i]
1159+ }
1160+ ```
1161+
1162+ The above code moves a single element out of the input array `a`.
1163+ The remainder of the array still needs to be dropped; i.e., it
1164+ is a structural fragment. Note that after performing such a move,
1165+ it is not legal to read from the array `a`. There are a number of
1166+ ways to deal with this, but the important thing to note is that
1167+ the semantics needs to distinguish in some manner between a
1168+ fragment that is the *entire* array versus a fragment that represents
1169+ all-but-one element of the array. A place where that distinction
1170+ would arise is the following:
1171+
1172+ ```rust
1173+ fn foo(a: [D, ..10], b: [D, ..10], i: uint, t: bool) -> D {
1174+     if t {
1175+         a[i]
1176+     } else {
1177+         b[i]
1178+     }
1179+
1180+     // When control exits, we will need either to drop all of `a`
1181+     // and all-but-one of `b`, or to drop all of `b` and all-but-one
1182+     // of `a`.
1183+ }
1184+ ```
1185+
1186+ There are a number of ways that the trans backend could choose to
1187+ compile this (e.g. a `[bool, ..10]` array for each such moved array;
1188+ or an `Option<uint>` for each moved array). From the viewpoint of the
1189+ borrow-checker, the important thing is to record what kind of fragment
1190+ is implied by the relevant moves.
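
A minimal sketch of that distinction, in present-day Rust, might look like the following. The `Fragment` enum and its variant names are hypothetical labels chosen for this illustration, and `indices_to_drop` shows only the `Option`-style encoding mentioned above (with `usize` in place of `uint`); neither is claimed to match what the compiler does.

```rust
// Hypothetical fragment kinds recorded for a path.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Fragment {
    // The entire value at this path still needs to be dropped.
    Just(&'static str),
    // Every element of the array at this path except one moved-out element.
    AllButOneFrom(&'static str),
}

// Models the two-array example: which kind of fragment each array
// represents at function exit, depending on the branch taken.
fn fragments_at_exit(t: bool) -> [Fragment; 2] {
    if t {
        [Fragment::AllButOneFrom("a"), Fragment::Just("b")]
    } else {
        [Fragment::Just("a"), Fragment::AllButOneFrom("b")]
    }
}

// One possible run-time encoding: `None` means nothing was moved out,
// `Some(i)` means element `i` was moved out and must be skipped.
fn indices_to_drop(len: usize, moved_out: Option<usize>) -> Vec<usize> {
    (0..len).filter(|&i| Some(i) != moved_out).collect()
}
```

With this encoding a single `Option<usize>` per array is enough for the example, since at most one element is moved out of each array.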
1191+
1022 1192 # Future work
1023 1193
1024 1194 While writing up these docs, I encountered some rules I believe to be