介绍 Rust 的 Borrow Checker 的原理。
前期知识:High-level Compiler Architecture
Queries: demand-driven compilation
正在从 pass-based 转变为 demand-driven 模式:
Instead of entirely independent passes (parsing, type-checking, etc.), a set of function-like queries compute information about the input source. For example, there is a query called
type_of
that, given theDefId
of some item, will compute the type of that item and return it to you.
上面的这些 query 是可以被记忆化的,所以在第一次被计算后,剩余的查询就可以从一个 hash table 中被检索出来。这对 Incremental Computation 是非常友好的。
最终,we want the entire compiler control-flow to be query driven. 也就是对于每个 crate,会运行一个 top-level 的 query 即 compile
。这会链式地触发后续的各种计算,比如:
- The compile query might demand to get a list of codegen-units,比如需要被 LLVM 编译的模块列表
- 但为了计算这些 codegen-units 就需要使用一个 subquery 计算 Rust 源码中定义的 module 列表
- 这个 subquery 就需要触发 HIR 的计算
- This keeps going further and further back until we wind up doing the actual parsing.
How the compiler executes a query
Providers
If, however, the query is not in the cache, then the compiler will try to find a suitable provider. A provider is a function that has been defined and linked into the compiler somewhere that contains the code to compute the result of the query.
How providers are setup
Memory Management in Rustc
前期知识:Source Code Representation
Intermediate representations 综述
Instead most compilers, including rustc, build some sort of IR out of the source code which is easier to analyze. rustc has a few IRs, each optimized for different purposes:
- Token stream: the lexer produces a stream of tokens directly from the source code. This stream of tokens is easier for the parser to deal with than raw text.
- Abstract Syntax Tree (AST): the abstract syntax tree is built from the stream of tokens produced by the lexer. It represents pretty much exactly what the user wrote. It helps to do some syntactic sanity checking (e.g. checking that a type is expected where the user wrote one).
- High-level IR (HIR): This is a sort of desugared AST. It’s still close to what the user wrote syntactically, but it includes some implicit things such as some elided lifetimes, etc. This IR is amenable to type checking.
- Typed HIR (THIR) formerly High-level Abstract IR (HAIR): This is an intermediate between HIR and MIR. It is like the HIR but it is fully typed and a bit more desugared,比如方法调用和隐式解引用都会被显式化. As a result, it is easier to lower to MIR from THIR than from HIR.
- Middle-level IR (MIR): This IR is basically a Control-Flow Graph (CFG). A CFG is a type of diagram that shows the basic blocks of a program and how control flow can go between them. Likewise, MIR also has a bunch of basic blocks with simple typed statements inside them (e.g. assignment, simple computations, etc) and control flow edges to other basic blocks (e.g., calls, dropping values). MIR is used for borrow checking and other important dataflow-based checks, such as checking for uninitialized values. It is also used for a series of optimizations and for constant evaluation (via MIRI). Because MIR is still generic, we can do a lot of analyses here more efficiently than after monomorphization.
- LLVM-IR: This is the standard form of all input to the LLVM compiler. LLVM-IR is a sort of typed assembly language with lots of annotations. It’s a standard format that is used by all compilers that use LLVM (e.g. the clang C compiler also outputs LLVM-IR). LLVM-IR is designed to be easy for other compilers to emit and also rich enough for LLVM to run a bunch of optimizations on it.
HIR
HIR – “High-Level Intermediate Representation”,是编译期友好的 AST。只会进行 parse、宏展开和 name resolution 的转化。
可以通过第一行的语句得到 HIR 表示,通过第二行的语句得到更为接近原文的 HIR 表示。
1 | cargo rustc -- -Z unpretty=hir-tree |
HIR Bodies
A rustc_hir::Body
represents some kind of executable code, such as the body of a function/closure or the definition of a constant. Bodies are associated with an owner, which is typically some kind of item (e.g. an fn()
or const
), but could also be a closure expression (e.g. |x, y| x + y
). You can use the HIR map to find the body associated with a given def-id (maybe_body_owned_by) or to find the owner of a body (body_owner_def_id).
THIR
THIR 也就是 Typed High-Level Intermediate Representation,从前叫 “High-Level Abstract IR。它在 type checking 后生成,被用来构造 MIR,exhaustiveness checking,以及 unsafety checking。
THIR 在 HIR 更下层。在 type checking 完成后,就能填入所有的 type。HIR 具有下面的特性:
- 类似于 MIR,THIR 只表示 “bodies”。其中包含 function bodies,const initializers 等。换句话说,THIR 中没有 struct 或者 trait 的表示。
- THIR 的 body 只是临时被存储,并且在不需要的时候就会被 drop 掉。对应的,HIR 的会存储到编译过程的结束。
- THIR 会有更多的 desugar。比如 automatic references and dereferences 会变得显式。method calls 和 overloaded operators 会转换为 plain function call。Destruction scopes 会显式。
这个我理解是因为 THIR 中已经没有 struct 了。 - Statements、expressions、match arms 会分开存储。
The THIR lives in rustc_mir_build::thir
. To construct a thir::Expr
, you can use the thir_body
function, passing in the memory arena where the THIR will be allocated. Dropping this arena will result in the THIR being destroyed, which is useful to keep peak memory in check. Having a THIR representation of all bodies of a crate in memory at the same time would be very heavy.
You can get a debug representation of the THIR by passing the -Zunpretty=thir-tree
flag to rustc.
下面的代码
1 | fn main() { |
对应的 THIR
1 | Thir { |
Control-flow Graph (CFG)
A control-flow graph is structured as a set of basic blocks connected by edges. The key idea of a basic block is that it is a set of statements that execute “together” – that is, whenever you branch to a basic block, you start at the first statement and then execute all the remainder. Only at the end of the block is there the possibility of branching to more than one place (in MIR, we call that final statement the terminator):
1 | bb0: { |
总而言之,basic block 是一个执行的整体。在 block 内部,不会有 branching。可以参考 Basic block placement 这个章节。
MIR
MIR is Rust’s Mid-level Intermediate Representation. It is constructed from HIR. MIR was introduced in RFC 1211. It is a radically simplified form of Rust that is used for certain flow-sensitive safety checks – notably the borrow checker! – and also for optimization and code generation.
Key MIR vocabulary
This section introduces the key concepts of MIR, summarized here:
- Basic blocks
见上文对 Basic block 的说明。 - Locals
Memory locations allocated on the stack (conceptually, at least), such as function arguments, local variables, and temporaries.
These are identified by an index, written with a leading underscore, like_1
. There is also a special “local” (_0
) allocated to store the return value. - Places: expressions that identify a location in memory, like
_1
or_1.f
. - Rvalues: expressions that produce a value. The “R” stands for the fact that these are the “right-hand side” of an assignment.
- Operands: the arguments to an rvalue, which can either be a constant (like 22) or a place (like
_1
).
- Operands: the arguments to an rvalue, which can either be a constant (like 22) or a place (like
Some statements like StorageLive are removed in optimization. This happens because the compiler notices the value is never accessed in the code. 可以通过 rustc [filename].rs -Z mir-opt-level=0 --emit mir
显示没有被优化过的 MIR。
一个样例
1 | fn main() { |
下面是 MIR
1 | // WARNING: This output format is intended for human consumers only |
debug vec => _1;
提供了 debug 信息。
StorageLive(_1);
表示 variable _1
is “live” 的,也就是稍后还会被使用,直到遇到一个 StorageDead(_1)
。这些标记被 LLVM 来分配栈空间。
<Place> = <Rvalue>
这样的是赋值语句。
- A place is an expression like
_3
,_3.f
or*_3
– it denotes a location in memory. - An Rvalue is an expression that creates a value: in this case, the rvalue is a mutable borrow expression, which looks like
&mut <Place>
. So we can kind of define a grammar for rvalues like so:1
2
3
4
5
6
7
8<Rvalue> = & (mut)? <Place>
| <Operand> + <Operand>
| <Operand> - <Operand>
| ...
<Operand> = Constant
| copy Place
| move Place
When you use a place, we indicate whether we are copying it (which requires that the place have a type T where T: Copy) or moving it (which works for a place of any type).
有关 Rust 的类型系统及其 Analysis
ty 模块
ty::Ty
The specific Ty we are referring to is rustc_middle::ty::Ty
(and not rustc_hir::Ty
). The distinction is important, so we will discuss it first before going into the details of ty::Ty
.
In contrast, ty::Ty
represents the semantics of a type, that is, the meaning of what the user wrote. For example, rustc_hir::Ty
would record the fact that a user used the name u32
twice in their program, but the ty::Ty
would record the fact that both usages refer to the same type.
Demo
1 | fn foo(x: u32) → u32 { x } |
In this function, we see that u32
appears twice. We know that that is the same type, i.e. the function takes an argument and returns an argument of the same type, but from the point of view of the HIR, there would be two distinct type instances because these are occurring in two different places in the program. That is, they have two different Spans (locations).
1 | fn foo(x: &u32) -> &u32 |
进一步的,HIR 可能会丢弃一些信息。比如 &u32
是一个 incomplete 的类型,因为它还缺少一个 lifetime。但我们并不需要写这些 lifetime,因为一些 elision rules 的缘故。其最终的表示类似于 fn foo<'a>(x: &'a u32) -> &'a u32
。
在 HIR 级别,这样的表示并没有被生成,所以我们可以说类型是 incomplete 的。但是在 ty::Ty
级别,这些信息会被补足,所以现在类型是 complete 了。进一步的,对于每一个类型,只会有一个 ty::Ty
。比如一个 u32 的 ty::Ty
会在整个程序中都被 share。这不同于 rustc_hir::Ty
。
Order
HIR is built directly from the AST, so it happens before any ty::Ty
is produced. After HIR is built, some basic type inference and type checking is done. During the type inference, we figure out what the ty::Ty
of everything is and we also check if the type of something is ambiguous. The ty::Ty
is then used for type checking while making sure everything has the expected type.
The hir_ty_lowering
module is where the code responsible for lowering a rustc_hir::Ty
to a ty::Ty
is located. The main routine used is lower_ty
.
How semantics drive the two instances of Ty
从类型推断的观点来说,HIR 去对类型进行更少的假设。我们假设两个类型是不同的,除非随后它们被证明是相同的。换句话说,知道的越少,假设就越少。
考虑 fn foo<T>(x: T) -> u32
. 考虑调用了 foo::<u32>(0)
. 此时,T 和 u32 最终都是同一个类型,所以最终使用同一个 ty::Ty
,但 rustc_hir::Ty 还是不同的。当然这个例子有点过于简单了,因为在 type checking 的时候,会 check the function generically and would still have a T distinct from u32。在后续的 code generation 的时候,才会进行 monomorphized,也就是对于泛型函数的每个版本生成对应的替换掉泛型变量的函数。
ty::Ty implementation
rustc_middle::ty::Ty
is actually a wrapper around Interned<WithCachedTypeInfo<TyKind>>
. Interned 可以忽略,它还起到一个指针的作用,反正解引用也可以被折叠。TyKind
is a big enum with variants to represent many different Rust types,比如原始类型、引用、ADT、泛型以及 lifetime 等。 WithCachedTypeInfo
has a few cached values like flags and outer_exclusive_binder
. They are convenient hacks for efficiency and summarize information about the type that we may want to know, but they don’t come into the picture as much here.
Allocating and working with types
To allocate a new type, you can use the various new_* methods defined on Ty. These have names that correspond mostly to the various kinds of types. For example:
1 | let array_ty = Ty::new_array_with_const_len(tcx, ty, count); |
类似的方法返回一个 Ty<'tcx>
。注意获得的 lifetime 是 tctx 所访问的哪个 arena 的 lifetime。Types are always canonicalized and interned (so we never allocate exactly the same type twice).
Comparing types
ty::TyKind Variants
ADTs Representation
Bound vars and parameters
Type inference
下面代码中的 things
的类型被推断为 Vec<&str>
,因为我们往 things 中加入了 &str
。
1 | fn main() { |
The type inference is based on the standard Hindley-Milner (HM) type inference algorithm, but extended in various way to accommodate subtyping, region inference, and higher-ranked types.
Inference variables
inference context 的主要目的是容纳一系列的 inference variable。这些表示那些具体值还没有被确定的 type 或者 region。这些值会在 type-checking 的时候被计算得到。
如果了解 HM 类型系统或者像 Prolog 的逻辑语言就能理解类似的概念。
All told, the inference context stores five kinds of inference variables (as of March 2023):
inference context 存放 5 种 inference variable:
- Type variables, which come in three varieties:
- General type variables (the most common). These can be unified with any type.
- Integral type variables, which can only be unified with an integral type, and arise from an integer literal expression like
22
. - Float type variables, which can only be unified with a float type, and arise from a float literal expression like
22.0
.
- Region variables, which represent lifetimes, and arise all over the place.
- Const variables, which represent constants.
All the type variables work in much the same way: you can create a new type variable, and what you get is Ty<'tcx>
representing an unresolved type ?T
. Then later you can apply the various operations that the inferencer supports, such as equality or subtyping, and it will possibly instantiate (or bind) that ?T
to a specific value as a result.
对于 Region variable 情况不同,会在稍后 Region constraints 中讨论。
补充说明:lexical region 和 non-lexical region
如下所示,在 non-lexical lifetime 出现之前,下面的代码会编译失败。
1 | fn main() { |
编译报错如下,当然我发现现如今的 rust 已经无法复现了
1 | error[E0502]: cannot borrow `scores` as mutable because it is also borrowed as immutable |
这里报错的原因是 score 是通过 lexical 的方式 borrow 的 scores 的。
Region constraints
Regions are inferenced somewhat differently from types. Rather than eagerly unifying things, we simply collect constraints as we go, but make (almost) no attempt to solve regions. These constraints have the form of an “outlives” constraint:
1 | 'a: 'b |
实际上这个代码将 'a
和 'b
视作了 subregion 的关系,但实际上是一个意思
1 | 'b <= 'a |
(There are various other kinds of constraints, such as “verifys”; see the region_constraints
module for details.)
但是依然有一个常见,我们会做一些 eager unification。也就是如果有一个 equality constraint between two regions,如
1 | 'a = 'b |
那么我们就会将这个事实记录在一个 unification table 中。可以使用 opportunistic_resolve_var
to convert 'b
to 'a
,或者反过来也可以. This is sometimes needed to ensure termination of fixed-point algorithms.
Solving region constraints
Region constraints are only solved at the very end of typechecking, once all other constraints are known and all other obligations have been proven. There are two ways to solve region constraints right now: lexical and non-lexical. Eventually there will only be one.
An exception here is the leak-check which is used during trait solving and relies on region constraints containing higher-ranked regions. Region constraints in the root universe (i.e. not arising from a for<'a>
) must not influence the trait system, as these regions are all erased during codegen.
To solve lexical region constraints, you invoke resolve_regions_and_report_errors
. This “closes” the region constraint process and invokes the lexical_region_resolve
code. Once this is done, any further attempt to equate or create a subtyping relationship will yield an ICE.
The NLL solver (actually, the MIR type-checker) does things slightly differently. It uses canonical queries for trait solving which use take_and_reset_region_constraints
at the end. This extracts all of the outlives constraints added during the canonical query. This is required as the NLL solver must not only know what regions outlive each other, but also where. Finally, the NLL solver invokes take_region_var_origins
, providing all region variables to the solver.
Lexical region resolution
Lexical region resolution is done by initially assigning each region variable to an empty value. We then process each outlives constraint repeatedly, growing region variables until a fixed-point is reached. Region variables can be grown using a least-upper-bound relation on the region lattice in a fairly straightforward fashion.
https://internals.rust-lang.org/t/how-does-region-inference-work/7511/3
1 | Constraints | Ordering | Region-lattice |
有关 Borrow checker
The borrow checker operates on the MIR. An older implementation operated on the HIR. Doing borrow checking on MIR has several advantages:
- The MIR is far less complex than the HIR; the radical desugaring helps prevent bugs in the borrow checker. (If you’re curious, you can see a list of bugs that the MIR-based borrow checker fixes here.)
- Even more importantly, using the MIR enables “non-lexical lifetimes”, which are regions derived from the control-flow graph.
Tracking moves and initialization
其作用如下,检查哪些变量是 uninitialized 的状态。
1 | fn foo() { |
因为 Rust 现在允许只 move 一个 field 比如 a.0
了,所以 trace local variable 是不够的。Rust 根据 move path 为粒度去 trace。
A MovePath represents some location that the user can initialize, move, etc. So e.g. there is a move-path representing the local variable a, and there is a move-path representing a.0. Move paths roughly correspond to the concept of a Place from MIR, but they are indexed in ways that enable us to do move analysis more efficiently.
所有的 MovePath
存储在一个 vector 中,我们通过 MovePathIndex
去访问。
One of the first things we do in the MIR borrow check is to construct the set of move paths. This is done as part of the MoveData::gather_moves
function. This function uses a MIR visitor called Gatherer
to walk the MIR and look at how each Place
within is accessed. For each such Place
, it constructs a corresponding MovePathIndex
. It also records when/where that particular move path is moved/initialized, but we’ll get to that in a later section.
We don’t actually create a move-path for every Place
that gets used. In particular, if it is illegal to move from a Place
, then there is no need for a MovePathIndex
. Some examples:
- You cannot move from a static variable, so we do not create a
MovePathIndex
for static variables. - You cannot move an individual element of an array, so if we have e.g.
foo: [String; 3]
, there would be no move-path forfoo[1]
. - You cannot move from inside of a borrowed reference, so if we have e.g.
foo: &String
, there would be no move-path for*foo
.
These rules are enforced by the move_path_for
function, which converts a Place
into a MovePathIndex
。在诸如上面的错误的场景下,返回错误 Err
。这也说明了我们并不需要 track 这些 Place
是否已经 initialized 了,从而减少了开销.
If you have a Place
and you would like to convert it to a MovePathIndex
, you can do that using the MovePathLookup
structure found in the rev_lookup
field of MoveData
. There are two different methods:
find_local
, which takes amir::Local
representing a local variable. This is the easier method, because we always create aMovePathIndex
for every local variable.find
, 可以处理任意的Place
。所以,只会返回一个LookupResult
,表示最近的 path。例如对foo[1]
返回foo
。
As we noted above, move-paths are stored in a big vector and referenced via their MovePathIndex
. 但是在这个 vector 中,它们也被构建为一棵树。例如 if you have the MovePathIndex
for a.b.c
, you can go to its parent move-path a.b
. 也可以遍历所有的 child path。比如对于 a.b
, you might iterate to find the path a.b.c
(here you are iterating just over the paths that are actually referenced in the source, not all possible paths that could have been referenced). These references are used for example in the find_in_move_path_or_its_descendants
function, which determines whether a move-path (e.g., a.b
) or any child of that move-path (e.g.,a.b.c
) matches a given predicate.
The MIR type-check
A key component of the borrow check is the MIR type-check. This check walks the MIR and does a complete “type check” – the same kind you might find in any other language. In the process of doing this type-check, we also uncover the region constraints that apply to the program.
User types
在 MIR type check 的开始,we replace all regions in the body with new unconstrained regions. However, this would cause us to accept the following program:
1 | fn foo<'a>(x: &'a u32) { |
By erasing the lifetimes in the type of y
we no longer know that it is supposed to be 'static
, ignoring the intentions of the user.
To deal with this we remember all places where the user explicitly mentioned a type during HIR type-check as CanonicalUserTypeAnnotations
.
There are two different annotations we care about:
- Explicit type ascriptions, 比如
let y: &'static u32
会产生UserType::Ty(&'static u32)
. - Explicit generic arguments, 比如
x.foo<&'a u32, Vec<String>>
会产生UserType::TypeOf(foo_def_id, [&'a u32, Vec<String>])
.
Drop Check
Implicit drop
通常,只要 local 被使用,就必须要 local 的 type 是 well-formed 的。This includes proving the where-bounds of the local and also requires all regions used by it to be live.
唯一的特例是在 value go out of scope 的时候,隐式 drop 掉 value,这不需要 value 是 live 的。
如下所示,x 在注释处已经 out of scope 了,并且这是在指向 y 的引用被 invalidate 之后。也就是说在 drop x
的时候,它的类型不是 well-formed 的。但这是个特例,实际上也是唯一 drop value 操作不需要访问任何 dead region 的情况。We check this by requiring the type of the value to be drop-live. The requirements for which are computed in fn dropck_outlives
.
1 | fn main() { |
How values are dropped
At its core, a value of type T is dropped by executing its “drop glue”. Drop glue is compiler generated and first calls <T as Drop>::drop
and then recursively calls the drop glue of any recursively owned values.
- If T has an explicit Drop impl, call
<T as Drop>::drop
. - Regardless of whether
T
implementsDrop
, recurse into all values owned by T:- references, raw pointers, function pointers, function items, trait objects1, and scalars do not own anything.
对于 trait object,可以认为它有一个内置的 Drop 实现,该实现会直接调用 vtable 中的drop_in_place
。这个 Drop 实现需要所有它所有的 generic parameter 都是 alive 的。 - tuples, slices, and arrays consider their elements to be owned. For arrays of length zero we do not own any value of the element type.
- all fields (of all variants) of ADTs are considered owned. We consider all variants for enums.
The exception here isManuallyDrop<U>
which is not considered to own U.
PhantomData<U>
also does not own anything. - closures and generators own their captured upvars.
- references, raw pointers, function pointers, function items, trait objects1, and scalars do not own anything.
可以通过 fn Ty::needs_drop
判断是否一个类型是否有 drop glue。
Partially dropping a local
如果一个 type 没有实现 Drop,就可以在 drop 掉剩下的成员前 move 掉一些其他的成员。此时,只有那些没有被 move 的成员会被触发 drop glue。
1 | struct PrintOnDrop<'a>(&'a str); |
但是如果遇到下面的代码,则会报错 cannot move out of type Tup<'_>, which implements the Drop trait
。
1 | struct Tup<'a> { |
During MIR building we assume that a local may get dropped whenever it goes out of scope as long as its type needs drop.
Computing the exact drop glue for a variable happens after borrowck in the ElaborateDrops
pass. 也就是说,即使 local 中的一些成员之前已经被 drop 了,dropck 依然需要这些 value 是 alive 的。
如下所示,完全 move 了 local 的情况下也是这样。x
borrow 了 temp,然后被 drop 了。但依然会有下面的报错。
1 | fn main() { |
dropck_outlives
There are two distinct “liveness” computations that we perform:
- a value v is use-live at location L if it may be “used” later; a use here is basically anything that is not a drop
- a value v is drop-live at location L if it maybe dropped later
When things are use-live
, their entire type must be valid at L.
When they are drop-live
, all regions that are required by dropck must be valid at L. The values dropped in the MIR are places.
Region inference (Non-Lexical Lifetime, NLL)
The MIR-based region checking code is located in the rustc_mir::borrow_check
module.
The MIR-based region analysis consists of two major functions:
replace_regions_in_mir
, invoked first, has two jobs:- First, it finds the set of regions that appear within the signature of the function (e.g.,
'a
infn foo<'a>(&'a u32) { ... }
). These are called the “universal” or “free” regions – in particular, they are the regions that appear free in the function body. - Second, it replaces all the regions from the function body with fresh inference variables. This is because (presently) those regions are the results of lexical region inference and hence are not of much interest. The intention is that – eventually – they will be “erased regions” (i.e., no information at all), since we won’t be doing lexical region inference at all.
- First, it finds the set of regions that appear within the signature of the function (e.g.,
compute_regions
, invoked second: this is given as argument the results of move analysis. It has the job of computing values for all the inference variables thatreplace_regions_in_mir
introduced.- To do that, it first runs the MIR type checker. This is basically a normal type-checker but specialized to MIR, which is much simpler than full Rust, of course. Running the MIR type checker will however create various constraints between region variables, indicating their potential values and relationships to one another.
- After this, we perform constraint propagation by creating a
RegionInferenceContext
and invoking its solve method. - The NLL RFC also includes fairly thorough (and hopefully readable) coverage.
Universal regions
Reference
- https://rustc-dev-guide.rust-lang.org/appendix/background.html
编译器的一些基础知识。 - https://rustc-dev-guide.rust-lang.org/hir.html
HIR。 - https://rustc-dev-guide.rust-lang.org/thir.html
THIR。 - https://rustc-dev-guide.rust-lang.org/ty.html
关于 rust 类型系统的介绍。 - https://blog.logrocket.com/introducing-the-rust-borrow-checker/
- https://rustc-dev-guide.rust-lang.org/borrow_check.html
Rust Compiler Development Guide 上的讲解 - https://www.zybuluo.com/darwin-yuan/note/424724
Hindley-Milner类型系统