Pattern and Exhaustiveness Checking

In Rust, pattern matching and bindings have a few very helpful properties. The compiler will check that bindings are irrefutable when made and that match arms are exhaustive.

Pattern usefulness

The central question that usefulness checking answers is: "in this match expression, is that branch redundant?". More precisely, it boils down to computing whether, given a list of patterns we have already seen, a given new pattern might match any new value.

For example, in the following match expression, we ask in turn whether each pattern might match something that wasn't matched by the patterns above it. Here we see the 4th pattern is redundant with the 1st; that branch will get an "unreachable" warning. The 3rd pattern may or may not be useful, depending on whether Foo has other variants than Bar. Finally, we can ask whether the whole match is exhaustive by asking whether the wildcard pattern (_) is useful relative to the list of all the patterns in that match. Here we can see that _ is useful (it would catch (false, None)); this expression would therefore get a "non-exhaustive match" error.

#![allow(unused)]
fn main() {
// x: (bool, Option<Foo>)
match x {
    (true, _) => {} // 1
    (false, Some(Foo::Bar)) => {} // 2
    (false, Some(_)) => {} // 3
    (true, None) => {} // 4
}
}

Thus usefulness is used for two purposes: detecting unreachable code (which is useful to the user), and ensuring that matches are exhaustive (which is important for soundness, because a match expression can return a value).

Where it happens

This check is done anywhere you can write a pattern: match expressions, if let, let else, plain let, and function arguments.

#![allow(unused)]
fn main() {
// `match`
// Usefulness can detect unreachable branches and forbid non-exhaustive matches.
match foo() {
    Ok(x) => x,
    Err(_) => panic!(),
}

// `if let`
// Usefulness can detect unreachable branches.
if let Some(x) = foo() {
    // ...
}

// `while let`
// Usefulness can detect infinite loops and dead loops.
while let Some(x) = it.next() {
    // ...
}

// Destructuring `let`
// Usefulness can forbid non-exhaustive patterns.
let Foo::Bar(x, y) = foo();

// Destructuring function arguments
// Usefulness can forbid non-exhaustive patterns.
fn foo(Foo { x, y }: Foo) {
    // ...
}
}

The algorithm

Exhaustiveness checking is run before MIR building in check_match. It is implemented in the rustc_pattern_analysis crate, with the core of the algorithm in the usefulness module. That file contains a detailed description of the algorithm.

Important concepts

Constructors and fields

In the value Pair(Some(0), true), Pair is called the constructor of the value, and Some(0) and true are its fields. Every matchable value can be decomposed in this way. Examples of constructors are: Some, None, (,) (the 2-tuple constructor), Foo {..} (the constructor for a struct Foo), and 2 (the constructor for the number 2).

Each constructor takes a fixed number of fields; this is called its arity. Pair and (,) have arity 2, Some has arity 1, None and 42 have arity 0. Each type has a known set of constructors. Some types have many constructors (like u64) or even an infinitely many (like &str and &[T]).

Patterns are similar: Pair(Some(_), _) has constructor Pair and two fields. The difference is that we get some extra pattern-only constructors, namely: the wildcard _, variable bindings, integer ranges like 0..=10, and variable-length slices like [_, .., _]. We treat or-patterns separately.

Now to check if a value v matches a pattern p, we check if v's constructor matches p's constructor, then recursively compare their fields if necessary. A few representative examples:

  • matches!(v, _) := true
  • matches!((v0, v1), (p0, p1)) := matches!(v0, p0) && matches!(v1, p1)
  • matches!(Foo { a: v0, b: v1 }, Foo { a: p0, b: p1 }) := matches!(v0, p0) && matches!(v1, p1)
  • matches!(Ok(v0), Ok(p0)) := matches!(v0, p0)
  • matches!(Ok(v0), Err(p0)) := false (incompatible variants)
  • matches!(v, 1..=100) := matches!(v, 1) || ... || matches!(v, 100)
  • matches!([v0], [p0, .., p1]) := false (incompatible lengths)
  • matches!([v0, v1, v2], [p0, .., p1]) := matches!(v0, p0) && matches!(v2, p1)

This concept is absolutely central to pattern analysis. The constructor module provides functions to extract, list and manipulate constructors. This is a useful enough concept that variations of it can be found in other places of the compiler, like in the MIR-lowering of a match expression and in some clippy lints.

Constructor grouping and splitting

The pattern-only constructors (_, ranges and variable-length slices) each stand for a set of normal constructors, e.g. _: Option<T> stands for the set {None, Some} and [_, .., _] stands for the infinite set {[,], [,,], [,,,], ...} of the slice constructors of arity >= 2.

In order to manage these constructors, we keep them as grouped as possible. For example:

#![allow(unused)]
fn main() {
match (0, false) {
    (0 ..=100, true) => {}
    (50..=150, false) => {}
    (0 ..=200, _) => {}
}
}

In this example, all of 0, 1, .., 49 match the same arms, and thus can be treated as a group. In fact, in this match, the only ranges we need to consider are: 0..50, 50..=100, 101..=150,151..=200 and 201... Similarly:

#![allow(unused)]
fn main() {
enum Direction { North, South, East, West }
let wind = (Direction::North, 0u8);
match wind {
    (Direction::North, 50..) => {}
    (_, _) => {}
}
}

Here we can treat all the non-North constructors as a group, giving us only two cases to handle: North, and everything else.

This is called "constructor splitting" and is crucial to having exhaustiveness run in reasonable time.

Usefulness vs reachability in the presence of empty types

This is likely the subtlest aspect of exhaustiveness. To be fully precise, a match doesn't operate on a value, it operates on a place. In certain unsafe circumstances, it is possible for a place to not contain valid data for its type. This has subtle consequences for empty types. Take the following:

#![allow(unused)]
fn main() {
enum Void {}
let x: u8 = 0;
let ptr: *const Void = &x as *const u8 as *const Void;
unsafe {
    match *ptr {
        _ => println!("Reachable!"),
    }
}
}

In this example, ptr is a valid pointer pointing to a place with invalid data. The _ pattern does not look at the contents of the place *ptr, so this code is ok and the arm is taken. In other words, despite the place we are inspecting being of type Void, there is a reachable arm. If the arm had a binding however:

#![allow(unused)]
fn main() {
#[derive(Copy, Clone)]
enum Void {}
let x: u8 = 0;
let ptr: *const Void = &x as *const u8 as *const Void;
unsafe {
match *ptr {
    _a => println!("Unreachable!"),
}
}
}

Here the binding loads the value of type Void from the *ptr place. In this example, this causes UB since the data is not valid. In the general case, this asserts validity of the data at *ptr. Either way, this arm will never be taken.

Finally, let's consider the empty match match *ptr {}. If we consider this exhaustive, then having invalid data at *ptr is invalid. In other words, the empty match is semantically equivalent to the _a => ... match. In the interest of explicitness, we prefer the case with an arm, hence we won't tell the user to remove the _a arm. In other words, the _a arm is unreachable yet not redundant. This is why we lint on redundant arms rather than unreachable arms, despite the fact that the lint says "unreachable".

These considerations only affects certain places, namely those that can contain non-valid data without UB. These are: pointer dereferences, reference dereferences, and union field accesses. We track during exhaustiveness checking whether a given place is known to contain valid data.

Having said all that, the current implementation of exhaustiveness checking does not follow the above considerations. On stable, empty types are for the most part treated as non-empty. The exhaustive_patterns feature errs on the other end: it allows omitting arms that could be reachable in unsafe situations. The never_patterns experimental feature aims to fix this and permit the correct behavior of empty types in patterns.