ADTs and Generic Arguments
The term ADT stands for "algebraic data type". In Rust, this refers to a struct, enum, or union.
ADT Representation
Let's consider the example of a type like MyStruct<u32>, where MyStruct is defined like so:
struct MyStruct<T> { x: u8, y: T }
The type MyStruct<u32> would be an instance of TyKind::Adt:
Adt(&'tcx AdtDef, GenericArgs<'tcx>)
//  ------------  -----------------
//  (1)           (2)
//
// (1) represents the `MyStruct` part
// (2) represents the `<u32>`, or "substitutions" / generic arguments
There are two parts:
- The AdtDef references the struct/enum/union but without the values for its type parameters. In our example, this is the MyStruct part without the argument u32. (Note that in the HIR, structs, enums and unions are represented differently, but in ty::Ty, they are all represented using TyKind::Adt.)
- The GenericArgs is a list of values that are to be substituted for the generic parameters. In our example of MyStruct<u32>, we would end up with a list like [u32]. We’ll dig more into generics and substitutions in a little bit.
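To make the two-part shape concrete, here is a minimal toy model. It is only a sketch: the AdtDef and Ty types below are simplified stand-ins written for this example, not rustc's real definitions, and render is a hypothetical helper.

```rust
// Toy model only: these types mimic the *shape* of `TyKind::Adt`.
// They are not the compiler's actual definitions.

/// Stand-in for `AdtDef`: the definition itself, with no generic arguments.
#[derive(Debug, PartialEq)]
struct AdtDef {
    name: &'static str,
}

/// Stand-in for `ty::Ty`.
#[derive(Debug, Clone, PartialEq)]
enum Ty<'tcx> {
    U8,
    U32,
    /// (1) the definition, (2) the generic arguments.
    Adt(&'tcx AdtDef, Vec<Ty<'tcx>>),
}

/// Hypothetical helper: renders a type the way it is written in source code.
fn render(ty: &Ty<'_>) -> String {
    match ty {
        Ty::U8 => "u8".to_string(),
        Ty::U32 => "u32".to_string(),
        Ty::Adt(def, args) if args.is_empty() => def.name.to_string(),
        Ty::Adt(def, args) => {
            let args: Vec<String> = args.iter().map(render).collect();
            format!("{}<{}>", def.name, args.join(", "))
        }
    }
}
```

In this model, MyStruct<u32> is Ty::Adt(&def, vec![Ty::U32]), where def is the single AdtDef value for MyStruct. Every instantiation of MyStruct borrows that same definition and differs only in its argument list.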
AdtDef and DefId
For every type defined in the source code, there is a unique DefId (see this
chapter). This includes ADTs and generics. In the MyStruct<T>
definition we gave above, there are two DefIds: one for MyStruct and one for T. Notice that
the code above does not generate a new DefId for u32 because it is not defined in that code (it
is only referenced).
AdtDef is more or less a wrapper around DefId with lots of useful helper methods. There is
essentially a one-to-one relationship between AdtDef and DefId. You can get the AdtDef for a
DefId with the tcx.adt_def(def_id) query. AdtDefs are all interned, as shown
by the 'tcx lifetime.
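The one-to-one relationship can be pictured with a small toy model. This is a sketch under stated assumptions: Tcx is a hypothetical stand-in for the real type context, its table is filled in by hand rather than computed by a query, and the lookup returns an Option purely for illustration.

```rust
use std::collections::HashMap;

// Toy model only: `Tcx`, `DefId`, and `AdtDef` are simplified stand-ins.

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct DefId(u32);

#[derive(Debug)]
struct AdtDef {
    did: DefId,
    name: &'static str,
}

/// Hypothetical stand-in for the type context that owns all definitions.
struct Tcx {
    adt_defs: HashMap<DefId, AdtDef>,
}

impl Tcx {
    /// Exactly one `AdtDef` exists per ADT `DefId`, and every caller gets a
    /// reference to that same value. `None` models a `DefId` that does not
    /// name an ADT (such as the `DefId` of the type parameter `T`).
    fn adt_def(&self, def_id: DefId) -> Option<&AdtDef> {
        self.adt_defs.get(&def_id)
    }
}
```

Looking up the same DefId twice yields references to the same AdtDef, which is the property that interning provides in the compiler.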
Question: Why not substitute “inside” the AdtDef?
Recall that we represent a generic struct with (AdtDef, args). So why bother with this scheme?
Well, the alternate way we could have chosen to represent types would be to always create a new,
fully-substituted form of the AdtDef where all the types are already substituted. This seems like
less of a hassle. However, the (AdtDef, args) scheme has some advantages over this.
First, the (AdtDef, args) scheme has an efficiency win:
struct MyStruct<T> {
    // ... 100s of fields ...
}
// Want to do: MyStruct<A> ==> MyStruct<B>
In an example like this, we can instantiate MyStruct<A> as MyStruct<B> (and so on) very cheaply,
by just replacing the one reference to A with B. But if we eagerly instantiated all the fields,
that could be a lot more work because we might have to go through all of the fields in the AdtDef
and update all of their types.
A bit more deeply, this corresponds to structs in Rust being nominal types, which means that they are defined by their name (and that their contents are then indexed from the definition of that name, and not carried along “within” the type itself).
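The efficiency argument can be sketched with a toy model. All names below (FieldTy, AdtTy, with_args, field_ty) are hypothetical and chosen for this example; the real compiler's data structures differ.

```rust
// Toy model only: none of these are rustc's actual types or methods.

#[derive(Debug, Clone, PartialEq)]
enum Ty {
    U8,
    U32,
    I64,
}

/// The declared type of a field, which may mention a generic parameter.
enum FieldTy {
    Concrete(Ty),
    /// Refers to the generic parameter at this index (`T` is index 0).
    Param(usize),
}

/// The definition: potentially hundreds of fields, stored exactly once.
struct AdtDef {
    fields: Vec<(&'static str, FieldTy)>,
}

/// An instantiated ADT: a shared definition plus a short argument list.
struct AdtTy<'a> {
    def: &'a AdtDef,
    args: Vec<Ty>,
}

impl<'a> AdtTy<'a> {
    /// `MyStruct<A> ==> MyStruct<B>`: only the argument list is replaced.
    /// The field list inside `def` is neither copied nor visited.
    fn with_args(&self, args: Vec<Ty>) -> AdtTy<'a> {
        AdtTy { def: self.def, args }
    }

    /// A field's type is computed on demand by substituting the arguments
    /// into the type declared in the definition.
    fn field_ty(&self, field: &str) -> Option<Ty> {
        let (_, declared) = self.def.fields.iter().find(|(name, _)| *name == field)?;
        match declared {
            FieldTy::Concrete(ty) => Some(ty.clone()),
            FieldTy::Param(index) => self.args.get(*index).cloned(),
        }
    }
}
```

Here with_args costs the same no matter how many fields the definition has, while field_ty pays for substitution only on the fields that are actually inspected.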
The GenericArgs type
Given a generic type MyType<A, B, …>, we have to store the list of generic arguments for MyType.
In rustc this is done using GenericArgs. GenericArgs is a thin pointer to a slice of GenericArg representing a list of generic arguments for a generic item. For example, given a struct HashMap<K, V> with two type parameters, K and V, the GenericArgs for the type HashMap<i32, u32> would be &'tcx [tcx.types.i32, tcx.types.u32].
GenericArg is conceptually an enum with three variants, one for type arguments, one for const arguments and one for lifetime arguments.
In practice, that enum is GenericArgKind, and GenericArg is a more space-efficient version that has a method to
turn it into a GenericArgKind.
The actual GenericArg struct stores the type, lifetime or const as an interned pointer with the discriminant stored in the lower 2 bits.
Unless you are working with the GenericArgs implementation specifically, you should generally not have to deal with GenericArg and instead
make use of the safe GenericArgKind abstraction obtainable via the GenericArg::unpack() method.
In some cases you may have to construct a GenericArg; this can be done via Ty/Const/Region::into() or GenericArgKind::pack.
// An example of unpacking and packing a generic argument.
fn deal_with_generic_arg<'tcx>(generic_arg: GenericArg<'tcx>) -> GenericArg<'tcx> {
    // Unpack a raw `GenericArg` to deal with it safely.
    let new_generic_arg: GenericArgKind<'tcx> = match generic_arg.unpack() {
        GenericArgKind::Type(ty) => { /* ... */ }
        GenericArgKind::Lifetime(lt) => { /* ... */ }
        GenericArgKind::Const(ct) => { /* ... */ }
    };
    // Pack the `GenericArgKind` to store it in a generic args list.
    new_generic_arg.pack()
}
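The idea of storing the discriminant in the low bits of a pointer can be illustrated outside the compiler. The following is a self-contained sketch of the general technique, not rustc's implementation: TyData, RegionData, ConstData, ArgKind and PackedArg are hypothetical names, and the tag values are chosen arbitrarily.

```rust
use std::marker::PhantomData;

// Toy stand-ins for interned data. `align(4)` guarantees that the two low
// bits of any reference to these types are zero, leaving room for a tag.
#[repr(align(4))]
#[derive(Debug, PartialEq)]
struct TyData(&'static str);
#[repr(align(4))]
#[derive(Debug, PartialEq)]
struct RegionData(&'static str);
#[repr(align(4))]
#[derive(Debug, PartialEq)]
struct ConstData(u64);

const TAG_MASK: usize = 0b11;
const TYPE_TAG: usize = 0b00;
const REGION_TAG: usize = 0b01;
const CONST_TAG: usize = 0b10;

/// The safe, enum-shaped view (compare `GenericArgKind`).
#[derive(Debug, Clone, Copy, PartialEq)]
enum ArgKind<'a> {
    Type(&'a TyData),
    Lifetime(&'a RegionData),
    Const(&'a ConstData),
}

/// The compact, pointer-sized form (compare `GenericArg`).
#[derive(Clone, Copy)]
struct PackedArg<'a> {
    bits: usize,
    _marker: PhantomData<&'a ()>,
}

impl<'a> ArgKind<'a> {
    /// Stores the address and the variant tag in a single word.
    fn pack(self) -> PackedArg<'a> {
        let (addr, tag) = match self {
            ArgKind::Type(ty) => (ty as *const TyData as usize, TYPE_TAG),
            ArgKind::Lifetime(r) => (r as *const RegionData as usize, REGION_TAG),
            ArgKind::Const(c) => (c as *const ConstData as usize, CONST_TAG),
        };
        debug_assert_eq!(addr & TAG_MASK, 0);
        PackedArg { bits: addr | tag, _marker: PhantomData }
    }
}

impl<'a> PackedArg<'a> {
    /// Recovers the enum view by reading the tag and masking it off.
    fn unpack(self) -> ArgKind<'a> {
        let addr = self.bits & !TAG_MASK;
        // SAFETY: `bits` is only ever produced by `pack` from a reference that
        // is valid for `'a`, and the tag records which type it pointed to.
        unsafe {
            match self.bits & TAG_MASK {
                TYPE_TAG => ArgKind::Type(&*(addr as *const TyData)),
                REGION_TAG => ArgKind::Lifetime(&*(addr as *const RegionData)),
                CONST_TAG => ArgKind::Const(&*(addr as *const ConstData)),
                _ => unreachable!("only `pack` constructs a `PackedArg`"),
            }
        }
    }
}
```

The packed form occupies a single machine word, whereas an ordinary enum of three references needs extra space for its discriminant. The price is the unsafe code hidden behind pack and unpack, which is why callers are steered towards the enum view.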
So pulling it all together:
struct MyStruct<T>(T);
type Foo = MyStruct<u32>;
For the MyStruct<u32> written in the Foo type alias, we would represent it in the following way:
- There would be an AdtDef (and corresponding DefId) for MyStruct.
- There would be a GenericArgs containing the list [GenericArgKind::Type(Ty(u32))].
- And finally a TyKind::Adt with the AdtDef and GenericArgs listed above.