ADTs and Generic Arguments
The term ADT stands for "algebraic data type". In Rust, this refers to a struct, enum, or union.
ADTs Representation
Let's consider the example of a type like MyStruct<u32>, where MyStruct is defined like so:
struct MyStruct<T> { x: u8, y: T }
The type MyStruct<u32> would be an instance of TyKind::Adt:
Adt(&'tcx AdtDef, GenericArgs<'tcx>)
//  ------------  -----------------
//  (1)           (2)
//
// (1) represents the `MyStruct` part
// (2) represents the `<u32>`, or "substitutions" / generic arguments
There are two parts:
- The AdtDef references the struct/enum/union but without the values for its type parameters. In our example, this is the MyStruct part without the argument u32. (Note that in the HIR, structs, enums and unions are represented differently, but in ty::Ty, they are all represented using TyKind::Adt.)
- The GenericArgs is a list of values that are to be substituted for the generic parameters. In our example of MyStruct<u32>, we would end up with a list like [u32]. We'll dig more into generics and substitutions in a little bit.
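The two-part shape can be sketched with a small standalone model. This is only an illustration: the `AdtDef` and `Ty` types below are hypothetical stand-ins invented for this example, not rustc's real definitions (which are interned and carry far more information).

```rust
// Toy stand-in for a definition: it knows its name but not its arguments.
#[derive(Debug)]
struct AdtDef {
    name: &'static str,
}

// Toy stand-in for a type. The `Adt` variant mirrors the two parts above:
// (1) a reference to the definition, (2) the list of generic arguments.
#[derive(Debug)]
enum Ty<'a> {
    U8,
    U32,
    Adt(&'a AdtDef, Vec<Ty<'a>>),
}

// Render a type the way it would be written in source code.
fn describe(ty: &Ty<'_>) -> String {
    match ty {
        Ty::U8 => "u8".to_string(),
        Ty::U32 => "u32".to_string(),
        Ty::Adt(def, args) if args.is_empty() => def.name.to_string(),
        Ty::Adt(def, args) => {
            let rendered: Vec<String> = args.iter().map(|t| describe(t)).collect();
            format!("{}<{}>", def.name, rendered.join(", "))
        }
    }
}

fn main() {
    // One definition, shared by every instantiation of `MyStruct`.
    let my_struct = AdtDef { name: "MyStruct" };
    let with_u32 = Ty::Adt(&my_struct, vec![Ty::U32]);
    let with_u8 = Ty::Adt(&my_struct, vec![Ty::U8]);
    println!("{}", describe(&with_u32));
    println!("{}", describe(&with_u8));
}
```

Both values point at the same `AdtDef`; only the argument list differs.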
AdtDef and DefId
For every type defined in the source code, there is a unique DefId (see this chapter). This includes ADTs and generics. In the MyStruct<T> definition we gave above, there are two DefIds: one for MyStruct and one for T. Notice that the code above does not generate a new DefId for u32 because it is not defined in that code (it is only referenced).
AdtDef is more or less a wrapper around DefId with lots of useful helper methods. There is essentially a one-to-one relationship between AdtDef and DefId. You can get the AdtDef for a DefId with the tcx.adt_def(def_id) query. AdtDefs are all interned, as shown by the 'tcx lifetime.
Question: Why not substitute “inside” the AdtDef?
Recall that we represent a generic struct with (AdtDef, args). So why bother with this scheme?
Well, the alternate way we could have chosen to represent types would be to always create a new, fully-substituted form of the AdtDef where all the types are already substituted. This seems like less of a hassle. However, the (AdtDef, args) scheme has some advantages over this.
First, the (AdtDef, args) scheme has an efficiency win:
struct MyStruct<T> {
... 100s of fields ...
}
// Want to do: MyStruct<A> ==> MyStruct<B>
In an example like this, we can instantiate MyStruct<A> as MyStruct<B> (and so on) very cheaply, by just replacing the one reference to A with B. But if we eagerly instantiated all the fields, that could be a lot more work because we might have to go through all of the fields in the AdtDef and update all of their types.
A bit more deeply, this corresponds to structs in Rust being nominal types — which means that they are defined by their name (and that their contents are then indexed from the definition of that name, and not carried along “within” the type itself).
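To make the efficiency argument concrete, here is a minimal sketch of substitution on demand. Everything in it is a toy model with hypothetical names (`Ty`, `AdtDef`, `instantiate`, `field_ty`); it is not how rustc implements field lookup. The idea is that the definition stores its field types once, written in terms of numbered parameters, and a concrete field type is computed only when someone asks for it.

```rust
// Toy type representation (an assumption for this sketch, not rustc's).
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    U8,
    U32,
    Bool,
    Param(usize),        // the i-th generic parameter, e.g. `T`
    Adt(usize, Vec<Ty>), // (index of the definition, generic arguments)
}

// The definition keeps the field types exactly as written in the source.
struct AdtDef {
    field_tys: Vec<Ty>,
}

// Replace every `Param(i)` inside `ty` with `args[i]`.
fn instantiate(ty: &Ty, args: &[Ty]) -> Ty {
    match ty {
        Ty::Param(i) => args[*i].clone(),
        Ty::Adt(def, inner) => {
            Ty::Adt(*def, inner.iter().map(|t| instantiate(t, args)).collect())
        }
        other => other.clone(),
    }
}

// Compute the type of one field of an instantiated ADT, lazily.
fn field_ty(defs: &[AdtDef], ty: &Ty, field: usize) -> Option<Ty> {
    match ty {
        Ty::Adt(def, args) => defs[*def]
            .field_tys
            .get(field)
            .map(|f| instantiate(f, args)),
        _ => None,
    }
}

fn main() {
    // struct MyStruct<T> { x: u8, y: T }
    let defs = vec![AdtDef { field_tys: vec![Ty::U8, Ty::Param(0)] }];

    let a = Ty::Adt(0, vec![Ty::U32]);
    // MyStruct<u32> ==> MyStruct<bool>: only the argument list is rebuilt;
    // the definition and its field list are untouched.
    let b = Ty::Adt(0, vec![Ty::Bool]);

    println!("{:?}", field_ty(&defs, &a, 1));
    println!("{:?}", field_ty(&defs, &b, 1));
}
```

Switching from one instantiation to another costs one new argument list, regardless of how many fields the definition has.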
The GenericArgs type
Given a generic type MyType<A, B, …>, we have to store the list of generic arguments for MyType.
In rustc this is done using GenericArgs. GenericArgs is a thin pointer to a slice of GenericArg representing a list of generic arguments for a generic item. For example, given a struct HashMap<K, V> with two type parameters, K and V, the GenericArgs used to represent the type HashMap<i32, u32> would be represented by &'tcx [tcx.types.i32, tcx.types.u32].
GenericArg is conceptually an enum with three variants: one for type arguments, one for const arguments and one for lifetime arguments. In practice, that enum is GenericArgKind, and GenericArg is a more space-efficient version that has a method to turn it into a GenericArgKind.
The actual GenericArg struct stores the type, lifetime or const as an interned pointer with the discriminant stored in the lower 2 bits.
Unless you are working with the GenericArgs implementation specifically, you should generally not have to deal with GenericArg and instead make use of the safe GenericArgKind abstraction obtainable via the GenericArg::unpack() method.
In some cases you may have to construct a GenericArg. This can be done via Ty/Const/Region::into() or GenericArgKind::pack.
// An example of unpacking and packing a generic argument.
fn deal_with_generic_arg<'tcx>(generic_arg: GenericArg<'tcx>) -> GenericArg<'tcx> {
    // Unpack a raw `GenericArg` to deal with it safely.
    let new_generic_arg: GenericArgKind<'tcx> = match generic_arg.unpack() {
        GenericArgKind::Type(ty) => { /* ... */ }
        GenericArgKind::Lifetime(lt) => { /* ... */ }
        GenericArgKind::Const(ct) => { /* ... */ }
    };
    // Pack the `GenericArgKind` to store it in a generic args list.
    new_generic_arg.pack()
}
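The "discriminant in the lower 2 bits" trick can be illustrated outside the compiler. The following is a rough sketch, not rustc's code: `PackedArg`, `ArgKind`, the payload structs and the tag values are all hypothetical names chosen for this example. It relies on the assumption that every payload is at least 4-byte aligned, so the two low bits of its address are always zero and can hold a tag.

```rust
use std::marker::PhantomData;

// Toy payloads. `align(4)` guarantees the two low address bits are zero.
#[repr(align(4))]
#[derive(Debug, PartialEq)]
struct Ty(&'static str);
#[repr(align(4))]
#[derive(Debug, PartialEq)]
struct Region(&'static str);
#[repr(align(4))]
#[derive(Debug, PartialEq)]
struct Const(u64);

// Tag values are an assumption made for this sketch.
const TAG_MASK: usize = 0b11;
const TYPE_TAG: usize = 0b00;
const REGION_TAG: usize = 0b01;
const CONST_TAG: usize = 0b10;

// The safe, ordinary enum view (plays the role of `GenericArgKind`).
#[derive(Debug, PartialEq, Clone, Copy)]
enum ArgKind<'a> {
    Type(&'a Ty),
    Lifetime(&'a Region),
    Const(&'a Const),
}

// The compact view: one pointer-sized word, tag in the low two bits.
#[derive(Clone, Copy)]
struct PackedArg<'a> {
    bits: usize,
    _marker: PhantomData<&'a ()>,
}

impl<'a> ArgKind<'a> {
    fn pack(self) -> PackedArg<'a> {
        let (addr, tag) = match self {
            ArgKind::Type(t) => (t as *const Ty as usize, TYPE_TAG),
            ArgKind::Lifetime(r) => (r as *const Region as usize, REGION_TAG),
            ArgKind::Const(c) => (c as *const Const as usize, CONST_TAG),
        };
        debug_assert_eq!(addr & TAG_MASK, 0, "payload must be 4-byte aligned");
        PackedArg { bits: addr | tag, _marker: PhantomData }
    }
}

impl<'a> PackedArg<'a> {
    fn unpack(self) -> ArgKind<'a> {
        let addr = self.bits & !TAG_MASK;
        // SAFETY: `bits` can only be produced by `pack`, from a reference that
        // is valid for 'a, and the tag records which payload type it points to.
        unsafe {
            match self.bits & TAG_MASK {
                TYPE_TAG => ArgKind::Type(&*(addr as *const Ty)),
                REGION_TAG => ArgKind::Lifetime(&*(addr as *const Region)),
                CONST_TAG => ArgKind::Const(&*(addr as *const Const)),
                _ => unreachable!("tag 0b11 is never produced by `pack`"),
            }
        }
    }
}

fn main() {
    let ty = Ty("u32");
    let re = Region("'a");
    let ct = Const(3);
    for kind in [ArgKind::Type(&ty), ArgKind::Lifetime(&re), ArgKind::Const(&ct)] {
        let packed = kind.pack();
        println!("tag {:#04b} -> {:?}", packed.bits & TAG_MASK, packed.unpack());
    }
    // The packed form is a single machine word.
    assert_eq!(std::mem::size_of::<PackedArg<'static>>(), std::mem::size_of::<usize>());
}
```

This also shows why the enum view is the one to prefer in ordinary code: all the unsafe pointer handling stays inside `pack` and `unpack`.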
So pulling it all together:
struct MyStruct<T>(T);
type Foo = MyStruct<u32>;
For the MyStruct<u32> written in the Foo type alias, we would represent it in the following way:
- There would be an AdtDef (and corresponding DefId) for MyStruct.
- There would be a GenericArgs containing the list [GenericArgKind::Type(Ty(u32))].
- And finally a TyKind::Adt with the AdtDef and GenericArgs listed above.