Types
1. Types in Memory
1.1 Alignment
All values, no matter their type, must start at a byte boundary because pointers point to bytes, not bits.
The compiler gives every type an alignment that is computed from the types it contains. Built-in values are usually aligned to their size, so u8 is byte-aligned and u16 is 2-byte-aligned.
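The alignment rules above can be checked with std::mem::size_of and std::mem::align_of. This is a minimal sketch; the helper size_and_align and the struct Pair are hypothetical names introduced only for illustration.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: returns (size, alignment) in bytes for any sized type.
fn size_and_align<T>() -> (usize, usize) {
    (size_of::<T>(), align_of::<T>())
}

/// Hypothetical struct: its alignment is derived from its fields and is
/// at least as large as the largest field alignment (u16 here).
#[allow(dead_code)]
struct Pair {
    a: u8,
    b: u16,
}

/// Alignment the compiler computed for Pair.
fn pair_align() -> usize {
    align_of::<Pair>()
}
```

Calling size_and_align::<u8>() and size_and_align::<u16>() shows that these built-in types are aligned to their own size, while pair_align() shows the alignment a compound type inherits from its fields.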
1.2 Layout
Rust provides a repr attribute that you can add to type definitions to request a particular in-memory representation for that type.
repr(C) is a layout that is compatible with how a C or C++ compiler would lay out the same type. Since the C layout is predictable and not subject to change, repr(C) is also useful in unsafe contexts if you’re working with raw pointers into the type, or if you need to cast between two different types that you know have the same fields.
#[repr(C)]
struct Foo {
    tiny: bool,
    normal: u32,
    small: u8,
    long: u64,
    short: u16,
}
memory = 1 byte (tiny)
+ 3 bytes (padding between tiny and normal)
+ 4 bytes (normal)
+ 1 byte (small)
+ 7 bytes (padding between small and long)
+ 8 bytes (long)
+ 2 bytes (short)
+ 6 bytes (padding so that the total size is a multiple of 8 bytes, the alignment of Foo)
= 32 bytes
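The byte count above can be verified by measuring the field offsets directly. The sketch below assumes a typical 64-bit target where u64 is 8-byte-aligned; field_offsets and foo_size_and_align are hypothetical helper names, not part of any library.

```rust
use std::mem::{align_of, size_of};

#[repr(C)]
struct Foo {
    tiny: bool,
    normal: u32,
    small: u8,
    long: u64,
    short: u16,
}

/// Hypothetical helper: byte offset of each field from the start of Foo,
/// in declaration order (tiny, normal, small, long, short).
fn field_offsets() -> [usize; 5] {
    let foo = Foo { tiny: false, normal: 0, small: 0, long: 0, short: 0 };
    let base = &foo as *const Foo as usize;
    [
        &foo.tiny as *const bool as usize - base,
        &foo.normal as *const u32 as usize - base,
        &foo.small as *const u8 as usize - base,
        &foo.long as *const u64 as usize - base,
        &foo.short as *const u16 as usize - base,
    ]
}

/// Hypothetical helper: (size, alignment) of Foo in bytes.
fn foo_size_and_align() -> (usize, usize) {
    (size_of::<Foo>(), align_of::<Foo>())
}
```

Under the stated assumption, the offsets are 0, 4, 8, 16, and 24, and the size is 32 bytes with an alignment of 8: the gaps between consecutive offsets are exactly the padding listed above.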
1.3 Complex Types
How the compiler represents other Rust types in memory:
- Tuple
Represented like a struct with fields of the same type as the tuple values in the same order.
- Array
Represented as a contiguous sequence of the contained type with no padding between the elements.
- Union
Layout is chosen independently for each variant. Alignment is the maximum across all the variants.
- Enumeration
Same as union, but with one additional hidden shared field that stores the enum variant discriminant.
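A few of these representations can be observed with size_of and align_of. This is a rough sketch: ByteOrWord, Message, and the three helper functions are hypothetical names, and the union uses repr(C) so that its layout is predictable.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical union: every variant starts at offset 0, so the size is
/// that of the largest variant and the alignment is the maximum alignment.
#[allow(dead_code)]
#[repr(C)]
union ByteOrWord {
    byte: u8,
    word: u32,
}

/// Hypothetical enum: like a union of its variants, plus a hidden
/// discriminant that records which variant is active.
#[allow(dead_code)]
enum Message {
    Number(u32),
    Byte(u8),
}

/// Arrays are contiguous with no padding: 4 elements of 4 bytes each.
fn array_size() -> usize {
    size_of::<[u32; 4]>()
}

/// (size, alignment) of the union.
fn union_layout() -> (usize, usize) {
    (size_of::<ByteOrWord>(), align_of::<ByteOrWord>())
}

/// Size of the enum, which must also hold the discriminant.
fn enum_size() -> usize {
    size_of::<Message>()
}
```

The array occupies exactly 16 bytes, the union is as large and as aligned as its u32 variant, and the enum is larger than its biggest payload because the discriminant needs room of its own.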
1.4 Dynamically Sized Types and Wide Pointers
- DST
Most types in Rust implement Sized automatically; that is, they have a size that is known at compile time.
But two common types are exceptions: trait objects and slices.
For example, a dyn Iterator or a [u8] does not have a well-defined size. Its size depends on information that is known only when the program runs and not at compile time, which is why such types are called dynamically sized types (DSTs).
- Wide Pointer
The way to bridge the gap between unsized and sized types is to place unsized types behind a wide pointer (also known as a fat pointer).
A wide pointer is twice the size of a usize (the size of a word on the target platform): one usize holds the pointer, and one usize holds the extra information needed to "complete" the type (the length for a slice, the address of a vtable for a trait object).
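The size difference between thin and wide pointers can be measured with size_of. This is a minimal sketch; pointer_sizes and slice_len are hypothetical helper names.

```rust
use std::mem::size_of;

/// Hypothetical helper: sizes in bytes of a thin reference, a slice
/// reference, a trait-object reference, and a boxed trait object.
fn pointer_sizes() -> (usize, usize, usize, usize) {
    (
        size_of::<&u8>(),                          // thin: address only
        size_of::<&[u8]>(),                        // wide: address + length
        size_of::<&dyn Iterator<Item = u8>>(),     // wide: address + vtable
        size_of::<Box<dyn Iterator<Item = u8>>>(), // wide: address + vtable
    )
}

/// The length of a slice travels with the wide pointer, not with the data.
fn slice_len(data: &[u8]) -> usize {
    data.len()
}
```

On any target, the first value equals the size of a usize and the other three are twice that, which is why a [u8] or a dyn Iterator can be used through a reference or a Box even though the type itself is unsized.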
2. Traits and Trait Bounds
A trait is a way to define shared behavior in an abstract manner. It is similar to interfaces in other programming languages.
2.1 Compilation and Dispatch
- Static Dispatch
impl String {
    pub fn contains(&self, p: impl Pattern) -> bool {
        p.is_contained_in(self)
    }
}
For any given pattern type, the compiler knows the address of the place where that type implements the trait method. But there is no single address that works for every type, so the compiler generates one copy of the method for each type, each with its own address to jump to. This is referred to as static dispatch, since for any given copy of the method, the address we are “dispatching to” is known statically.
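Static dispatch can be sketched with a small stand-in trait. The trait Matcher and the function contains_static are hypothetical names used only for illustration; they are not the standard library's Pattern API.

```rust
/// Hypothetical stand-in for a pattern trait.
trait Matcher {
    fn is_contained_in(&self, haystack: &str) -> bool;
}

impl Matcher for char {
    fn is_contained_in(&self, haystack: &str) -> bool {
        haystack.chars().any(|c| c == *self)
    }
}

impl Matcher for &str {
    fn is_contained_in(&self, haystack: &str) -> bool {
        haystack.contains(*self)
    }
}

/// The compiler generates one copy of this function per concrete type
/// passed for `p`, so each call target is known at compile time.
fn contains_static(haystack: &str, p: impl Matcher) -> bool {
    p.is_contained_in(haystack)
}
```

Calling contains_static("hello", 'e') and contains_static("hello", "ell") produces two separate copies of the function, one for char and one for &str.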
- Dynamic Dispatch
impl String {
    pub fn contains(&self, p: &dyn Pattern) -> bool {
        p.is_contained_in(&*self)
    }
}
The alternative to static dispatch is dynamic dispatch, which enables code to call a trait method on a generic type without knowing what that type is.
If you replace impl Pattern with &dyn Pattern, you tell the caller that they must give two pieces of information for this argument: the address of the pattern and the address of the is_contained_in method (in practice, the address of a vtable that holds the method's address).
You can use any type that is able to hold a wide pointer for dynamic dispatch, such as &mut, Box, and Arc.
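As a rule of thumb, static dispatch is used for libraries, where users can then choose the dispatch style themselves, and dynamic dispatch is used for binaries, where cleaner code and shorter compile times often outweigh the small runtime cost.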
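Dynamic dispatch can be sketched with the same kind of stand-in trait. Matcher, contains_dyn, and count_matches are hypothetical names introduced for illustration, not a real library API.

```rust
/// Hypothetical stand-in for a pattern trait.
trait Matcher {
    fn is_contained_in(&self, haystack: &str) -> bool;
}

impl Matcher for char {
    fn is_contained_in(&self, haystack: &str) -> bool {
        haystack.chars().any(|c| c == *self)
    }
}

impl Matcher for String {
    fn is_contained_in(&self, haystack: &str) -> bool {
        haystack.contains(self.as_str())
    }
}

/// A single copy of this function exists; the method address is looked
/// up at run time through the vtable carried by the wide pointer.
fn contains_dyn(haystack: &str, p: &dyn Matcher) -> bool {
    p.is_contained_in(haystack)
}

/// Box<dyn Matcher> is also a wide pointer, so values of different
/// concrete types can share one collection.
fn count_matches(haystack: &str, patterns: &[Box<dyn Matcher>]) -> usize {
    patterns.iter().filter(|p| p.is_contained_in(haystack)).count()
}
```

Unlike the impl Trait version, a slice of Box<dyn Matcher> can mix char and String values, at the cost of an indirect call for each method invocation.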
2.2 Generic Traits
Rust traits can be generic in one of two ways:
- generic type parameters, like trait Foo<T>
- associated types, like trait Foo { type Bar; }
Use an associated type if you expect only one implementation of the trait for a given type, and use a generic type parameter otherwise. Use associated types whenever you can.
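The difference between the two styles can be sketched as follows. All names here (ConvertTo, Celsius, Container, Numbers) are hypothetical and exist only for illustration.

```rust
/// Generic type parameter: one type may implement the trait several
/// times, once per choice of T.
trait ConvertTo<T> {
    fn convert(&self) -> T;
}

struct Celsius(f64);

impl ConvertTo<f64> for Celsius {
    fn convert(&self) -> f64 {
        self.0
    }
}

impl ConvertTo<String> for Celsius {
    fn convert(&self) -> String {
        format!("{} C", self.0)
    }
}

/// Associated type: each implementing type picks exactly one Item, so
/// there can be only one implementation per type.
trait Container {
    type Item;
    fn first(&self) -> Option<Self::Item>;
}

struct Numbers(Vec<i32>);

impl Container for Numbers {
    type Item = i32;
    fn first(&self) -> Option<i32> {
        self.0.first().copied()
    }
}
```

With the generic trait, the caller has to say which implementation is wanted (for example through a type annotation on the result), whereas with the associated type the compiler can infer everything from Numbers alone.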
2.3 Coherence and the Orphan Rule
- Coherence
The coherence property means that for any given type and trait method, there is only ever one correct choice of implementation to use. The compiler enforces a set of rules to ensure that trait implementations do not conflict with one another.
- Orphan Rule
The orphan rule states that you can only implement a trait for a type if at least one of the following conditions is met:
- The trait is defined in your crate
- The type is defined in your crate
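The two conditions above can be sketched as follows. Summary and Meters are hypothetical names defined in the current crate; Vec<u8> and fmt::Display come from the standard library and are therefore foreign.

```rust
use std::fmt;

/// Local trait: it may be implemented for a foreign type such as Vec<u8>.
trait Summary {
    fn summary(&self) -> String;
}

impl Summary for Vec<u8> {
    fn summary(&self) -> String {
        format!("{} bytes", self.len())
    }
}

/// Local type: it may implement a foreign trait such as fmt::Display.
struct Meters(u32);

impl fmt::Display for Meters {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} m", self.0)
    }
}

// Rejected by the orphan rule, because both the trait and the type are
// foreign to this crate:
//
// impl fmt::Display for Vec<u8> { /* ... */ }
```

Uncommenting the last implementation makes the compiler reject the crate, since neither fmt::Display nor Vec<u8> is defined locally.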