
Rust for Rustaceans - Part 2: Types


Types

1. Types in Memory

1.1 Alignment

All values, no matter their type, must start at a byte boundary because pointers point to bytes, not bits.

The compiler gives every type an alignment that is computed from the types it contains. Built-in values are usually aligned to their size, so u8 is byte-aligned and u16 is 2-byte-aligned.
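As a rough illustration, alignment can be inspected with std::mem::align_of. The Pair struct below is a hypothetical type made up for this sketch, and the exact numbers for primitives are target-dependent, although they match their sizes on common platforms.

```rust
use std::mem::{align_of, size_of};

// Hypothetical struct, used only to show how a compound type
// inherits alignment from its fields.
#[allow(dead_code)]
struct Pair {
    a: u8,
    b: u16,
}

fn main() {
    // Primitive integers are usually aligned to their size.
    println!("u8:   size {}, align {}", size_of::<u8>(), align_of::<u8>());
    println!("u16:  size {}, align {}", size_of::<u16>(), align_of::<u16>());
    // A struct is at least as strictly aligned as its most-aligned field.
    println!("Pair: size {}, align {}", size_of::<Pair>(), align_of::<Pair>());
}
```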

1.2 Layout

Rust provides a repr attribute that you can add to a type definition to request a particular in-memory representation for that type.

repr(C) is a layout which is compatible with how a C or C++ compiler would lay out the same type. Since the C layout is predictable and not subject to change, repr(C) is also useful in unsafe contexts if you’re working with raw pointers into the type, or if you need to cast between two different types that you know have the same fields.

#[repr(C)]
struct Foo {
  tiny: bool,
  normal: u32,
  small: u8,
  long: u64,
  short: u16,
}
memory = 1 byte (tiny)
  + 3 bytes (padding between tiny and normal)
  + 4 bytes (normal)
  + 1 byte (small)
  + 7 bytes (padding between small and long)
  + 8 bytes (long)
  + 2 bytes (short)
  + 6 bytes (padding so the total size is a multiple of 8 bytes, the alignment of Foo)

= 32 bytes
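Adding up the rows gives 1 + 3 + 4 + 1 + 7 + 8 + 2 + 6 = 32 bytes. The sketch below checks this with std::mem::size_of, assuming a typical 64-bit target where u64 is 8-byte aligned. FooDefault is a hypothetical copy of the struct without repr(C), added for comparison. The default Rust representation is free to reorder fields, so its size is usually smaller but is not guaranteed.

```rust
use std::mem::{align_of, size_of};

#[allow(dead_code)]
#[repr(C)]
struct Foo {
    tiny: bool,
    normal: u32,
    small: u8,
    long: u64,
    short: u16,
}

// Same fields, default representation (hypothetical comparison type).
// The compiler may reorder fields to reduce padding.
#[allow(dead_code)]
struct FooDefault {
    tiny: bool,
    normal: u32,
    small: u8,
    long: u64,
    short: u16,
}

fn main() {
    // With repr(C), fields stay in declaration order, padding included.
    println!("repr(C):    size {}, align {}", size_of::<Foo>(), align_of::<Foo>());
    // The default layout has no stable guarantee; print it to see
    // what the current compiler chose.
    println!("repr(Rust): size {}, align {}", size_of::<FooDefault>(), align_of::<FooDefault>());
}
```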

1.3 Complex Types

How the compiler represents other Rust types in memory:

  • Tuple

Represented like a struct with fields of the same type as the tuple values in the same order.

  • Array

Represented as a contiguous sequence of the contained type with no padding between the elements.

  • Union

Layout is chosen independently for each variant. Alignment is the maximum across all the variants.

  • Enumeration

Same as union, but with one additional hidden shared field that stores the enum variant discriminant.
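A small sketch of these representations, using hypothetical types (IntOrByte, Shape) invented for the example. Only the array size and the repr(C) union layout are fixed. The tuple and enum sizes depend on choices the compiler makes, so the code only prints them.

```rust
use std::mem::{align_of, size_of};

// Hypothetical union: large enough for its biggest variant,
// aligned to its most-aligned variant.
#[allow(dead_code)]
#[repr(C)]
union IntOrByte {
    int: u32,
    byte: u8,
}

// Hypothetical enum: both variants carry a full u32, so the
// discriminant needs extra space beyond the payload.
#[allow(dead_code)]
enum Shape {
    Circle(u32),
    Square(u32),
}

fn main() {
    // Array: contiguous elements, no padding between them.
    println!("[u16; 4]:  {} bytes", size_of::<[u16; 4]>());
    // Tuple: laid out like a struct with the same field types.
    println!("(u8, u32): {} bytes", size_of::<(u8, u32)>());
    println!("IntOrByte: size {}, align {}", size_of::<IntOrByte>(), align_of::<IntOrByte>());
    println!("Shape:     {} bytes", size_of::<Shape>());
}
```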

1.4 Dynamically Sized Types and Wide Pointers

  • DST

Most types in Rust implement Sized automatically - that is, they have a size that's known at compile time.

But two common types are exceptions: trait objects and slices.

For example, a dyn Iterator or a [u8] does not have a well-defined size. Its size depends on information that is known only when the program runs and not at compile time, which is why such types are called dynamically sized types (DSTs).

  • Wide Pointer

The way to bridge the gap between unsized and sized types is to place unsized types behind a wide pointer (also known as a fat pointer).

Specifically, a wide pointer is twice the size of a usize (the size of a word on the target platform): one usize holds the pointer, and one usize holds the extra information needed to "complete" the type. For a slice, that extra information is the length; for a trait object, it is a pointer to the vtable.
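The size difference can be observed with std::mem::size_of. This is a minimal sketch; the slice and the variable names are chosen only for illustration.

```rust
use std::mem::size_of;

fn main() {
    let word = size_of::<usize>();

    // A reference to a sized type is a single word.
    println!("&u8:           {} bytes", size_of::<&u8>());
    // References to DSTs carry a second word: a length for slices,
    // a vtable pointer for trait objects.
    println!("&[u8]:         {} bytes", size_of::<&[u8]>());
    println!("&dyn Iterator: {} bytes", size_of::<&dyn Iterator<Item = u8>>());
    println!("Box<[u8]>:     {} bytes", size_of::<Box<[u8]>>());
    println!("word size:     {} bytes", word);

    // The extra word of a slice reference is what makes len() possible.
    let bytes: &[u8] = &[1, 2, 3];
    println!("slice length:  {}", bytes.len());
}
```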

2. Traits and Trait Bounds

A trait is a way to define shared behavior in an abstract manner. It is similar to interfaces in other programming languages.

2.1 Compilation and Dispatch

  • Static Dispatch
impl String {
  pub fn contains(&self, p: impl Pattern) -> bool {
    p.is_contained_in(self)
  }
}

For any given pattern type, the compiler knows the address of that type's implementation of the trait method. But there is no single address that works for every type, so the compiler generates one copy of contains for each pattern type it is called with, each with its own address to jump to. This is referred to as static dispatch, since for any given copy of the method, the address we are "dispatching to" is known statically.
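Since methods cannot be added to String outside the standard library, here is a hedged stand-alone sketch of the same idea. Matcher and contains_static are hypothetical names standing in for Pattern and String::contains. Each call with a different concrete type causes the compiler to generate a separate copy of contains_static.

```rust
// Hypothetical stand-in for the standard library's Pattern trait.
trait Matcher {
    fn is_contained_in(&self, haystack: &str) -> bool;
}

impl Matcher for char {
    fn is_contained_in(&self, haystack: &str) -> bool {
        haystack.chars().any(|c| c == *self)
    }
}

impl Matcher for &str {
    fn is_contained_in(&self, haystack: &str) -> bool {
        haystack.find(*self).is_some()
    }
}

// Static dispatch: monomorphized once per concrete Matcher type,
// so the method address in each copy is known at compile time.
fn contains_static(haystack: &str, p: impl Matcher) -> bool {
    p.is_contained_in(haystack)
}

fn main() {
    // Uses the copy generated for char.
    println!("{}", contains_static("hello", 'e'));
    // Uses the copy generated for &str.
    println!("{}", contains_static("hello", "xyz"));
}
```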

  • Dynamic Dispatch
impl String {
  pub fn contains(&self, p: &dyn Pattern) -> bool {
    p.is_contained_in(&*self)
  }
}

The alternative to static dispatch is dynamic dispatch, which enables code to call a trait method on a generic type without knowing what that type is.

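If you replace impl Pattern with &dyn Pattern, you tell the caller that they must give two pieces of information for this argument: the address of the pattern and the address of the is_contained_in method. In practice, the second piece is a pointer to a virtual method table (vtable) that holds the addresses of all the trait's methods for that type.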

You can use any type that is able to hold a wide pointer for dynamic dispatch, such as &mut, Box, and Arc.
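A minimal sketch of dynamic dispatch through two of these pointer types. Matcher, contains_dyn, and count_matches are hypothetical names standing in for Pattern and String::contains. A single compiled copy of each function handles every implementing type by looking the method up through the vtable at run time.

```rust
// Hypothetical stand-in for the standard library's Pattern trait.
trait Matcher {
    fn is_contained_in(&self, haystack: &str) -> bool;
}

impl Matcher for char {
    fn is_contained_in(&self, haystack: &str) -> bool {
        haystack.chars().any(|c| c == *self)
    }
}

impl Matcher for String {
    fn is_contained_in(&self, haystack: &str) -> bool {
        haystack.contains(self.as_str())
    }
}

// Dynamic dispatch through a shared reference: p is a wide pointer
// (data address plus vtable address).
fn contains_dyn(haystack: &str, p: &dyn Matcher) -> bool {
    p.is_contained_in(haystack)
}

// Box<dyn Matcher> is also a wide pointer, which allows values of
// different concrete types to live in one collection.
fn count_matches(haystack: &str, patterns: &[Box<dyn Matcher>]) -> usize {
    patterns.iter().filter(|p| p.is_contained_in(haystack)).count()
}

fn main() {
    println!("{}", contains_dyn("hello", &'h'));

    let patterns: Vec<Box<dyn Matcher>> =
        vec![Box::new('h'), Box::new(String::from("zzz"))];
    println!("{}", count_matches("hello", &patterns));
}
```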

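As a rule of thumb, use static dispatch in libraries and dynamic dispatch in binaries. A library cannot know what its users need, and a generic interface lets them choose: they can still pass a trait object if they want dynamic dispatch. A binary is the final code, so it can trade a little run-time cost for cleaner code and faster compilation.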

2.2 Generic Traits

Rust traits can be generic in one of two ways:

  1. generic type parameters like trait Foo<T>
  2. associated types like trait Foo { type Bar }

Use an associated type if you expect only one implementation of the trait for a given type, and use a generic type parameter otherwise. Use associated types whenever you can.
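To make the distinction concrete, here is a small sketch with hypothetical traits and types (Container, ConvertTo, Stack, Celsius) invented for this example. With an associated type, Stack can implement Container only once. With a generic parameter, Celsius can implement ConvertTo several times, and the caller selects an implementation through the type it asks for.

```rust
// Associated type: each implementing type picks exactly one Item.
trait Container {
    type Item;
    fn first(&self) -> Option<&Self::Item>;
}

struct Stack {
    items: Vec<i32>,
}

impl Container for Stack {
    type Item = i32;
    fn first(&self) -> Option<&i32> {
        self.items.first()
    }
}

// Generic type parameter: one type may implement the trait
// once for every choice of T.
trait ConvertTo<T> {
    fn convert(&self) -> T;
}

struct Celsius(f64);

impl ConvertTo<f64> for Celsius {
    fn convert(&self) -> f64 {
        self.0
    }
}

impl ConvertTo<String> for Celsius {
    fn convert(&self) -> String {
        format!("{} C", self.0)
    }
}

fn main() {
    let stack = Stack { items: vec![1, 2, 3] };
    println!("{:?}", stack.first());

    let temp = Celsius(21.5);
    // The annotation tells the compiler which implementation to use.
    let number: f64 = temp.convert();
    let text: String = temp.convert();
    println!("{} / {}", number, text);
}
```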

2.3 Coherence and the Orphan Rule

  • Coherence

The coherence property refers to a set of rules that the compiler enforces to ensure that trait implementations are coherent and do not conflict with one another.

  • Orphan Rule

The orphan rule states that you can implement a trait for a type only if at least one of the following conditions is met:

  1. The trait is defined in your crate
  2. The type is defined in your crate
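The two allowed cases, and the rejected one, can be sketched as follows. Summary and Wrapper are hypothetical names invented for this example. The commented-out impl is the case the orphan rule forbids, because both Display and Vec come from other crates. Wrapping the foreign type in a local newtype is a common workaround.

```rust
use std::fmt;

// Condition 1: the trait is local, so it may be implemented
// for a foreign type such as Vec<i32>.
trait Summary {
    fn summary(&self) -> String;
}

impl Summary for Vec<i32> {
    fn summary(&self) -> String {
        format!("{} items", self.len())
    }
}

// Condition 2: the type is local, so it may implement a
// foreign trait such as Display.
struct Wrapper(Vec<i32>);

impl fmt::Display for Wrapper {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "[{} items]", self.0.len())
    }
}

// Neither condition holds here, so the compiler rejects this
// impl (error E0117) if it is uncommented:
// impl fmt::Display for Vec<i32> { /* ... */ }

fn main() {
    let numbers: Vec<i32> = vec![1, 2, 3];
    println!("{}", numbers.summary());
    println!("{}", Wrapper(numbers));
}
```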