C++ to Rust Phrasebook

This book is designed to help C++ programmers learn Rust. It provides translations of common C++ patterns into idiomatic Rust. Each pattern is described through concrete code examples along with high-level discussion of engineering trade-offs.

The book can be read front-to-back, but it is designed to be used random-access. When you are writing Rust code and think, "I know how to do this in C++ but not Rust," then look for the corresponding chapter in this book.

This book was hand-written by expert C++ and Rust programmers at Brown University's Cognitive Engineering Lab. Our goal is to provide accurate information with a tasteful degree of detail. No text in this book was written by AI.

If you would like updates on when we add new chapters to this book, you can drop your email here.

Other resources

If you have zero Rust experience, you might consider first reading The Rust Programming Language or getting a quick overview at Learn X in Y Minutes.

If you are primarily an embedded systems programmer using C or C++, this book is a complement to The Embedded Rust Book.

Compared to resources like the Rustonomicon and Learn Rust With Entirely Too Many Linked Lists, this book is less about "Rust behind the scenes" and more about explicitly describing how Rust works in terms of C++.

Feedback on this book

At the bottom of every page there is a link to a form where you can submit feedback: typos, factual errors, or any other issues you spot.

If you answer the quizzes at the end of each chapter, we will save your responses anonymously for research purposes.

Constructors

In C++, constructors initialize objects. At the point when a constructor is executed, storage for the object has been allocated and the constructor is only performing initialization.

Rust does not have constructors in the same way as C++. In Rust, there is a single fundamental way to create an object, which is to initialize all of its members at once. The term "constructor" or "constructor method" in Rust refers to something more like a factory: a static method associated with a type (i.e., a method that does not have a self parameter), which returns a value of the type.

#include <thread>
unsigned int cpu_count() { 
    return std::thread::hardware_concurrency();
}

class ThreadPool {
  unsigned int num_threads;  

public:
  ThreadPool() : num_threads(cpu_count()) {}
  ThreadPool(unsigned int nt) : num_threads(nt) {}
};

int main() {
  ThreadPool p1;
  ThreadPool p2(4);
}
fn cpu_count() -> usize {
    std::thread::available_parallelism().unwrap().get()
}

struct ThreadPool {
  num_threads: usize
}

impl ThreadPool {
    fn new() -> Self {
        Self { num_threads: cpu_count() }
    }

    fn with_threads(nt: usize) -> Self {
        Self { num_threads: nt } 
    }
}

fn main() {
    let p1 = ThreadPool::new();
    let p2 = ThreadPool::with_threads(4);
}

In Rust, typically the primary constructor for a type is named new, especially if it takes no arguments. (See the chapter on default constructors.) Constructors based on some specific property of the value are usually named with_<something>, e.g., ThreadPool::with_threads. See the naming guidelines for the conventions on how to name constructor methods in Rust.

If the fields to be initialized are visible, there is a reasonable default value, and the value does not manage a resource, then it is also common to use record update syntax (called struct update syntax in the Rust documentation) to initialize a value based on some default value.

struct Point {
    x: i32,
    y: i32,
    z: i32,
}

impl Point {
    const fn zero() -> Self {
        Self { x: 0, y: 0, z: 0 }
    }
}

fn main() {
    let x_unit = Point {
        x: 1,
        ..Point::zero()
    };
}

Despite the name, "record update syntax" does not modify a record. Instead, it creates a new value based on another one, moving the fields that are not explicitly listed out of the base value (or copying them, if their types implement Copy).
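The ownership behavior can be seen in a small sketch. The Settings type and the functions base_settings and renamed are hypothetical names introduced only for this illustration. The sketch assumes a struct that mixes Copy and non-Copy fields and does not implement Drop.

```rust
// Hypothetical type for illustration; it is not part of the examples above.
#[derive(Debug)]
struct Settings {
    name: String,
    retries: u32,
    tags: Vec<String>,
}

fn base_settings() -> Settings {
    Settings {
        name: String::from("base"),
        retries: 3,
        tags: vec![String::from("default")],
    }
}

// Builds a new value from `base`, overriding only `name`.
// The remaining fields are moved (or copied) out of `base`.
fn renamed(base: Settings, name: &str) -> Settings {
    Settings {
        name: name.to_string(),
        ..base
    }
}

fn main() {
    let base = base_settings();
    let derived = Settings {
        name: String::from("derived"),
        ..base
    };

    // `base.tags` was moved into `derived`, so `base` can no longer be used
    // as a whole. `base.name` was overridden rather than moved, and
    // `base.retries` was copied, so both fields are still accessible.
    println!("{} {}", base.name, base.retries);
    println!("{:?}", derived);

    let again = renamed(derived, "again");
    println!("{:?}", again);
}
```

Passing the base value by value to a function like renamed makes the transfer of ownership visible in the signature.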

Storage allocation vs initialization

In Rust, the actual construction of a structure or enum value occurs where the structure construction syntax ThreadPool { ... } is, after the evaluation of the expressions for the fields.

A significant implication of this difference is that, in Rust, storage for a struct is not allocated at the point where the constructor method (such as ThreadPool::with_threads) is called. In fact, it is not allocated until after the values of the fields of the struct have been computed. (This describes the semantics of the language; the optimizer may still avoid the copy.) Therefore, there is no straightforward way in Rust to translate patterns such as a class that stores a pointer to itself upon construction. In Rust, this requires tools like Pin and MaybeUninit.

Fallible constructors

In C++, the primary way constructors can indicate failure is by throwing exceptions. In Rust, because constructors are normal static methods, fallible constructors can instead return Result (akin to std::expected) or Option (akin to std::optional).

#include <iostream>
#include <stdexcept>

class ThreadPool {
  unsigned int num_threads;

public:
  ThreadPool(unsigned int nt) : num_threads(nt) {
    if (num_threads == 0) {
      throw std::domain_error("Cannot have zero threads");
    }
  }
};

int main() {
  try {
    ThreadPool p(0);
  } catch (const std::domain_error &e) {
    std::cout << e.what() << std::endl;
  }
}
struct ThreadPool {
    num_threads: usize,
}

impl ThreadPool {
    fn with_threads(nt: usize) -> Result<Self, String> {
        if nt == 0 {
            Err("Cannot have zero threads".to_string())
        } else {
            Ok(Self { num_threads: nt })
        }
    }
}

fn main() {
    match ThreadPool::with_threads(0) {
        Err(err) => println!("{err}"),        
        Ok(p) => { /* ... */ }
    }
}

See the chapter on exceptions for more information on how C++ exceptions and exception handling translate to Rust.
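As a variation on the fallible constructor above, the following sketch uses a dedicated error type instead of String and shows an Option-returning constructor alongside it. The names PoolError, MAX_THREADS, try_with_threads, and pool_size are assumptions made for this illustration, as is the upper limit of 1024 threads.

```rust
use std::fmt;

// Hypothetical error type for illustration.
#[derive(Debug, PartialEq)]
enum PoolError {
    ZeroThreads,
    TooManyThreads { requested: usize, max: usize },
}

impl fmt::Display for PoolError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            PoolError::ZeroThreads => write!(f, "cannot have zero threads"),
            PoolError::TooManyThreads { requested, max } => {
                write!(f, "requested {requested} threads, but the maximum is {max}")
            }
        }
    }
}

impl std::error::Error for PoolError {}

#[derive(Debug)]
struct ThreadPool {
    num_threads: usize,
}

impl ThreadPool {
    // Assumed limit, chosen only for the example.
    const MAX_THREADS: usize = 1024;

    fn with_threads(nt: usize) -> Result<Self, PoolError> {
        if nt == 0 {
            Err(PoolError::ZeroThreads)
        } else if nt > Self::MAX_THREADS {
            Err(PoolError::TooManyThreads {
                requested: nt,
                max: Self::MAX_THREADS,
            })
        } else {
            Ok(Self { num_threads: nt })
        }
    }

    // Option-returning variant for callers that do not need the reason.
    fn try_with_threads(nt: usize) -> Option<Self> {
        Self::with_threads(nt).ok()
    }
}

// The `?` operator propagates a construction failure to the caller.
fn pool_size(nt: usize) -> Result<usize, PoolError> {
    let pool = ThreadPool::with_threads(nt)?;
    Ok(pool.num_threads)
}

fn main() {
    match pool_size(0) {
        Ok(n) => println!("{n} threads"),
        Err(err) => println!("{err}"),
    }
    println!("{:?}", ThreadPool::try_with_threads(4));
}
```

An enum error type lets callers match on the failure reason, which a String does not allow without parsing the message.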

Default constructors

C++ has a special concept of default constructors to support several scenarios in which they are implicitly called. Rust does not have the same notion of a default constructor. The most similar mechanism is the Default trait.

class Person {
    int age;

public:
    // Default constructor
    Person() : age(0) {}
};
struct Person {
    age: i32,
}

impl Person {
    pub const fn new() -> Self {
        Self { age: 0 }
    }
}

impl Default for Person {
    fn default() -> Self {
        Self::new()
    }
}

If a structure has a useful default value (such as would be constructed by a default constructor in C++), then the type should provide both a new method that takes no arguments and an implementation of Default.

Implicit initialization of class members

In C++, if a member is not explicitly initialized by a constructor, then it is default-initialized. When the type of the member is a class, the default-initialization invokes the default constructor.

In Rust, if all of the fields of a struct implement the Default trait, then an implementation of Default for the struct can be derived automatically with #[derive(Default)].

class Person {
  int age;

public:
  Person() : age(0) {}
};

class Student {
  Person person;
};
#[derive(Default)]
struct Person {
    age: i32,
}

#[derive(Default)]
struct Student {
    person: Person,
}

The #[derive(Default)] macros in Rust are equivalent to writing the following.

struct Person {
    age: i32,
}

impl Default for Person {
    fn default() -> Self {
        Self {
            age: Default::default()
        }
    }
}

struct Student {
    person: Person,
}

impl Default for Student {
    fn default() -> Self {
        Self {
            person: Default::default()
        }
    }
}

Unlike C++ where the default initialization value for integers is indeterminate, in Rust the default value for the primitive integer and floating point types is zero.
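The default values can be checked directly. In this sketch the fields height_m, name, nickname, and active are hypothetical additions to Person, included only to show the defaults of a few more standard types.

```rust
// `Person` is extended here with hypothetical fields to show more defaults.
#[derive(Debug, Default, PartialEq)]
struct Person {
    age: i32,
    height_m: f64,
    name: String,
    nickname: Option<String>,
    active: bool,
}

fn main() {
    let p = Person::default();

    // Numeric types default to zero.
    assert_eq!(p.age, 0);
    assert_eq!(p.height_m, 0.0);
    // Strings default to empty, Option to None, and bool to false.
    assert_eq!(p.name, "");
    assert_eq!(p.nickname, None);
    assert!(!p.active);

    println!("{:?}", p);
}
```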

Deriving the Default trait has a similar effect on code concision as eliding initialization in C++. In situations where all of the types implement the Default trait, but only some of the fields should have their default values, one can use struct update syntax to define a constructor method without enumerating the values for all of the fields.

#[derive(Default)]
struct Person {
    age: i32,
}

#[derive(Default)]
struct Student {
    person: Person,
    favorite_color: Option<String>,
}

impl Student {
    pub fn with_favorite_color(color: String) -> Self {
        Student {
            favorite_color: Some(color),
            ..Default::default()
        }
    }
}

Implicit initialization of array values

In C++, arrays without explicit initialization are default-initialized using the default constructors.

In Rust, the value with which to initialize the array must be provided.

class Person {
  int age;

public:
  Person() : age(0) {}
};

int main() {
  Person people[3];
  // ...
}
#[derive(Default)]
struct Person {
    age: i32,
}

fn main() {
    // std::array::from_fn provides the index to the callback
    let people: [Person; 3] = 
        std::array::from_fn(|_| Default::default());
    // ...
}

If the type happens to be trivially copyable, then a shorthand can be used.

#[derive(Clone, Copy, Default)]
struct Person {
    age: i32,
}

fn main() {
    let people: [Person; 3] = [Default::default(); 3];
    // ...
}

Container element initialization

In C++, the default constructor can be used to implicitly initialize the elements of collection types, such as std::vector. Before C++11, one value would be default constructed, and the elements would be copy constructed from that initial element. Since C++11, all elements are default constructed.

As with array initialization, the values must be explicitly specified in Rust. The vector can be constructed from an array, enabling the same syntax as with arrays.

#include <vector>

class Person {
    int age;

public:
    Person() : age(0) {}
};

int main() {
    std::vector<Person> people(3);
    // ...
}
#[derive(Default)]
struct Person {
    age: i32,
}

fn main() {
    let people_arr: [Person; 3] = 
        std::array::from_fn(|_| Default::default());
    let people: Vec<Person> = Vec::from(people_arr);
    // ...
}

In Rust, the vector can also be constructed from an iterator.

#[derive(Default)]
struct Person {
    age: i32,
}

fn main() {
    let people: Vec<Person> = (0..3).map(|_| Default::default()).collect();
    // ...
}

If the type implements the Clone trait, then the vector can be constructed using the vec! macro. See the chapter on copy constructors for more details on Clone.

#[derive(Clone, Default)]
struct Person {
    age: i32,
}

fn main() {
    let people: Vec<Person> = vec![Default::default(); 3];
    // ...
}

Implicit initialization of local variables

In C++, the default constructor is used to perform default-initialization of local variables that are not explicitly initialized.

In Rust, initialization of local variables is always explicit.

class Person {
    int age;

public:
    Person() : age(0) {}
};

int main() {
    Person person;
    // ...
}
#[derive(Clone, Default)]
struct Person {
    age: i32,
}

fn main() {
    let person = Person::default();
    // ...
}

Implicit initialization of the base class object

In C++, the default constructor is used to initialize the base class object if no other constructor is specified.

class Base {
  int x;

public:
  Base() : x(0) {}
};

class Derived : Base {
public:
  // Calls the default constructor for Base
  Derived() {}
};

Since Rust does not have inheritance, there is no equivalent to this case. See the chapter on implementation reuse or the section on traits in the Rust book for alternatives.

std::unique_ptr

There are some additional cases where the Default trait is used in Rust, but default constructors are not used for initialization in C++.

Rust's equivalents of smart pointers implement Default by delegating to the Default implementation of the contained type.

#[derive(Default)]
struct Person {
    age: i32,
}

fn main() {
    let b: Box<Person> = Default::default();
    // ...
}

This differs from the treatment of std::unique_ptr in C++ because unlike Box, std::unique_ptr is nullable, and so the default constructor for std::unique_ptr produces a pointer that owns nothing. The equivalent type in Rust is Option<Box<Person>>, for which the Default implementation produces None.
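The difference between the two defaults can be sketched as follows. The helper functions default_box and default_nullable are hypothetical names used only to make the two cases easy to compare.

```rust
use std::rc::Rc;

#[derive(Debug, Default, PartialEq)]
struct Person {
    age: i32,
}

// A Box always owns a value, so its default allocates a default Person.
fn default_box() -> Box<Person> {
    Default::default()
}

// The nullable counterpart of std::unique_ptr defaults to None.
fn default_nullable() -> Option<Box<Person>> {
    Default::default()
}

fn main() {
    let b = default_box();
    assert_eq!(*b, Person { age: 0 });

    let n = default_nullable();
    assert!(n.is_none());

    // Rc behaves like Box here: it delegates to Person::default().
    let r: Rc<Person> = Default::default();
    assert_eq!(r.age, 0);

    println!("{:?} {:?} {:?}", b, n, r);
}
```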

Other uses of Default

Option::unwrap_or_default makes use of Default, which makes getting a default value when the Option does not contain a value more convenient.

fn go(x: Option<i32>) {
    let a: i32 = x.unwrap_or_default();
    // if x was None, then a is 0

    // ...
}

In C++, std::optional does not have an equivalent method. The closest is value_or, which requires the fallback value to be passed explicitly.
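Two other standard library APIs that rely on Default are the map entry method or_default and std::mem::take. The sketch below shows both. The function names count_words and take_pending are hypothetical and exist only for this illustration.

```rust
use std::collections::HashMap;

// `or_default` inserts usize::default() (zero) the first time a word is seen.
fn count_words(text: &str) -> HashMap<String, usize> {
    let mut counts: HashMap<String, usize> = HashMap::new();
    for word in text.split_whitespace() {
        *counts.entry(word.to_string()).or_default() += 1;
    }
    counts
}

// `std::mem::take` moves the value out and leaves Vec::default() (an empty
// vector) in its place.
fn take_pending(pending: &mut Vec<String>) -> Vec<String> {
    std::mem::take(pending)
}

fn main() {
    let counts = count_words("a b a");
    println!("{:?}", counts.get("a"));

    let mut pending = vec![String::from("job")];
    let taken = take_pending(&mut pending);
    println!("{:?} {:?}", taken, pending);
}
```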

Copy and move constructors

In both C++ and Rust, one rarely has to write copy or move constructors (or their Rust equivalents) by hand. In C++ this is because the implicit definitions are good enough for most purposes, especially when using smart pointers (i.e., following the rule of zero). In Rust this is because move semantics are the default, and the automatically derived implementations of the Clone and Copy traits are good enough for most purposes.

For the following C++ classes, the implicitly defined copy and move constructors are sufficient. The equivalent in Rust uses a derive macro provided by the standard library to implement the corresponding traits.

#include <memory>
#include <string>

struct Age {
  unsigned int years;

  Age(unsigned int years) : years(years) {}

  // copy and move constructors and destructor
  // implicitly declared and defined
};

struct Person {
  Age age;
  std::string name;
  std::shared_ptr<Person> best_friend;

  Person(Age age,
         std::string name,
         std::shared_ptr<Person> best_friend)
      : age(age), name(name),
        best_friend(best_friend) {}

  // copy and move constructors and destructor
  // implicitly declared and defined
};
use std::rc::Rc;

#[derive(Clone, Copy)]
struct Age {
    years: u32,
}

#[derive(Clone)]
struct Person {
    age: Age,
    name: String,
    best_friend: Rc<Person>,
}

User-defined constructors

On the other hand, the following example requires a user-defined copy and move constructor because it manages a resource (a pointer acquired from a C library). The equivalent in Rust requires a custom implementation of the Clone trait.

#include <cstdlib>
#include <cstring>

// widget.h
struct widget_t;
widget_t *alloc_widget();
void free_widget(widget_t *);
void copy_widget(widget_t *dst, widget_t *src);

// widget.cc
class Widget {
  widget_t *widget;

public:
  Widget() : widget(alloc_widget()) {}

  Widget(const Widget &other) : widget(alloc_widget()) {
    copy_widget(widget, other.widget);
  }

  Widget(Widget &&other) : widget(other.widget) {
    other.widget = nullptr;
  }

  ~Widget() {
    free_widget(widget);
  }
};
mod example {
mod widget_ffi {
    // Models an opaque type.
    // See https://doc.rust-lang.org/nomicon/ffi.html#representing-opaque-structs
    #[repr(C)]
    pub struct CWidget {
        _data: [u8; 0],
        _marker: core::marker::PhantomData<(
            *mut u8,
            core::marker::PhantomPinned,
        )>,
    }

    // Declarations matching widget.h in the C++ example
    extern "C" {
        pub fn alloc_widget() -> *mut CWidget;
        pub fn copy_widget(
            dst: *mut CWidget,
            src: *mut CWidget,
        );
        pub fn free_widget(ptr: *mut CWidget);
    }
}

use self::widget_ffi::*;

struct Widget {
    widget: *mut CWidget,
}

impl Widget {
    fn new() -> Self {
        Widget {
            widget: unsafe { alloc_widget() },
        }
    }
}

impl Clone for Widget {
    fn clone(&self) -> Self {
        let widget = unsafe { alloc_widget() };
        unsafe {
            copy_widget(widget, self.widget);
        }
        Widget { widget }
    }
}

impl Drop for Widget {
    fn drop(&mut self) {
        unsafe { free_widget(self.widget) };
    }
}
}

Just as it is uncommon in C++ to need user-defined copy constructors, move constructors, or destructors, in Rust it is rare to need to implement the Clone and Drop traits by hand for types that do not represent resources.

There is one exception to this. If the type has type parameters, it might be desirable to implement Clone (and Copy) manually even if the clone should be done field-by-field. See the standard library documentation of Clone and of Copy for details.
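The reason is that #[derive(Clone)] adds a Clone bound to every type parameter, even when cloning the fields does not require it. A minimal sketch, using the hypothetical types Shared and NotClone and the hypothetical helper share_twice:

```rust
use std::rc::Rc;

// Cloning an Rc<T> never clones the T, so `Shared<T>` can be Clone for any T.
// A derived implementation would instead require `T: Clone`.
struct Shared<T> {
    inner: Rc<T>,
}

impl<T> Clone for Shared<T> {
    fn clone(&self) -> Self {
        Shared {
            inner: Rc::clone(&self.inner),
        }
    }
}

// A type that deliberately does not implement Clone.
struct NotClone(u32);

fn share_twice(value: u32) -> (Shared<NotClone>, Shared<NotClone>) {
    let first = Shared {
        inner: Rc::new(NotClone(value)),
    };
    // This call would not compile with #[derive(Clone)] on Shared,
    // because NotClone does not implement Clone.
    let second = first.clone();
    (first, second)
}

fn main() {
    let (a, b) = share_twice(7);
    println!(
        "{} {} {}",
        a.inner.0,
        b.inner.0,
        Rc::strong_count(&a.inner)
    );
}
```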

Trivially copyable types

In C++, a class type is trivially copyable when it has no non-trivial copy constructors, move constructors, copy assignment operators, move assignment operators and it has a trivial destructor. Values of a trivially copyable type are able to be copied by copying their bytes.

In the first C++ example above, Age is trivially copyable, but Person is not. Although Person uses the implicitly defined copy constructor, that constructor is not trivial because std::string and std::shared_ptr are not trivially copyable.

Rust indicates whether types are trivially copyable with the Copy trait. Just as with trivially copyable types in C++, values of types that implement Copy in Rust can be copied by copying their bytes. Rust requires explicit calls to the clone method to make copies of values of types that do not implement Copy.

In the first Rust example above, Age implements the Copy trait but Person does not. This is because neither String nor Rc<Person> implements Copy. They do not implement Copy because they own data that lives on the heap, and so are not trivially copyable.
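The difference shows up at the use site. The sketch below adapts the earlier types, with one assumption: best_friend is changed to Option<Rc<Person>> so that a value can actually be constructed in a short example.

```rust
use std::rc::Rc;

#[derive(Clone, Copy, Debug, PartialEq)]
struct Age {
    years: u32,
}

// Assumption for this sketch: `best_friend` is optional so that a Person
// can be built without already having another Person.
#[derive(Clone, Debug)]
struct Person {
    age: Age,
    name: String,
    best_friend: Option<Rc<Person>>,
}

fn main() {
    let age = Age { years: 30 };
    // Age is Copy: the assignment copies the bytes and `age` stays usable.
    let same_age = age;
    assert_eq!(age, same_age);

    let alice = Rc::new(Person {
        age,
        name: String::from("Alice"),
        best_friend: None,
    });
    let bob = Person {
        age: same_age,
        name: String::from("Bob"),
        best_friend: Some(Rc::clone(&alice)),
    };

    // Person is not Copy: a copy must be requested with an explicit clone.
    // Cloning copies the String but only bumps the Rc reference count.
    let bob_copy = bob.clone();
    assert_eq!(bob_copy.name, bob.name);
    assert_eq!(Rc::strong_count(&alice), 3);

    println!("{:?}", bob_copy.age);
}
```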

Rust prevents implementing Copy for a type if any of its fields are not Copy, but does not prevent implementing Copy for types that should not be copied bit-for-bit due to their intended meaning, which is usually indicated by a user-defined Clone implementation.

Rust does not permit the implementation of both Copy and Drop for the same type. This aligns with the C++ standard's requirement that trivially copyable types not implement a user-defined destructor.

Move constructors

In Rust, all types support move semantics by default, and custom move semantics cannot be (and do not need to be) defined. This is because what "move" means in Rust is not the same as it is in C++. In Rust, moving a value means changing what owns the value. In particular, there is no "old" object to be destructed after a move, because the compiler will prevent the use of a variable whose value has been moved.

Assignment operators

Rust does not have a copy or move assignment operator. Instead, assignment either moves (by transferring ownership), explicitly clones and then moves, or implicitly copies and then moves.

fn main() {
    let x = Box::<u32>::new(5);
    let y = x; // moves
    let z = y.clone(); // explicitly clones and then moves the clone
    let w = *y; // implicitly copies the content of the Box and then moves the copy
}

For situations where something like a user-defined copy assignment could avoid allocations, the Clone trait has an additional method called clone_from. The method is automatically defined, but can be overridden when implementing the Clone trait to provide an efficient implementation.

The method is not used for normal assignments, but can be explicitly used in situations where the performance of the assignment is significant and would be improved by using the more efficient implementation, if one is defined. The implementation can be made more efficient because clone_from takes a mutable reference (&mut self) to the existing object to which the values are being assigned, and so can do things like reuse its memory to avoid allocations.

fn go(x: &Vec<u32>) {
    let mut y = vec![0; x.len()];
    // ...
    // x is already a reference, so it can be passed directly
    y.clone_from(x);
    // ...
}
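For a user-defined type, overriding clone_from might look like the following sketch. Buffer is a hypothetical type, and the example assumes that delegating to Vec's own clone_from is the desired way to reuse the allocation.

```rust
// Hypothetical type that owns a heap allocation.
struct Buffer {
    data: Vec<u8>,
}

impl Clone for Buffer {
    fn clone(&self) -> Self {
        Buffer {
            data: self.data.clone(),
        }
    }

    // Overrides the default, which would allocate a fresh clone and assign it.
    // Vec::clone_from reuses the existing allocation of self.data if possible.
    fn clone_from(&mut self, source: &Self) {
        self.data.clone_from(&source.data);
    }
}

fn main() {
    let source = Buffer {
        data: vec![1, 2, 3],
    };
    let mut target = Buffer {
        data: Vec::with_capacity(1024),
    };
    let capacity_before = target.data.capacity();

    target.clone_from(&source);

    assert_eq!(target.data, [1, 2, 3]);
    // The existing storage was kept rather than replaced by a smaller one.
    assert!(target.data.capacity() >= capacity_before);
}
```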

Performance concerns and Copy

The decision to implement Copy should be based on the semantics of the type, not on performance. If the size of objects being copied is a concern, then one should instead use a reference (&T or &mut T) or put the value on the heap (Box<T> or Rc<T>). These approaches correspond to passing by reference, or using a std::unique_ptr or std::shared_ptr in C++.

Rule of three/five/zero

Rule of three

In C++ the rule of three is a rule of thumb that if a class has a user-defined destructor, copy constructor or copy assignment operator, it probably should have all three.

The corresponding rule for Rust is that if a type has a user-defined Clone or Drop implementation, it probably needs both. This is for the same reason as the rule of three in C++: if a type has a user-defined implementation for Clone or Drop, it is probably because the type manages a resource, and both Clone and Drop will need to take special actions for the resource.

Rule of five

The rule of five in C++ states that if move semantics are needed for a type with a user-defined copy constructor or copy assignment operator, then a user-defined move constructor and move assignment should also be provided, because no implicit move constructor or move assignment operator will be generated.

In Rust, this rule is not relevant because of the difference in move semantics between C++ and Rust.

Rule of zero

The rule of zero states that classes with user-defined copy/move constructors, assignment operators, and destructors should deal only with ownership, and other classes should not have those constructors or destructors. In practice, most classes should make use of types from the STL (shared_ptr, vector, etc.) for dealing with ownership concerns so that the implicitly defined copy and move constructors are sufficient.

In Rust, the same is true. See the list of Rust type equivalents for equivalents of C++ smart pointer types and equivalents of C++ container types.

One difference between C++ and Rust in applying the rule of zero is that in C++ std::unique_ptr can take a custom deleter, making it possible to use std::unique_ptr for wrapping raw pointers that require custom destruction logic. In Rust, the Box type is not parameterized in the same way. To accomplish the same goal, one instead must define a new type with a user-defined Drop implementation, as is done in the example in the chapter on copy and move constructors.
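A compact sketch of the newtype approach follows. The functions acquire_handle and release_handle are hypothetical stand-ins for a C library's allocation and deallocation functions. They are implemented here with Box so that the example is self-contained, and the RELEASED counter exists only to make the cleanup observable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts calls to release_handle so that cleanup can be observed.
static RELEASED: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for a C function that allocates a resource.
fn acquire_handle() -> *mut u32 {
    Box::into_raw(Box::new(42))
}

// Hypothetical stand-in for the matching C function that frees it.
// Safety: `ptr` must come from `acquire_handle` and must not be released twice.
unsafe fn release_handle(ptr: *mut u32) {
    drop(unsafe { Box::from_raw(ptr) });
    RELEASED.fetch_add(1, Ordering::SeqCst);
}

// The newtype plays the role of std::unique_ptr with a custom deleter.
struct Handle {
    raw: *mut u32,
}

impl Handle {
    fn new() -> Self {
        Handle {
            raw: acquire_handle(),
        }
    }

    fn value(&self) -> u32 {
        // Safety: `raw` is valid for as long as the Handle exists.
        unsafe { *self.raw }
    }
}

impl Drop for Handle {
    fn drop(&mut self) {
        // Safety: `raw` came from acquire_handle and is released exactly once.
        unsafe { release_handle(self.raw) }
    }
}

fn main() {
    let before = RELEASED.load(Ordering::SeqCst);
    {
        let handle = Handle::new();
        println!("{}", handle.value());
    } // `handle` goes out of scope here, which runs Drop::drop
    assert_eq!(RELEASED.load(Ordering::SeqCst), before + 1);
}
```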

Destructors and resource cleanup

In C++, a destructor for a class T is defined by providing a special member function ~T(). To achieve the equivalent in Rust, the Drop trait is implemented for a type.

For an example, see the chapter on copy and move constructors.

Drop implementations play the same role as destructors in C++ for types that manage resources. That is, they enable cleanup of resources owned by the value at the end of the value's lifetime.

In Rust the Drop::drop method of a value is called automatically by a destructor when the variable that owns the value goes out of scope. Unlike in C++, the drop method cannot be called manually. Instead the automatic "drop glue" implicitly calls the destructors of fields.
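The interaction between a type's own Drop implementation and the drop glue for its fields can be observed with a small sketch. The types Noisy and Pair, the Log alias, and the function drop_order are hypothetical names for this illustration. The value's own drop method runs first, and then the fields are dropped in declaration order.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Shared log used to record the order in which destructors run.
type Log = Rc<RefCell<Vec<String>>>;

struct Noisy {
    name: &'static str,
    log: Log,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

// The Noisy fields are only held, never read; they exist to be dropped.
#[allow(dead_code)]
struct Pair {
    first: Noisy,
    second: Noisy,
    log: Log,
}

impl Drop for Pair {
    fn drop(&mut self) {
        // Only Pair's own cleanup goes here. The fields are dropped
        // afterwards by the compiler-generated drop glue.
        self.log.borrow_mut().push(String::from("drop Pair"));
    }
}

fn drop_order() -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let pair = Pair {
        first: Noisy { name: "first", log: Rc::clone(&log) },
        second: Noisy { name: "second", log: Rc::clone(&log) },
        log: Rc::clone(&log),
    };
    drop(pair);
    let entries = log.borrow().clone();
    entries
}

fn main() {
    // Prints:
    // drop Pair
    // drop first
    // drop second
    for entry in drop_order() {
        println!("{entry}");
    }
}
```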

Lifetimes and destructors

C++ destructors are called in reverse order of construction when variables go out of scope, or for dynamically allocated objects, when they are deleted. This includes destructors of moved-from objects.

In Rust, the drop order is similar to that of C++ (reverse order of declaration). If additional specific details about the drop order are needed (e.g., for writing unsafe code), the full rules for the drop order are described in the language reference. However, moving an object in Rust does not leave a moved-from object on which a destructor will be called.

#include <iostream>
#include <utility>

struct A {
  int id;

  A(int id) : id(id) {}

  // copy constructor
  A(const A &other) : id(other.id) {}

  // move constructor
  A(A &&other) : id(other.id) {
    other.id = 0;
  }

  // destructor
  ~A() {
    std::cout << id << std::endl;
  }
};

int accept(A x) {
  return x.id;
} // the destructor of x is called after the
  // return expression is evaluated

// Prints:
// 2
// 3
// 0
// 1
int main() {
  A x(1);
  A y(2);

  accept(std::move(y));

  A z(3);

  return 0;
}
struct A {
    id: i32,
}

impl Drop for A {
    fn drop(&mut self) {
        println!("{}", self.id)
    }
}

fn accept(x: A) -> i32 {
    return x.id;
}

// Prints:
// 2
// 3
// 1
fn main() {
    let x = A { id: 1 };
    let y = A { id: 2 };

    accept(y);

    let z = A { id: 3 };
}

In Rust, after ownership of y is moved into the function accept, there is no additional object remaining, and so there is no additional Drop::drop call (which in the C++ example prints 0).

Rust's drop methods do run when leaving scope due to a panic, though not if the panic occurs in a destructor that was called in response to an initial panic.
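The following sketch demonstrates cleanup during a panic. It assumes the default panic strategy (unwinding rather than abort). Guard, DROPS, and panics_with_guard are hypothetical names, and std::panic::catch_unwind is used only so that the program can observe the result after the panic.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many guards have been dropped.
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Guard;

impl Drop for Guard {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

fn panics_with_guard() {
    let _guard = Guard;
    // Unwinding leaves this scope, so `_guard` is dropped on the way out.
    panic!("something went wrong");
}

fn main() {
    let before = DROPS.load(Ordering::SeqCst);

    // catch_unwind stops the unwinding so that execution can continue.
    // The panic message is still printed to standard error.
    let result = std::panic::catch_unwind(panics_with_guard);

    assert!(result.is_err());
    assert_eq!(DROPS.load(Ordering::SeqCst), before + 1);
}
```

If the program is compiled with the abort panic strategy, the process terminates at the panic and no destructors run.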

Early cleanup and explicitly destroying values

In C++ you can explicitly destroy an object. This is mainly useful for situations where placement new has been used to allocate the object at a specific memory location, and so the destructor will not be implicitly called.

However, once the destructor has been explicitly called, it may not be called again, even implicitly. Thus the destructor can't be used for early cleanup. Instead, either the class must be designed with a separate cleanup method that releases the resources but leaves the object in a state where the destructor can be called or the function using the object must be structured so that the variable goes out of scope at the desired time.

In Rust, values can be dropped early for early cleanup by using std::mem::drop. This works because (for non-Copy types) ownership of the object is actually transferred to the std::mem::drop function, and so Drop::drop is called at the end of std::mem::drop when the lifetime of the parameter ends.

Thus, std::mem::drop can be used for early cleanup of resources without having to restructure a function to force variables out of scope early.

For example, the following allocates a large vector on the heap, but explicitly drops it before allocating a second large vector on the heap, reducing the overall memory usage.

fn main() {
    let v = vec![0u32; 100000];
    // ... use v

    std::mem::drop(v);
    // can no longer use v here

    let v2 = vec![0u32; 100000];
    // ... use v2
}

Data modeling

In C++ the mechanisms available for data modeling are classes, enums, and unions.

Rust, on the other hand, uses records (structs) and algebraic data types (enums).

Although Rust supports one major piece of object oriented design, polymorphism using interfaces, Rust also has language features for modeling things using algebraic data types (which in simple cases are like a much more ergonomic std::variant).

This section gives examples of common constructions used when programming in C++ and how to achieve the same effects using Rust's features.

Fixed operations, varying data

In situations where one needs to model a fixed set of operations that clients will use, but the data that implements those operations is not fixed ahead of time, the approach in C++ and the approach in Rust are the same. In both cases, an interface that defines the required operations is defined. Concrete types, possibly defined by the client, implement that interface.

This way of modeling data can make use of either dynamic or static dispatch, each of which is covered in its own section.

Fixed data, varying operations

In situations where there is a fixed set of data but the operations that the data must support vary, there are a few approaches in C++. Which approaches are available depends on the version of the standard in use.

In older versions of the standard, one might use manually defined tagged unions. In newer versions, std::variant is available to improve the safety and ergonomics of tagged unions. Both of these approaches map to the same approach in Rust.

Additionally, despite it not being strictly necessary to model a fixed set of variants, the visitor pattern is sometimes used for this situation, especially when using versions of the C++ standard before the introduction of std::variant. In most of these cases the idiomatic Rust solution is the same as what one would do when converting a C++ solution that uses tagged unions. The chapter on the visitor pattern describes when to use a Rust version of the visitor pattern or when to use Rust's enums (which are closer to std::variant than to C++ enums) to model the data.
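As a rough preview of the enum-based approach, the sketch below models a fixed set of variants and defines two independent operations over them. The Shape enum and the functions area and kind are hypothetical names chosen for this illustration. New operations can be added as new functions without modifying the enum.

```rust
// A fixed set of variants, each carrying its own data.
enum Shape {
    Triangle { base: f64, height: f64 },
    Rectangle { width: f64, height: f64 },
}

// One operation over the fixed data.
fn area(shape: &Shape) -> f64 {
    match shape {
        Shape::Triangle { base, height } => 0.5 * base * height,
        Shape::Rectangle { width, height } => width * height,
    }
}

// A second operation, added without touching the enum definition.
fn kind(shape: &Shape) -> &'static str {
    match shape {
        Shape::Triangle { .. } => "triangle",
        Shape::Rectangle { .. } => "rectangle",
    }
}

fn main() {
    let shapes = [
        Shape::Triangle { base: 1.0, height: 1.0 },
        Shape::Rectangle { width: 2.0, height: 3.0 },
    ];
    for shape in &shapes {
        println!("{}: {}", kind(shape), area(shape));
    }
}
```

Because match must be exhaustive, adding a variant to the enum causes a compile error in every operation that does not yet handle it.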

Varying data and operations

When both data and operations may be extended by a client, the visitor pattern is used in both C++ and in Rust.

Abstract classes, interfaces, and dynamic dispatch

In C++ when an interface will be used with dynamic dispatch to resolve invoked methods, the interface is defined using an abstract class. Types that implement the interface inherit from the abstract class. In Rust the interface is given by a trait, which is then implemented for the types that support that trait. Programs can then be written over trait objects that use that trait as their base type.

The following example defines an interface, two implementations of that interface, and a function that takes an argument that satisfies the interface. In C++ the interface is defined with an abstract class with pure virtual methods, and in Rust the interface is defined with a trait. In both languages, the function (printArea in C++ and print_area in Rust) invokes a method using dynamic dispatch.

#include <iostream>
#include <memory>

// Define an abstract class for an interface
struct Shape {
  Shape() = default;
  virtual ~Shape() = default;
  virtual double area() = 0;
};

// Implement the interface for a concrete class
struct Triangle : public Shape {
  double base;
  double height;

  Triangle(double base, double height)
      : base(base), height(height) {}

  double area() override {
    return 0.5 * base * height;
  }
};

// Implement the interface for a concrete class
struct Rectangle : public Shape {
  double width;
  double height;

  Rectangle(double width, double height)
      : width(width), height(height) {}

  double area() override {
    return width * height;
  }
};

// Use an object via a reference to the interface
void printArea(Shape &shape) {
  std::cout << shape.area() << std::endl;
}

int main() {
  Triangle triangle = Triangle{1.0, 1.0};

  printArea(triangle);

  // Use an object via an owned pointer to the
  // interface
  std::unique_ptr<Shape> shape;
  if (true) {
    shape = std::make_unique<Rectangle>(1.0, 1.0);
  } else {
    shape = std::make_unique<Triangle>(
        std::move(triangle));
  }

  // Convert to a reference to the interface
  printArea(*shape);
}
// Define an interface
trait Shape {
    fn area(&self) -> f64;
}

struct Triangle {
    base: f64,
    height: f64,
}

// Implement the interface for a concrete type
impl Shape for Triangle {
    fn area(&self) -> f64 {
        0.5 * self.base * self.height
    }
}

struct Rectangle {
    width: f64,
    height: f64,
}

// Implement the interface for a concrete type
impl Shape for Rectangle {
    fn area(&self) -> f64 {
        self.width * self.height
    }
}

// Use a value via a reference to the interface
fn print_area(shape: &dyn Shape) {
    println!("{}", shape.area());
}

fn main() {
    let triangle = Triangle {
        base: 1.0,
        height: 1.0,
    };

    print_area(&triangle);

    // Use a value via an owned pointer to the
    // interface
    let shape: Box<dyn Shape> = if true {
        Box::new(Rectangle {
            width: 1.0,
            height: 1.0,
        })
    } else {
        Box::new(triangle)
    };

    // Convert to a reference to the interface
    print_area(shape.as_ref());
}

There are several places where the Rust implementation differs slightly from the C++ implementation.

In Rust, a trait's methods are always visible whenever the trait itself is visible. Additionally, the fact that a type implements a trait is always visible whenever both the trait and the type are visible. These properties of Rust explain the lack of visibility declarations in places where one might find them in C++.

In C++, to associate methods with a type rather than a value of that type, you use the static keyword. In Rust, non-static methods take an explicit self parameter. This syntactic choice makes it possible to indicate (in a way similar to other parameters) whether the method mutates the object (by taking &mut self instead of &self) and whether it takes ownership of the object (by taking self instead of &self).

Rust methods do not need to be declared as virtual. Because of differences in vtable representation, all methods for a type are available for dynamic dispatch. Types of values that use vtables are indicated with the dyn keyword. This is further described below.

Additionally, Rust does not have an equivalent for the virtual destructor declaration because in Rust every vtable includes the drop behavior (whether given by a user defined Drop implementation or not) required for the value.
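As a minimal sketch of this point, the following drops a value through a Box<dyn Shape> and observes that the concrete type's Drop implementation runs, even though the trait declares nothing about destruction. The Noisy type, the DROPS counter, and the helper function are hypothetical names introduced only for this illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counter used only to observe that drop ran.
static DROPS: AtomicUsize = AtomicUsize::new(0);

trait Shape {
    fn area(&self) -> f64;
}

// Hypothetical type with a user-defined Drop implementation.
struct Noisy;

impl Shape for Noisy {
    fn area(&self) -> f64 {
        0.0
    }
}

impl Drop for Noisy {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

// Drops a boxed trait object and reports how many times Noisy's drop ran
// during the call.
fn drops_through_trait_object() -> usize {
    let before = DROPS.load(Ordering::SeqCst);
    let shape: Box<dyn Shape> = Box::new(Noisy);
    // Nothing resembling a virtual destructor was declared on Shape.
    drop(shape);
    DROPS.load(Ordering::SeqCst) - before
}

fn main() {
    assert_eq!(drops_through_trait_object(), 1);
}
```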

Vtables and Rust trait object types

C++ and Rust both require some kind of indirection to perform dynamic dispatch against an interface. In C++ this indirection takes the form of a pointer to the abstract class (instead of the derived concrete class), making use of a vtable to resolve the virtual method.

In the above Rust example, the type dyn Shape is the type of a trait object for the Shape trait. A trait object includes a vtable along with the underlying value.

In C++ all objects whose class inherits from a class with a virtual method have a vtable pointer in their representation, whether dynamic dispatch is used or not. Pointers or references to such objects are the same size as pointers to objects without virtual methods, but every object includes a pointer to its vtable.

In Rust, vtables are present only when values are represented as trait objects. The reference to the trait object is twice the size of a normal reference since it includes both the pointer to the value and the pointer to the vtable. In the Rust example above, the local variable triangle in main does not have a vtable in its representation, but when the reference to it is converted to a reference to a trait object (so that it can be passed to print_area), the resulting reference does include a pointer to the vtable.
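The size difference can be observed with std::mem::size_of. The following is a sketch rather than a guarantee: the layout of references to trait objects is not formally specified, so the last two assertions describe what current compilers do in practice.

```rust
use std::mem::size_of;

trait Shape {
    fn area(&self) -> f64;
}

struct Triangle {
    base: f64,
    height: f64,
}

impl Shape for Triangle {
    fn area(&self) -> f64 {
        0.5 * self.base * self.height
    }
}

fn main() {
    // The value holds only its two f64 fields: there is no vtable pointer.
    assert_eq!(size_of::<Triangle>(), 2 * size_of::<f64>());

    // A reference to the concrete type is a single pointer.
    assert_eq!(size_of::<&Triangle>(), size_of::<usize>());

    // A reference or Box to the trait object carries a pointer to the value
    // and a pointer to the vtable.
    assert_eq!(size_of::<&dyn Shape>(), 2 * size_of::<usize>());
    assert_eq!(size_of::<Box<dyn Shape>>(), 2 * size_of::<usize>());
}
```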

Additionally, just as abstract classes in C++ cannot be used as the type of a local variable, the type of a parameter of a function, or the type of a return value of a function, trait object types in Rust cannot be used in corresponding contexts. In Rust, this is enforced by the type dyn Shape not implementing the Sized marker trait, preventing it from being used in contexts that require knowing the size of a type statically.

The following example shows some places where a trait object type can and cannot be used due to not implementing Sized. The uses forbidden in Rust would also be forbidden in C++ because Shape is an abstract class.

trait Shape {
    fn area(&self) -> f64;
}

struct Triangle {
    base: f64,
    height: f64,
}

impl Shape for Triangle {
    fn area(&self) -> f64 {
        0.5 * self.base * self.height
    }
}

fn main() {
    // Local variables must have a known size.
    // let v: dyn Shape = Triangle { base: 1.0, height: 1.0 };

    // References always have a known size.
    let shape: &dyn Shape = &Triangle {
        base: 1.0,
        height: 1.0,
    };
    // Boxes also always have a known size.
    let boxed_shape: Box<dyn Shape> = Box::new(Triangle {
        base: 1.0,
        height: 1.0,
    });

    // Types like Option<T> store the value of type T directly, and so also
    // need to know the size of T.
    // let v: Option<dyn Shape> = Some(Triangle { base: 1.0, height: 1.0 });
}

// Parameter types must have a known size.
// fn print_area(shape: dyn Shape) { }
fn print_area(shape: &dyn Shape) {}

The decision to include the vtable in the reference instead of in the value is one part of what makes it reasonable to use traits both for polymorphism via dynamic dispatch and for polymorphism via static dispatch, where one would use concepts in C++.
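As a brief sketch of that dual use, the same trait and the same implementation can serve both kinds of dispatch. The function names area_dynamic and area_static are hypothetical and exist only for this illustration.

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct Triangle {
    base: f64,
    height: f64,
}

impl Shape for Triangle {
    fn area(&self) -> f64 {
        0.5 * self.base * self.height
    }
}

// Dynamic dispatch: the vtable pointer travels with the reference.
fn area_dynamic(shape: &dyn Shape) -> f64 {
    shape.area()
}

// Static dispatch: the call is resolved for each concrete T at compile time.
fn area_static<T: Shape>(shape: &T) -> f64 {
    shape.area()
}

fn main() {
    let triangle = Triangle {
        base: 1.0,
        height: 1.0,
    };

    // The value itself is the same in both cases; only the reference differs.
    assert_eq!(area_dynamic(&triangle), 0.5);
    assert_eq!(area_static(&triangle), 0.5);
}
```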

Limitations of trait objects in Rust

In Rust, not all traits can be used as the base trait for trait objects. The most commonly encountered restriction is that traits that require knowledge of the object's size via a Sized supertrait are not dyn-compatible. There are additional restrictions.
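One of the additional restrictions concerns methods that return Self by value. The following sketch shows a common way to keep such a trait usable for trait objects: restricting the individual method with where Self: Sized rather than adding Sized as a supertrait. The Square type and the doubled method are hypothetical names chosen for the illustration.

```rust
trait Shape {
    fn area(&self) -> f64;

    // Returning Self by value needs a statically known size. The
    // `where Self: Sized` clause excludes this one method from trait
    // objects while the rest of the trait stays usable as `dyn Shape`.
    // Writing `trait Shape: Sized` instead would rule out `dyn Shape`
    // entirely.
    fn doubled(&self) -> Self
    where
        Self: Sized;
}

// Hypothetical concrete type for the illustration.
struct Square {
    side: f64,
}

impl Shape for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }

    fn doubled(&self) -> Self {
        Square {
            side: 2.0 * self.side,
        }
    }
}

fn main() {
    let square = Square { side: 2.0 };

    // Available on the concrete type.
    assert_eq!(square.doubled().area(), 16.0);

    // Still usable as a trait object, but only for `area`.
    let shape: &dyn Shape = &square;
    assert_eq!(shape.area(), 4.0);
    // shape.doubled(); // rejected: requires `Self: Sized`
}
```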

Trait objects and lifetimes

Objects which are used with dynamic dispatch may contain pointers or references to other objects. In C++ the lifetimes of those references must be tracked manually by the programmer.

Rust checks the bounds on the lifetimes of references that the trait objects may contain. If the bounds are not given explicitly, they are determined according to the lifetime elision rules. The bound is part of the type of the trait object.

Usually the elision rules pick the correct lifetime bound. Sometimes, the rules result in surprising error messages from the compiler. In those situations or when the compiler cannot determine which lifetime bound to assign, the bound may be given manually. The following example shows explicitly what the inferred lifetimes are for a structure storing a trait object and for the print_area function.

trait Shape {
    fn area(&self) -> f64;
}

struct Triangle {
    base: f64,
    height: f64,
}

impl Shape for Triangle {
    fn area(&self) -> f64 {
        0.5 * self.base * self.height
    }
}

struct Scaled {
    scale: f64,
    // 'static is the lifetime that would be inferred by the lifetime elision
    // rule [lifetime-elision.trait-object.default].
    shape: Box<dyn Shape + 'static>,
}

impl Shape for Scaled {
    fn area(&self) -> f64 {
        self.scale * self.shape.area()
    }
}

// These are the lifetimes that would be inferred by the lifetime elision rule
// [lifetime-elision.function.implicit-lifetime-parameters] for the reference
// and [lifetime-elision.trait-object.containing-type-unique] for the trait
// bound.
fn print_area<'a>(shape: &'a (dyn Shape + 'a)) {
    println!("{}", shape.area());
}

fn main() {
    let triangle = Triangle {
        base: 1.0,
        height: 1.0,
    };
    print_area(&triangle);

    let scaled_triangle = Scaled {
        scale: 2.0,
        shape: Box::new(triangle),
    };
    print_area(&scaled_triangle);
}
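
When the boxed value needs to hold borrowed data, the default 'static bound can be replaced with an explicit lifetime parameter. The following is a sketch under that assumption; BorrowedRectangle and scaled_area are hypothetical names, and Scaled is a variant of the structure above with a lifetime parameter added.

```rust
trait Shape {
    fn area(&self) -> f64;
}

// Hypothetical shape that borrows its dimensions instead of owning them.
struct BorrowedRectangle<'a> {
    dims: &'a [f64; 2],
}

impl Shape for BorrowedRectangle<'_> {
    fn area(&self) -> f64 {
        self.dims[0] * self.dims[1]
    }
}

struct Scaled<'a> {
    scale: f64,
    // The explicit bound replaces the default 'static, so the boxed value
    // may contain references that are valid for 'a.
    shape: Box<dyn Shape + 'a>,
}

impl Shape for Scaled<'_> {
    fn area(&self) -> f64 {
        self.scale * self.shape.area()
    }
}

// Builds a Scaled around borrowed dimensions and computes its area.
fn scaled_area(dims: &[f64; 2], scale: f64) -> f64 {
    let scaled = Scaled {
        scale,
        shape: Box::new(BorrowedRectangle { dims }),
    };
    scaled.area()
}

fn main() {
    let dims = [2.0, 3.0];
    assert_eq!(scaled_area(&dims, 2.0), 12.0);
}
```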

Concepts, interfaces, and static dispatch

In C++, static dispatch over an interface is achieved by implementing a template function or template method that interacts with the type using some expected interface.

The template function twiceArea in the example below makes use of an area() method on the template type parameter.

To achieve the same goal in Rust involves defining a trait (Shape) with the desired method (area) and using the trait as a bound on the type parameter for the generic function.

#include <iostream>

struct Triangle {
  double base;
  double height;

  Triangle(double base, double height)
      : base(base), height(height) {}

  // NOT virtual: it will be used with static dispatch
  double area() {
    return 0.5 * base * height;
  }
};

// Generic function using interface
template <class T>
double twiceArea(T &shape) {
  return shape.area() * 2;
}

int main() {
  Triangle triangle{1.0, 1.0};

  std::cout << twiceArea(triangle) << std::endl;
  return 0;
}
// Interface that generic function will use
trait Shape {
    fn area(&self) -> f64;
}

struct Triangle {
    base: f64,
    height: f64,
}

// Implementation of interface for type
impl Shape for Triangle {
    fn area(&self) -> f64 {
        0.5 * self.base * self.height
    }
}

// Generic function using interface
fn twice_area<T: Shape>(shape: &T) -> f64 {
    2.0 * shape.area()
}

fn main() {
    let triangle = Triangle {
        base: 1.0,
        height: 1.0,
    };

    println!("{}", twice_area(&triangle));
}

Note that in the Rust example, the definition of the trait and the struct have not changed from the example in the chapter on virtual methods and dynamic dispatch. Even so, this example does use static dispatch. This is the result of a design trade-off in Rust around the representation of vtables and vptrs which is described later in that chapter.

The difference between Rust and C++ in the above examples arises from Rust being nominally typed (types must opt in to supporting a specific interface, merely having the right methods isn't enough) and C++'s template meta-programming enabling a kind of structural or duck typing (types only need to have the methods actually used, and there is no need to explicitly opt in to supporting an interface).
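The opt-in requirement can be sketched as follows. Circle is a hypothetical type that already has a method with the right name and signature; it can only be passed to twice_area once it also implements Shape.

```rust
trait Shape {
    fn area(&self) -> f64;
}

// Hypothetical type that happens to have an `area` method.
struct Circle {
    radius: f64,
}

impl Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius * self.radius
    }
}

fn twice_area<T: Shape>(shape: &T) -> f64 {
    2.0 * shape.area()
}

// Without this impl, `twice_area(&circle)` is rejected because Circle does
// not implement Shape, even though it has a suitable `area` method.
impl Shape for Circle {
    fn area(&self) -> f64 {
        // Delegate to the inherent method defined above.
        Circle::area(self)
    }
}

fn main() {
    let circle = Circle { radius: 1.0 };
    assert_eq!(twice_area(&circle), 2.0 * std::f64::consts::PI);
}
```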

Templates vs generic functions

The reason why Rust is nominally typed instead of structurally typed has to do with the difference between C++ templates and Rust generic functions. In particular, C++ templates are only type checked after all of the template arguments are provided and they are fully expanded, while Rust generic functions are type checked independently of the type arguments.

Since the functions are checked before the type arguments are known, the methods and functions that can be applied to values of those types also need to be known before the type arguments are known.

This point in the programming language design space favors simplicity of reasoning about these functions over the flexibility that comes from the template programming approach. This becomes especially valuable when writing libraries that provide generic functions defined in terms of other generic functions, for which a C++ compiler can give many fewer static guarantees, since it would not be possible to test all possible instantiations.

In both C++ and Rust, however, multiple implementations are generated by the compiler in order to achieve static dispatch.

C++ constraints and concepts

Rust's approach to static dispatch over an interface can be partially (but only partially) modeled with a strict application of C++ concepts.

The usual way to apply concepts is still structural and does not model Rust's approach: it only requires that a method with specific properties be present on the type.

#include <concepts>

template <typename T>
concept shape = requires(T t) {
  { t.area() } -> std::same_as<double>;
};

template <shape T>
double twiceArea(T shape) {
  return shape.area() * 2;
}

A closer equivalent to the above Rust program in C++ is to use a combination of abstract classes and concepts.

#include <concepts>
#include <iostream>

struct Shape {
  Shape() {}
  virtual ~Shape() {}
  virtual double area() = 0;
};

template <typename T>
concept shape = std::derived_from<T, Shape>;

struct Triangle : Shape {
  double base;
  double height;

  Triangle(double base, double height) : base(base), height(height) {}

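  // Implicitly virtual because it overrides Shape::area, but twiceArea
  // calls it on a value of the concrete type T, so the call can be
  // resolved statically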
  double area() override {
    return 0.5 * base * height;
  }
};

template <shape T>
double twiceArea(T shape) {
  return shape.area() * 2;
}

int main() {
  Triangle triangle{1.0, 1.0};

  std::cout << twiceArea(triangle) << std::endl;
  return 0;
}

This is still not the same, however, because the concept only creates a requirement on the use of the template, not on the use of values of type T within the template. In Rust, the trait bound constrains both. So the following still compiles in C++.

#include <concepts>

struct Shape {
  Shape() {}
  virtual ~Shape() {}
  virtual double area() = 0;
};

template <typename T>
concept shape = std::derived_from<T, Shape>;

template <shape T>
double twiceArea(T shape) {
  // note the call to a method not defined in Shape
  return shape.volume() * 2;
}

However, the equivalent does not compile in Rust and instead produces an error.

trait Shape {
    fn area(&self) -> f64;
}

fn twice_area<T: Shape>(shape: &T) -> f64 {
    // note the call to a method not defined in Shape
    2.0 * shape.volume()
}
error[E0599]: no method named `volume` found for reference `&T` in the current scope
 --> example.rs:7:17
  |
7 |     2.0 * shape.volume()
  |                 ^^^^^^ method not found in `&T`

These additional static checks mean that in many situations where C++ templates would be useful but hard to implement correctly, Rust generics are freely used.

Required traits and ergonomics

In the above examples, the function requiring a trait was defined like the following.

fn twice_area<T: Shape>(shape: &T) -> f64 {
    2.0 * shape.area()
}

This is a commonly used shorthand for the following:

fn twice_area<T>(shape: &T) -> f64
where
    T: Shape,
{
    2.0 * shape.area()
}

The more verbose form is preferred when there are many type parameters or those type parameters must implement many traits. An even shorter form, available in some cases, uses the impl keyword:

fn twice_area(shape: &impl Shape) -> f64 {
    2.0 * shape.area()
}
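
As a sketch of a case where the where form tends to read better, the following function requires two traits of its type parameter. The describe function is a hypothetical helper added for this illustration.

```rust
use std::fmt::Debug;

trait Shape {
    fn area(&self) -> f64;
}

#[derive(Debug)]
struct Triangle {
    base: f64,
    height: f64,
}

impl Shape for Triangle {
    fn area(&self) -> f64 {
        0.5 * self.base * self.height
    }
}

// Hypothetical helper: T must implement both Shape and Debug.
fn describe<T>(shape: &T) -> String
where
    T: Shape + Debug,
{
    format!("{:?} has area {}", shape, shape.area())
}

fn main() {
    let triangle = Triangle {
        base: 1.0,
        height: 1.0,
    };
    println!("{}", describe(&triangle));
}
```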

Generics and lifetimes

When defining a template in C++ that makes use of a type template parameter, the lifetimes of references stored within objects of that type must be tracked manually by the programmer.

The following (contrived) C++ example compiles without error, but could be used in a way that results in undefined behavior.

#include <memory>

struct Shape {
  Shape() {}
  virtual ~Shape() {}
  virtual double area() = 0;
};

template<typename S>
void store(S s, std::unique_ptr<Shape> data) {
    // Will pointers or references in `s` become dangling while `data`
    // is still in use?
    *data = s;
}

Rust checks the bounds on lifetimes of references contained within type parameters. Just as with trait object types, these bounds are usually inferred according to the lifetime elision rules. When they cannot be inferred, or they are inferred incorrectly, the bounds can be declared manually.

In the Rust transliteration of the above example, the lifetime bounds have to be given manually because the inferred bounds are incorrect. Without explicit bounds, the compiler produces an error.

trait Shape {}

fn store<S: Shape>(x: S, data: &mut Box<dyn Shape>) {
    *data = Box::new(x);
}
error[E0310]: the parameter type `S` may not live long enough
 --> example.rs:7:5
  |
7 |     *data = Box::new(x);
  |     ^^^^^
  |     |
  |     the parameter type `S` must be valid for the static lifetime...
  |     ...so that the type `S` will meet its required lifetime bounds
  |

The error message becomes clearer when the inferred lifetime bounds are made explicit. With the given type for store, the argument for x could contain references that do not live as long as the 'static bound required of the contents of the box.

trait Shape {}

struct Triangle {
    base: f64,
    height: f64,
}

impl Shape for Triangle {}

// The type parameter S is assigned no lifetime bound.
fn store<'a, S: Shape>(
    x: S,
    // The reference is assigned a fresh lifetime by rule
    // [lifetime-elision.function.implicit-lifetime-parameters].
    //
    // The trait object is assigned 'static by rule
    // [lifetime-elision.trait-object.default] and
    // [lifetime-elision.trait-object.innermost-type].
    data: &'a mut Box<dyn Shape + 'static>,
) {
    *data = Box::new(x);
}

// An example of how the implementation of store could be misused with
// the given type.
fn main() {
    let triangle = Triangle {
        base: 1.0,
        height: 2.0,
    };
    let mut b: Box<dyn Shape> = Box::new(triangle);
    {
        let short_lived_triangle = Triangle {
            base: 5.0,
            height: 10.0,
        };
        store(short_lived_triangle, &mut b);
    }
    // If the stored value held a reference to data local to the block
    // above, b would contain a dangling reference here.
}

For this specific case, the most general solution is to define a new lifetime parameter to bound both S and dyn Shape. The lifetime parameter for the reference can be elided, because it will be assigned a fresh lifetime parameter.

trait Shape {}

// Note the common bound
// -----------------here-\
// ----------------------|---------------------------and here-\
//                       v                                    v
fn store<'s, S: Shape + 's>(x: S, data: &mut Box<dyn Shape + 's>) {
    *data = Box::new(x);
}
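
The following sketch shows a store function with the common bound accepting a value that contains a reference. UnitSquare, BorrowedSquare, and stored_area are hypothetical names, and the trait is given an area method here only so that the result can be observed.

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct UnitSquare;

impl Shape for UnitSquare {
    fn area(&self) -> f64 {
        1.0
    }
}

// Hypothetical shape that borrows its side length.
struct BorrowedSquare<'a> {
    side: &'a f64,
}

impl Shape for BorrowedSquare<'_> {
    fn area(&self) -> f64 {
        self.side * self.side
    }
}

fn store<'s, S: Shape + 's>(x: S, data: &mut Box<dyn Shape + 's>) {
    *data = Box::new(x);
}

// The box is declared with an inferred (non-'static) bound, so a value
// borrowing from `side` can be stored in it. The borrow checker still
// ensures that the box does not outlive `side`.
fn stored_area(side: &f64) -> f64 {
    let mut slot: Box<dyn Shape + '_> = Box::new(UnitSquare);
    store(BorrowedSquare { side }, &mut slot);
    slot.area()
}

fn main() {
    let side = 3.0;
    assert_eq!(stored_area(&side), 9.0);
}
```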

Enums

In C++, enums are often used to model a fixed set of alternatives, especially when each of those enumerators corresponds to a specific integer value, such as is needed when working with hardware, system calls, or protocol implementations.

For example, the various modes for a GPIO pin could be modeled as an enum, which would restrict methods using the mode to valid values.

While Rust enums are more general, they can still be used for this sort of modeling.

#include <cstdint>

enum Pin : uint8_t {
  Pin1 = 0x01,
  Pin2 = 0x02,
  Pin3 = 0x04
};

enum Mode : uint8_t {
  Output = 0x03,
  Pullup = 0x04,
  Analog = 0x27
  // ...
};

void low_level_set_pin(uint8_t pin, uint8_t mode);

void set_pin_mode(Pin pin, Mode mode) {
  low_level_set_pin(pin, mode);
}
#[repr(u8)]
#[derive(Clone, Copy)]
enum Pin {
    Pin1 = 0x01,
    Pin2 = 0x02,
    Pin3 = 0x04,
}

#[repr(u8)]
#[derive(Clone, Copy)]
enum Mode {
    Output = 0x03,
    Pullup = 0x04,
    Analog = 0x27,
    // ...
}

extern "C" {
    fn low_level_set_pin(pin: u8, mode: u8);
}

fn set_pin_mode(pin: Pin, mode: Mode) {
    unsafe {
        low_level_set_pin(pin as u8, mode as u8)
    };
}

The #[repr(u8)] attribute ensures that the representation of the enum is the same as a byte (like declaring the underlying type of an enum in C++). The enum values can then be freely converted to the underlying type with the as operator.
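A minimal sketch of both effects, using the Pin enum from above:

```rust
#[repr(u8)]
#[derive(Clone, Copy)]
enum Pin {
    Pin1 = 0x01,
    Pin2 = 0x02,
    Pin3 = 0x04,
}

fn main() {
    // With #[repr(u8)] the enum occupies exactly one byte.
    assert_eq!(std::mem::size_of::<Pin>(), 1);

    // Converting to the underlying type yields the declared discriminant.
    assert_eq!(Pin::Pin1 as u8, 0x01);
    assert_eq!(Pin::Pin2 as u8, 0x02);
    assert_eq!(Pin::Pin3 as u8, 0x04);

    // The cast only goes from enum to integer:
    // `0x04u8 as Pin` does not compile.
}
```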

In C++ the standard way to convert from an integer to an enum is a static cast. However, this requires that the user check the validity of the cast themselves. Often the conversion is done by a function that checks that the value to convert is a valid enum value.

In Rust the standard way to perform the conversion is to implement the TryFrom trait for the type and then use the try_from method or try_into method.

#include <cstdint>

enum Pin : uint8_t {
  Pin1 = 0x01,
  Pin2 = 0x02,
  Pin3 = 0x04
};

struct InvalidPin {
    uint8_t pin;
};

Pin to_pin(uint8_t pin) {
  // The values are not contiguous, so we can't
  // just check the bounds and then cast.
  switch (pin) {
  case 0x1: { return Pin1; }
  case 0x2: { return Pin2; }
  case 0x4: { return Pin3; }
  }
  throw InvalidPin{pin};
}

int main() {
  try {
    Pin p(to_pin(2));
  } catch (InvalidPin &e) {
    return 0;
  }

  // use pin p
}
#[repr(u8)]
#[derive(Clone, Copy)]
enum Pin {
    Pin1 = 0x01,
    Pin2 = 0x02,
    Pin3 = 0x04,
}

use std::convert::TryFrom;

struct InvalidPin(u8);

impl TryFrom<u8> for Pin {
    type Error = InvalidPin;

    fn try_from(
        value: u8,
    ) -> Result<Self, Self::Error> {
        match value {
            0x01 => Ok(Pin::Pin1),
            0x02 => Ok(Pin::Pin2),
            0x04 => Ok(Pin::Pin3),
            pin => Err(InvalidPin(pin)),
        }
    }
}

fn main() {
    let Ok(p) = Pin::try_from(2) else {
        return;
    };

    // use pin p
}

See Exceptions and error handling for examples of how to ergonomically handle the result of try_from.
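
As a short sketch of the try_into form mentioned above, the following converts two raw values and propagates the first failure with the ? operator. The parse_pair function is a hypothetical helper, and the extra derives are added only so that the results can be compared.

```rust
use std::convert::{TryFrom, TryInto};

#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum Pin {
    Pin1 = 0x01,
    Pin2 = 0x02,
    Pin3 = 0x04,
}

#[derive(Debug, PartialEq)]
struct InvalidPin(u8);

impl TryFrom<u8> for Pin {
    type Error = InvalidPin;

    fn try_from(value: u8) -> Result<Self, Self::Error> {
        match value {
            0x01 => Ok(Pin::Pin1),
            0x02 => Ok(Pin::Pin2),
            0x04 => Ok(Pin::Pin3),
            pin => Err(InvalidPin(pin)),
        }
    }
}

// Hypothetical helper: converts two raw values, returning the first error.
fn parse_pair(a: u8, b: u8) -> Result<(Pin, Pin), InvalidPin> {
    // try_into is available automatically once TryFrom is implemented.
    let first: Pin = a.try_into()?;
    let second = Pin::try_from(b)?;
    Ok((first, second))
}

fn main() {
    assert_eq!(parse_pair(1, 4), Ok((Pin::Pin1, Pin::Pin3)));
    assert_eq!(parse_pair(1, 3), Err(InvalidPin(3)));
}
```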

If low-level performance is more of a concern than memory safety, std::mem::transmute is analogous to a C++ reinterpret cast, but requires unsafe Rust because its use can result in undefined behavior. Uses of std::mem::transmute for this purpose should not be hidden behind an interface that can be called from safe Rust unless the interface can actually guarantee that the call will never happen with an invalid value.
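
A minimal sketch of such a conversion follows. The function pin_from_u8_unchecked is a hypothetical name; it is itself marked unsafe because it cannot guarantee that its argument is a valid discriminant, so that obligation stays with the caller.

```rust
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum Pin {
    Pin1 = 0x01,
    Pin2 = 0x02,
    Pin3 = 0x04,
}

/// Reinterprets a byte as a Pin without validation.
///
/// # Safety
///
/// `value` must be one of 0x01, 0x02, or 0x04. Any other value results in
/// undefined behavior.
unsafe fn pin_from_u8_unchecked(value: u8) -> Pin {
    // Sound only because Pin is #[repr(u8)] and the caller upholds the
    // documented precondition.
    unsafe { std::mem::transmute::<u8, Pin>(value) }
}

fn main() {
    // Valid because 0x02 is the discriminant of Pin::Pin2.
    let pin = unsafe { pin_from_u8_unchecked(0x02) };
    assert_eq!(pin, Pin::Pin2);
}
```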

Enums and methods

In C++ enums cannot have methods. Instead, to model an enum with methods one must define a wrapper class for the enum and define the methods on that wrapper class. In Rust, methods can be defined on an enum with an impl block, just like any other type.

#include <cstdint>

// Actual enum
enum PinImpl : uint8_t {
  Pin1 = 0x01,
  Pin2 = 0x02,
  Pin3 = 0x04
};

class LastPin{};

// Wrapper type
struct Pin {
  PinImpl pin;

  // Conversion constructor so that PinImpl can be
  // used as a Pin.
  Pin(PinImpl p) : pin(p) {}

  // Conversion method so wrapper type can be
  // used with switch statement.
  operator PinImpl() {
    return this->pin;
  }

  Pin next() const {
    switch (pin) {
    case Pin1:
      return Pin(Pin2);
    case Pin2:
      return Pin(Pin3);
    default:
      throw LastPin{};
    }
  }
};
#[repr(u8)]
#[derive(Clone, Copy)]
enum Pin {
    Pin1 = 0x01,
    Pin2 = 0x02,
    Pin3 = 0x04,
}

struct LastPin;

impl Pin {
    fn next(&self) -> Result<Self, LastPin> {
        match self {
            Pin::Pin1 => Ok(Pin::Pin2),
            Pin::Pin2 => Ok(Pin::Pin3),
            Pin::Pin3 => Err(LastPin),
        }
    }
}

Tagged unions and std::variant

C-style tagged unions

Because unions cannot be used for type punning in C++, they are usually paired with a tag that indicates which variant of the union is active.

In Rust, the idiomatic equivalent of a union type is always tagged: Rust enums generalize C-like enums by allowing additional data to be associated with each variant.

enum Tag { Rectangle, Triangle };

struct Shape {
  Tag tag;
  union {
    struct {
      double width;
      double height;
    } rectangle;
    struct {
      double base;
      double height;
    } triangle;
  };

  double area() {
    switch (this->tag) {
    case Rectangle: {
      return this->rectangle.width *
             this->rectangle.height;
    }
    case Triangle: {
      return 0.5 * this->triangle.base *
             this->triangle.height;
    }
    }
  }
};
enum Shape {
    Rectangle { width: f64, height: f64 },
    Triangle { base: f64, height: f64 },
}

impl Shape {
    fn area(&self) -> f64 {
        match self {
            Shape::Rectangle {
                width,
                height,
            } => width * height,
            Shape::Triangle { base, height } => {
                0.5 * base * height
            }
        }
    }
}

When matching on an enum, Rust requires that all variants of the enum be handled. In situations where default would be used with a C++ switch on the tag, a wildcard can be used in the Rust match.

#include <iostream>

enum Tag { Rectangle, Triangle, Circle };

struct Shape {
  Tag tag;
  union {
    struct {
      double width;
      double height;
    } rectangle;
    struct {
      double base;
      double height;
    } triangle;
    struct {
      double radius;
    } circle;
  };

  void print_shape() {
    switch (this->tag) {
    case Rectangle: {
      std::cout << "Rectangle" << std::endl;
      break;
    }
    default: {
      std::cout << "Some other shape"
                << std::endl;
      break;
    }
    }
  }
};
enum Shape {
    Rectangle { width: f64, height: f64 },
    Triangle { base: f64, height: f64 },
}

impl Shape {
    fn print_shape(&self) {
        match self {
            Shape::Rectangle { .. } => {
                println!("Rectangle");
            }
            _ => {
                println!("Some other shape");
            }
        }
    }
}

Rust does not support C++-style fallthrough where some behavior can be done before falling through to the next case. However, in Rust one can match on multiple enum variants simultaneously, so long as the simultaneous match patterns bind the same names with the same types.

enum Shape {
    Rectangle { width: f64, height: f64 },
    Triangle { base: f64, height: f64 },
}

impl Shape {
    fn bounding_area(&self) -> f64 {
        match self {
            Shape::Rectangle { height, width }
            | Shape::Triangle {
                height,
                base: width,
            } => width * height,
        }
    }
}

Accessing the value without checking the discriminant

Unlike with C-style unions, Rust always requires matching on the discriminant before accessing the values. If the variant is already known, e.g., due to an earlier check, then the code can usually be refactored to encode the knowledge in the type so that the second check (and corresponding error handling) can be omitted.

A C++ program like the following requires more restructuring of the types to achieve the same goal in Rust.

The corresponding Rust program requires defining separate types for each variant of the Shape enum so that the fact that all of the values are of a given type can be expressed in the type system by having an array of Triangle instead of an array of Shape.

#include <ranges>
#include <vector>

// Uses the same Shape definition.
enum Tag { Rectangle, Triangle };

struct Shape {
  Tag tag;
  union {
    struct {
      double width;
      double height;
    } rectangle;
    struct {
      double base;
      double height;
    } triangle;
  };
};

std::vector<Shape> get_shapes() {
  return std::vector<Shape>{
      Shape{Triangle, {.triangle = {1.0, 1.0}}},
      Shape{Triangle, {.triangle = {1.0, 1.0}}},
      Shape{Rectangle, {.rectangle = {1.0, 1.0}}},
  };
}

int main() {
  std::vector<Shape> shapes = get_shapes();

  auto is_triangle = [](Shape shape) {
    return shape.tag == Triangle;
  };

  // Create an iterator that only sees the
  // triangles. (std::views::filter is from C++20,
  // but the same effect can be achieved with a
  // custom iterator.)
  auto triangles =
      shapes | std::views::filter(is_triangle);

  double total_base = 0.0;
  for (auto &triangle : triangles) {
    // Skip checking the tag because we know we
    // have only triangles.
    total_base += triangle.triangle.base;
  }

  return 0;
}
// Define a separate struct for each variant.
struct Rectangle { width: f64, height: f64 }
struct Triangle { base: f64, height: f64 }

enum Shape {
    Rectangle(Rectangle),
    Triangle(Triangle),
}

fn get_shapes() -> Vec<Shape> {
    vec![
        Shape::Triangle(Triangle {
            base: 1.0,
            height: 1.0,
        }),
        Shape::Triangle(Triangle {
            base: 1.0,
            height: 1.0,
        }),
        Shape::Rectangle(Rectangle {
            width: 1.0,
            height: 1.0,
        }),
    ]
}

fn main() {
    let shapes = get_shapes();

    // This iterator only iterates over triangles
    // and demonstrates that by iterating over
    // the Triangle type instead of the Shape type.
    let triangles = shapes
        .iter()
        // Keep only the triangles
        .filter_map(|shape| match shape {
            Shape::Triangle(t) => Some(t),
            _ => None,
        });

    let mut total_base = 0.0;
    for triangle in triangles {
        // Because the iterator produces Triangles
        // instead of Shapes, base can be accessed
        // directly.
        total_base += triangle.base;
    }
}

This kind of use is common enough in Rust that the variants are often designed to have their own types from the start.

This approach is also possible in C++. It is more commonly used along with std::variant in C++17 or later.

std::variant (since C++17)

When targeting C++17 or later, std::variant can be used to represent a tagged union in a way that has more in common with Rust enums.

#include <type_traits>
#include <variant>

struct Rectangle {
  double width;
  double height;
};

struct Triangle {
  double base;
  double height;
};

using Shape = std::variant<Rectangle, Triangle>;

double area(const Shape &shape) {
  return std::visit(
      [](auto &&arg) -> double {
        using T = std::decay_t<decltype(arg)>;
        if constexpr (std::is_same_v<T, Rectangle>) {
          return arg.width * arg.height;
        } else if constexpr (std::is_same_v<T, Triangle>) {
          return 0.5 * arg.base * arg.height;
        }
      },
      shape);
}

Because Rust doesn't depend on templates for this language feature, error messages when a variant is missed or when a new variant is added are easier to read, which removes one of the barriers to using tagged unions more frequently. Compare the errors in C++ (using gcc) and Rust when the Triangle case is omitted.

The following two programs have the same error: each fails to handle a case of Shape.

#include <variant>

struct Rectangle {
  double width;
  double height;
};

struct Triangle {
  double base;
  double height;
};

using Shape = std::variant<Rectangle, Triangle>;

double area(const Shape &shape) {
  return std::visit(
      [](auto &&arg) -> double {
        using T = std::decay_t<decltype(arg)>;
        if constexpr (std::is_same_v<T, Rectangle>) {
          return arg.width * arg.height;
        }
      },
      shape);
}
enum Shape {
    Rectangle { width: f64, height: f64 },
    Triangle { base: f64, height: f64 },
}

impl Shape {
    fn area(&self) -> f64 {
        match self {
            Shape::Rectangle {
                width,
                height,
            } => width * height,
        }
    }
}

However, the error messages differ significantly.

example.cc: In instantiation of ‘area(const Shape&)::<lambda(auto:27&&)> [with auto:27 = const Triangle&]’:
/usr/include/c++/14.2.1/bits/invoke.h:61:36:   required from ‘constexpr _Res std::__invoke_impl(__invoke_other, _Fn&&, _Args&& ...) [with _Res = double; _Fn = area(const Shape&)::<lambda(auto:27&&)>; _Args = {const Triangle&}]’
   61 |     { return std::forward<_Fn>(__f)(std::forward<_Args>(__args)...); }
      |              ~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/include/c++/14.2.1/bits/invoke.h:96:40:   required from ‘constexpr typename std::__invoke_result<_Functor, _ArgTypes>::type std::__invoke(_Callable&&, _Args&& ...) [with _Callable = area(const Shape&)::<lambda(auto:27&&)>; _Args = {const Triangle&}; typename __invoke_result<_Functor, _ArgTypes>::type = double]’
   96 |       return std::__invoke_impl<__type>(__tag{}, std::forward<_Callable>(__fn),
      |              ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   97 |                                         std::forward<_Args>(__args)...);
      |                                         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/include/c++/14.2.1/variant:1060:24:   required from ‘static constexpr decltype(auto) std::__detail::__variant::__gen_vtable_impl<std::__detail::__variant::_Multi_array<_Result_type (*)(_Visitor, _Variants ...)>, std::integer_sequence<long unsigned int, __indices ...> >::__visit_invoke(_Visitor&&, _Variants ...) [with _Result_type = std::__detail::__variant::__deduce_visit_result<double>; _Visitor = area(const Shape&)::<lambda(auto:27&&)>&&; _Variants = {const std::variant<Rectangle, Triangle>&}; long unsigned int ...__indices = {1}]’
 1060 |           return std::__invoke(std::forward<_Visitor>(__visitor),
      |                  ~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1061 |               __element_by_index_or_cookie<__indices>(
      |               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1062 |                 std::forward<_Variants>(__vars))...);
      |                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/include/c++/14.2.1/variant:1820:5:   required from ‘constexpr decltype(auto) std::__do_visit(_Visitor&&, _Variants&& ...) [with _Result_type = __detail::__variant::__deduce_visit_result<double>; _Visitor = area(const Shape&)::<lambda(auto:27&&)>; _Variants = {const variant<Rectangle, Triangle>&}]’
 1820 |                   _GLIBCXX_VISIT_CASE(1)
      |                   ^~~~~~~~~~~~~~~~~~~
/usr/include/c++/14.2.1/variant:1882:34:   required from ‘constexpr std::__detail::__variant::__visit_result_t<_Visitor, _Variants ...> std::visit(_Visitor&&, _Variants&& ...) [with _Visitor = area(const Shape&)::<lambda(auto:27&&)>; _Variants = {const variant<Rectangle, Triangle>&}; __detail::__variant::__visit_result_t<_Visitor, _Variants ...> = double]’
 1882 |             return std::__do_visit<_Tag>(
      |                    ~~~~~~~~~~~~~~~~~~~~~^
 1883 |               std::forward<_Visitor>(__visitor),
      |               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1884 |               static_cast<_Vp>(__variants)...);
      |               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
example.cc:17:20:   required from here
   17 |   return std::visit(
      |          ~~~~~~~~~~^
   18 |       [](auto &&arg) -> double {
      |       ~~~~~~~~~~~~~~~~~~~~~~~~~~
   19 |         using T = std::decay_t<decltype(arg)>;
      |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   20 |         if constexpr (std::is_same_v<T, Rectangle>) {
      |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   21 |           return arg.width * arg.height;
      |           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   22 |         }
      |         ~
   23 |       },
      |       ~~
   24 |       shape);
      |       ~~~~~~
example.cc:23:7: error: no return statement in ‘constexpr’ function returning non-void
   23 |       },
      |       ^
example.cc: In lambda function:
example.cc:23:7: warning: control reaches end of non-void function [-Wreturn-type]
error[E0004]: non-exhaustive patterns: `&Shape::Triangle { .. }` not covered
 --> example.rs:8:15
  |
8 |         match self {
  |               ^^^^ pattern `&Shape::Triangle { .. }` not covered
  |
note: `Shape` defined here
 --> example.rs:1:6
  |
1 | enum Shape {
  |      ^^^^^
2 |     Rectangle { width: f64, height: f64 },
3 |     Triangle { base: f64, height: f64 },
  |     -------- not covered
  = note: the matched value is of type `&Shape`
help: ensure that all possible cases are being handled by adding a match arm with a wildcard pattern or an explicit pattern as shown
  |
12~             } => width * height,
13~             &Shape::Triangle { .. } => todo!(),
  |

Using unsafe Rust to avoid checking the discriminant

In situations where rewriting code to use the above approach is not possible, one can check the discriminant anyway and then use the unreachable! macro to avoid handling the impossible case. However, that still involves actually checking the discriminant. If the cost of checking the discriminant must be avoided, then the unsafe function unreachable_unchecked can be used to both avoid handling the case and to indicate to the compiler that the optimizer should assume that the case cannot be reached, so the discriminant check can be optimized away.

Much like how in the C++ example accessing an inactive variant is undefined behavior, reaching unreachable_unchecked is also undefined behavior.

enum Shape {
    Rectangle { width: f64, height: f64 },
    Triangle { base: f64, height: f64 },
}

impl Shape {
    fn area(&self) -> f64 {
        match self {
            Shape::Rectangle {
                width,
                height,
            } => width * height,
            Shape::Triangle { base, height } => {
                0.5 * base * height
            }
        }
    }
}

fn get_triangles() -> Vec<Shape> {
    vec![
        Shape::Triangle {
            base: 1.0,
            height: 1.0,
        },
        Shape::Triangle {
            base: 1.0,
            height: 1.0,
        },
    ]
}

use std::hint::unreachable_unchecked;

fn main() {
    let mut total_base = 0.0;
    for triangle in get_triangles() {
        match triangle {
            Shape::Triangle { base, .. } => {
                total_base += base;
            }
            _ => unsafe {
                unreachable_unchecked();
            },
        }
    }
}
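
For comparison, the safe alternative mentioned at the start of this section can be sketched as follows. The helper name total_base is hypothetical and introduced only for illustration. The discriminant is still checked, and a broken assumption results in a panic rather than undefined behavior.

```rust
enum Shape {
    Rectangle { width: f64, height: f64 },
    Triangle { base: f64, height: f64 },
}

// Hypothetical helper: sums the bases of shapes that
// the caller promises are all triangles.
fn total_base(triangles: &[Shape]) -> f64 {
    let mut total = 0.0;
    for shape in triangles {
        match shape {
            Shape::Triangle { base, .. } => {
                total += base;
            }
            // The discriminant is still checked here.
            // If the promise is broken, this panics
            // instead of causing undefined behavior.
            _ => unreachable!("expected a triangle"),
        }
    }
    total
}
```

Starting with unreachable! and switching to unreachable_unchecked only after profiling shows the check matters keeps the unsafe surface small.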

Inheritance and implementation reuse

Rust does not have inheritance and so the primary means of reuse of implementations in Rust are composition, aggregation, and generics.

However, Rust traits do support default methods, which resemble one simple case of using inheritance to reuse implementations. In the following example, two virtual methods support a third method whose implementation is provided by the abstract class.

#include <iostream>
#include <string>

class Device {
public:
    virtual void powerOn() = 0;
    virtual void powerOff() = 0;

    virtual void resetDevice() {
        std::cout << "Resetting device..." << std::endl;
        powerOff();
        powerOn();
    }

    virtual ~Device() {}
};

class Printer : public Device {
    bool powered = false;
public:
    void powerOn() override {
        this->powered = true;
        std::cout << "Printer is powered on." << std::endl;
    }

    void powerOff() override {
        this->powered = false;
        std::cout << "Printer is powered off." << std::endl;
    }
};

int main() {
    Printer myPrinter;
    myPrinter.resetDevice();
}
trait Device {
    fn power_on(&mut self);
    fn power_off(&mut self);

    fn reset_device(&mut self) {
        println!("Resetting device...");
        self.power_off();
        self.power_on();
    }
}

struct Printer {
    powered: bool,
}

impl Printer {
    fn new() -> Printer {
        Printer { powered: false }
    }
}

impl Device for Printer {
    fn power_on(&mut self) {
        self.powered = true;
        println!("Printer is powered on");
    }

    fn power_off(&mut self) {
        self.powered = false;
        println!("Printer is powered off");
    }
}

fn main() {
    let mut p = Printer::new();
    p.reset_device();
}

In practice, the resetDevice() method in the Device class might be made non-virtual in C++ if it is not expected that it will be overridden. In order to make it align with the Rust example, we have made it virtual here, since Rust traits can be used either for dynamic dispatch or static dispatch (with no vtable overhead in the static dispatch case).
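
The two dispatch modes can be sketched as follows, assuming a trimmed-down version of the Device trait from the example. The function names reset_static and reset_dynamic are hypothetical and exist only for illustration.

```rust
trait Device {
    fn power_on(&mut self);
    fn power_off(&mut self);

    fn reset_device(&mut self) {
        self.power_off();
        self.power_on();
    }
}

struct Printer {
    powered: bool,
}

impl Device for Printer {
    fn power_on(&mut self) {
        self.powered = true;
    }

    fn power_off(&mut self) {
        self.powered = false;
    }
}

// Static dispatch: a copy of this function is compiled
// for each concrete type D, and no vtable is involved.
fn reset_static<D: Device>(device: &mut D) {
    device.reset_device();
}

// Dynamic dispatch: a single compiled copy whose calls
// go through the vtable of the trait object.
fn reset_dynamic(device: &mut dyn Device) {
    device.reset_device();
}
```

The same trait definition serves both functions; the choice between them is made by the code that uses the trait, not by the trait itself.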

Rust traits differ from abstract classes in a few more ways. For example, Rust traits cannot define data members and cannot define private or protected methods. This limits the effectiveness of using traits to implement the template method pattern.

Rust traits also cannot be privately implemented. Anywhere that both a trait and a type that implements that trait are visible, the methods of the trait are visible as methods on the type.

Traits can, however, inherit from each other, including multiple inheritance. As in modern C++, inheritance hierarchies in Rust tend to be shallow. Even in situations with complex multiple inheritance, the diamond problem cannot arise in Rust because traits cannot override other traits' implementations. Therefore, all paths to a common parent trait resolve to the same implementation.
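
A minimal sketch of a diamond-shaped trait hierarchy is shown below. All of the names (Named, Scanner, Copier, MultiFunction, OfficeMachine, describe) are hypothetical and chosen only for illustration.

```rust
trait Named {
    fn name(&self) -> String;
}

// Both traits inherit from Named via a supertrait bound.
trait Scanner: Named {
    fn scan(&self) -> String {
        format!("{} scans", self.name())
    }
}

trait Copier: Named {
    fn make_copy(&self) -> String {
        format!("{} copies", self.name())
    }
}

// Diamond shape:
// MultiFunction -> {Scanner, Copier} -> Named
trait MultiFunction: Scanner + Copier {}

struct OfficeMachine;

// Named is implemented exactly once for the type, so
// both paths to Named reach the same implementation.
impl Named for OfficeMachine {
    fn name(&self) -> String {
        "OfficeMachine".to_string()
    }
}

impl Scanner for OfficeMachine {}
impl Copier for OfficeMachine {}
impl MultiFunction for OfficeMachine {}

fn describe<M: MultiFunction>(m: &M) -> Vec<String> {
    vec![m.name(), m.scan(), m.make_copy()]
}
```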

Template classes, functions, and methods

The most common uses of templates in C++ are to define classes, methods, traits, or functions that work for any type (or at least for any type that provides certain methods). This use case is common in the STL for container classes (such as <vector>) and for the algorithms library (<algorithm>).

The following example defines a template for a directed graph represented as an adjacency list, where the graph is generic in the type of the labels on the nodes. Though the example shows a template class, the same comparisons with Rust apply to template methods and template functions.

The same kind of reusable code can be created in Rust using generic types.

#include <stdexcept>
#include <vector>

template <typename Label>
class DirectedGraph {
  std::vector<std::vector<size_t>> adjacencies;
  std::vector<Label> nodeLabels;

public:
  size_t addNode(Label label) {
    adjacencies.push_back(std::vector<size_t>());
    nodeLabels.push_back(label);
    return numNodes() - 1;
  }

  void addEdge(size_t from, size_t to) {
    size_t numNodes = this->numNodes();
    if (from >= numNodes || to >= numNodes) {
      throw std::invalid_argument(
          "Node index out of range");
    }
    adjacencies[from].push_back(to);
  }

  size_t numNodes() const {
    return adjacencies.size();
  }
};
#![allow(unused)]
fn main() {
pub struct DirectedGraph<Label> {
    adjacencies: Vec<Vec<usize>>,
    node_labels: Vec<Label>,
}

impl<Label> DirectedGraph<Label> {
    pub fn new() -> Self {
        DirectedGraph {
            adjacencies: Vec::new(),
            node_labels: Vec::new(),
        }
    }

    pub fn add_node(
        &mut self,
        label: Label,
    ) -> usize {
        self.adjacencies.push(Vec::new());
        self.node_labels.push(label);
        self.num_nodes() - 1
    }

    pub fn add_edge(
        &mut self,
        from: usize,
        to: usize,
    ) -> Result<(), &str> {
        let num_nodes = self.num_nodes();
        if from >= num_nodes || to >= num_nodes {
            Err("Node index out of range.")
        } else {
            self.adjacencies[from].push(to);
            Ok(())
        }
    }

    pub fn num_nodes(&self) -> usize {
        self.node_labels.len()
    }
}
}

In the use case demonstrated in the above example, there are few practical differences between using a C++ template to define a class and using Rust's generics to define a struct. Whenever one would use a template that takes a typename or class parameter in C++, one can instead take a type parameter in Rust.

Operations on the parameterized type

The differences become more apparent when one attempts to perform operations on the values. The following code listing adds a method to get the smallest node in the graph to both the Rust and the C++ examples.

#include <optional>
#include <stdexcept>
#include <vector>

template <typename Label>
class DirectedGraph {
  std::vector<std::vector<size_t>> adjacencies;
  std::vector<Label> nodeLabels;

public:
  size_t addNode(Label label) {
    adjacencies.push_back(std::vector<size_t>());
    nodeLabels.push_back(label);
    return numNodes() - 1;
  }

  void addEdge(size_t from, size_t to) {
    size_t numNodes = this->numNodes();
    if (from >= numNodes || to >= numNodes) {
      throw std::invalid_argument(
          "Node index out of range");
    }
    adjacencies[from].push_back(to);
  }

  size_t numNodes() const {
    return adjacencies.size();
  }

  std::optional<size_t> smallestNode() {
    if (nodeLabels.empty()) {
      return std::nullopt;
    }
    // A pointer is used so that reseating `least`
    // does not overwrite nodeLabels[0].
    const Label *least = &nodeLabels[0];
    size_t index = 0;

    for (size_t i = 1; i < nodeLabels.size(); i++) {
      if (*least > nodeLabels[i]) {
        least = &nodeLabels[i];
        index = i;
      }
    }
    return std::optional(index);
  }
};
#![allow(unused)]
fn main() {
pub struct DirectedGraph<Label> {
    adjacencies: Vec<Vec<usize>>,
    node_labels: Vec<Label>,
}

impl<Label> DirectedGraph<Label> {
    pub fn new() -> Self {
        DirectedGraph {
            adjacencies: Vec::new(),
            node_labels: Vec::new(),
        }
    }

    pub fn add_node(
        &mut self,
        label: Label,
    ) -> usize {
        self.adjacencies.push(Vec::new());
        self.node_labels.push(label);
        self.num_nodes() - 1
    }

    pub fn num_nodes(&self) -> usize {
        self.node_labels.len()
    }

    pub fn add_edge(
        &mut self,
        from: usize,
        to: usize,
    ) -> Result<(), &str> {
        if from >= self.num_nodes()
            || to >= self.num_nodes()
        {
            Err("Node not in graph.")
        } else {
            self.adjacencies[from].push(to);
            Ok(())
        }
    }
    pub fn smallest_node(&self) -> Option<usize>
    where
        Label: Ord,
    {
        // Matches the C++, but is not the idiomatic
        // implementation!
        if self.node_labels.is_empty() {
            None
        } else {
            let mut least = &self.node_labels[0];
            let mut index = 0;
            for i in 1..self.node_labels.len() {
                if *least > self.node_labels[i] {
                    least = &self.node_labels[i];
                    index = i;
                }
            }
            Some(index)
        }
    }
}
}

The major difference between these implementations is that in the C++ version operator> is used on the values without knowing whether the operator is defined for the type. In the Rust version, there is a constraint requiring that the Label type implement the Ord trait. (See the chapter on concepts, interfaces, and static dispatch for more details on Rust traits and how they relate to C++ concepts.)

Unlike C++ templates, generic definitions in Rust are type checked at the point of definition rather than at the point of use. This means that for operations to be used on values with the type of a type parameter, the parameter has to be constrained to types that implement some trait. As can be seen in the above example, much like with C++ concepts and requires, the constraint can be required for individual methods rather than for the whole generic class.

It is best practice in Rust to put the trait bounds on the specific things that require the bounds, in order to make the overall use of the types more flexible.
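
The flexibility gained by bounding only the method can be sketched as follows, using a trimmed-down graph that keeps only the labels. The Opaque type is hypothetical and deliberately does not implement Ord.

```rust
pub struct DirectedGraph<Label> {
    node_labels: Vec<Label>,
}

impl<Label> DirectedGraph<Label> {
    pub fn new() -> Self {
        DirectedGraph {
            node_labels: Vec::new(),
        }
    }

    // No bound: available for every label type.
    pub fn add_node(&mut self, label: Label) -> usize {
        self.node_labels.push(label);
        self.node_labels.len() - 1
    }

    // The bound is only on this method, so only this
    // method requires the labels to be ordered.
    pub fn smallest_node(&self) -> Option<usize>
    where
        Label: Ord,
    {
        self.node_labels
            .iter()
            .enumerate()
            .map(|(i, l)| (l, i))
            .min()
            .map(|(_, i)| i)
    }
}

// Hypothetical label type without an ordering.
pub struct Opaque;
```

A DirectedGraph<Opaque> can still be created and have nodes added to it; only a call to smallest_node on such a graph would be rejected by the compiler. Had the bound been placed on the impl block instead, the graph would not have been usable with Opaque at all.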

As an aside, a more idiomatic implementation of smallest_node makes use of Rust's iterators. This style of implementation may take some getting used to for programmers more accustomed to implementations in the style used in the earlier example.

#![allow(unused)]
fn main() {
pub struct DirectedGraph<Label> {
    adjacencies: Vec<Vec<usize>>,
    node_labels: Vec<Label>,
}

impl<Label> DirectedGraph<Label> {
    pub fn new() -> Self {
        DirectedGraph {
            adjacencies: Vec::new(),
            node_labels: Vec::new(),
        }
    }

    pub fn add_node(
        &mut self,
        label: Label,
    ) -> usize {
        self.adjacencies.push(Vec::new());
        self.node_labels.push(label);
        self.num_nodes() - 1
    }

    pub fn num_nodes(&self) -> usize {
        self.node_labels.len()
    }

    pub fn add_edge(
        &mut self,
        from: usize,
        to: usize,
    ) -> Result<(), &str> {
        if from >= self.num_nodes()
            || to >= self.num_nodes()
        {
            Err("Node not in graph.")
        } else {
            self.adjacencies[from].push(to);
            Ok(())
        }
    }
    pub fn smallest_node(&self) -> Option<usize>
    where
        Label: Ord,
    {
        self.node_labels
            .iter()
            .enumerate()
            .map(|(i, l)| (l, i))
            .min()
            .map(|(_, i)| i)
    }
}
}

An even more idiomatic implementation would make use of the itertools crate.

// Itertools provides the position_min method.
use itertools::Itertools;

pub struct DirectedGraph<Label> {
    adjacencies: Vec<Vec<usize>>,
    node_labels: Vec<Label>,
}

impl<Label> DirectedGraph<Label> {
    pub fn new() -> Self {
        DirectedGraph {
            adjacencies: Vec::new(),
            node_labels: Vec::new(),
        }
    }

    pub fn add_node(
        &mut self,
        label: Label,
    ) -> usize {
        self.adjacencies.push(Vec::new());
        self.node_labels.push(label);
        self.num_nodes() - 1
    }

    pub fn num_nodes(&self) -> usize {
        self.node_labels.len()
    }

    pub fn add_edge(
        &mut self,
        from: usize,
        to: usize,
    ) -> Result<(), &str> {
        if from >= self.num_nodes()
            || to >= self.num_nodes()
        {
            Err("Node not in graph.")
        } else {
            self.adjacencies[from].push(to);
            Ok(())
        }
    }

    pub fn smallest_node(&self) -> Option<usize>
    where
        Label: Ord,
    {
        self.node_labels.iter().position_min()
    }
}

constexpr template parameters

Rust also supports the equivalent of constexpr template parameters. For example, one can define a generic function that returns an array of consecutive integers starting from a specific value and whose size is determined at compile time.

#include <array>
#include <cstddef>

template <size_t N>
std::array<int, N>
makeSequentialArray(int start) {
  std::array<int, N> arr;
  for (size_t i = 0; i < N; i++) {
    arr[i] = start + static_cast<int>(i);
  }
  return arr;
}
#![allow(unused)]
fn main() {
fn make_sequential_array<const N: usize>(
    start: i32,
) -> [i32; N] {
    std::array::from_fn(|i| start + i as i32)
}
}

The corresponding idiomatic Rust function uses the helper std::array::from_fn to construct the array. from_fn itself takes the element type and the array length as generic parameters. Those arguments are elided here because Rust can infer both from the type of the produced array.
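
For illustration, the elided arguments can also be written out explicitly. The following sketch is the same function with the generic arguments to from_fn spelled out; the trailing underscore leaves the closure type to be inferred, since closure types cannot be named.

```rust
fn make_sequential_array<const N: usize>(
    start: i32,
) -> [i32; N] {
    // Element type and length given explicitly.
    std::array::from_fn::<i32, N, _>(|i| {
        start + i as i32
    })
}
```

A caller can supply N either through the expected type, as in `let a: [i32; 4] = make_sequential_array(3);`, or directly, as in `make_sequential_array::<4>(3)`.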

Rust's Self type

Within a Rust struct definition, impl block, or trait impl block, there is a Self type that is in scope. The Self type is the type being defined with all of the generic type parameters filled in. It can be useful to refer to this type especially in cases where there are many parameters that would otherwise have to be listed out.

The Self type is necessary when defining generic traits to refer to the concrete implementing type. Because Rust does not have inheritance between concrete types and does not have method overriding, this is sufficient to avoid the need to pass the implementing type as a type parameter.

For examples of this, see the chapter on the curiously recurring template pattern.
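
As a small sketch of both uses of Self described in this section, consider the following. The Pair type and the Doubled trait are hypothetical names introduced only for illustration.

```rust
struct Pair<A, B> {
    first: A,
    second: B,
}

impl<A, B> Pair<A, B> {
    // Here `Self` stands for `Pair<A, B>`.
    fn new(first: A, second: B) -> Self {
        Self { first, second }
    }
}

// In a trait, `Self` names whichever concrete type
// implements the trait.
trait Doubled {
    fn doubled(&self) -> Self;
}

impl Doubled for i32 {
    fn doubled(&self) -> Self {
        self * 2
    }
}

impl<A: Doubled, B: Doubled> Doubled for Pair<A, B> {
    fn doubled(&self) -> Self {
        Self::new(
            self.first.doubled(),
            self.second.doubled(),
        )
    }
}
```

Note that the Doubled trait does not need a type parameter for the implementing type, which is the role that the template parameter plays in the C++ pattern.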

A note on type checking and type errors

The checking of generic types at the point of definition rather than at the point of template expansion impacts when errors are detected and how they are reported. Some of this difference remains even when C++ concepts are used consistently to declare the required operations.

For example, one might accidentally make the nodeLabels member a vector of size_t instead of a vector of the label parameter. If all of the test cases for the graph used label types that were convertible to integers, the error would not be detected.

A similar Rust program fails to compile, even without a function that instantiates the generic structure with a concrete type.

#include <stdexcept>
#include <vector>

template <typename Label>
class DirectedGraph {
  std::vector<std::vector<size_t>> adjacencies;
  // The mistake is here: size_t should be Label
  std::vector<size_t> nodeLabels;

public:
  Label getNode(size_t nodeId) {
    return nodeLabels[nodeId];
  }

  size_t addNode(Label label) {
    adjacencies.push_back(std::vector<size_t>());
    nodeLabels.push_back(label);
    return numNodes() - 1;
  }

  size_t numNodes() const {
    return adjacencies.size();
  }
};

#define BOOST_TEST_MODULE DirectedGraphTests
#include <boost/test/included/unit_test.hpp>

BOOST_AUTO_TEST_CASE(test_add_node_int) {
  DirectedGraph<int> g;
  auto n1 = g.addNode(1);
  BOOST_CHECK_EQUAL(1, g.getNode(n1));
}

BOOST_AUTO_TEST_CASE(test_add_node_float) {
  DirectedGraph<float> g;
  float label = 1.0f;
  auto n1 = g.addNode(label);
  BOOST_CHECK_CLOSE(label, g.getNode(n1), 0.0001);
}
pub struct DirectedGraph<Label> {
    adjacencies: Vec<Vec<usize>>,
    // The mistake is here: usize should be Label
    node_labels: Vec<usize>,
}

impl<Label> DirectedGraph<Label> {
    pub fn new() -> Self {
        DirectedGraph {
            adjacencies: Vec::new(),
            node_labels: Vec::new(),
        }
    }

    pub fn get_node(
        &self,
        node_id: usize,
    ) -> Option<&Label> {
        self.node_labels.get(node_id)
    }

    pub fn add_node(
        &mut self,
        label: Label,
    ) -> usize {
        self.adjacencies.push(Vec::new());
        self.node_labels.push(label);
        self.num_nodes() - 1
    }

    pub fn num_nodes(&self) -> usize {
        self.node_labels.len()
    }
}

Despite the error, the C++ example compiles and passes the tests.

Running 2 test cases...

*** No errors detected

Even without test cases, the Rust example fails to compile and produces a message useful for identifying the error.

error[E0308]: mismatched types
    --> example.rs:26:31
     |
6    | impl<Label> DirectedGraph<Label> {
     |      ----- found this type parameter
...
26   |         self.node_labels.push(label);
     |                          ---- ^^^^^ expected `usize`, found type parameter `Label`
     |                          |
     |                          arguments to this method are incorrect
     |
     = note:        expected type `usize`
             found type parameter `Label`

Lifetime parameters

Rust's generics are also used for types, methods, traits, and functions that are generic in the lifetimes of the references they manipulate. Unlike with type parameters, using a function with different lifetimes does not cause additional copies of the function to be generated in the compiled code, because lifetimes do not impact the runtime representation.

The chapter on concepts includes examples of how lifetimes interact with Rust's generics.
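
As a brief sketch, the following function and struct are each generic in a lifetime. The names longer, FirstWord, and first_word are hypothetical and used only for illustration.

```rust
// Generic in the lifetime 'a: the result borrows from
// one of the two inputs. Calls with different
// lifetimes all use the same compiled function.
fn longer<'a>(x: &'a str, y: &'a str) -> &'a str {
    if y.len() > x.len() {
        y
    } else {
        x
    }
}

// A struct that is generic in the lifetime of the
// reference it holds.
struct FirstWord<'a> {
    word: &'a str,
}

fn first_word<'a>(text: &'a str) -> FirstWord<'a> {
    FirstWord {
        word: text
            .split_whitespace()
            .next()
            .unwrap_or(""),
    }
}
```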

Conditional compilation

One significant difference between C++ templates and Rust generics is that C++ templates are actually a more general purpose macro language, supporting things like conditional compilation (e.g., when used in conjunction with if constexpr, requires, or std::enable_if). Rust supports these use cases with its macro system, which differs significantly from C++. The most common use of the macro system, conditional compilation, is provided by the cfg attribute and cfg! macro.

The separation of conditional compilation from generics in Rust involves similar design considerations as the omission of template specialization from Rust.
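
A minimal sketch of the cfg attribute and the cfg! macro is shown below. The function names path_separator and build_kind are hypothetical; target_os and debug_assertions are standard configuration options.

```rust
// Only one of these two definitions is compiled,
// depending on the target operating system.
#[cfg(target_os = "windows")]
fn path_separator() -> char {
    '\\'
}

#[cfg(not(target_os = "windows"))]
fn path_separator() -> char {
    '/'
}

// cfg! expands to a boolean constant. Unlike the
// attribute, both branches are still type checked.
fn build_kind() -> &'static str {
    if cfg!(debug_assertions) {
        "debug"
    } else {
        "release"
    }
}
```

The attribute form removes the item from the program entirely, which is closer to #ifdef in C++, while the macro form is closer to an ordinary if over a compile-time constant.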

Template specialization

Template specialization in C++ makes it possible for a template entity to have different implementations for different parameters. Most STL implementations make use of this to, for example, provide a space-efficient representation of std::vector<bool>.

Because of the possibility of template specialization, when a C++ function operates on values of a template class like std::vector, the function is essentially defined in terms of the interface provided by the template class, rather than for a specific implementation.

To accomplish the same thing in Rust requires defining the function in terms of a trait for the interface against which it operates. This enables clients to select their choice of representation for data by using any concrete type that implements the interface.

This is more practical to do in Rust than in C++, because generics not being a general metaprogramming facility means that generic entities can be type checked locally, making them easier to define. It is more common to do in Rust than in C++ because Rust does not have implementation inheritance, so there is a sharper line between interface and implementation than there is in C++.

The following example shows how a Rust function can be implemented so that different concrete representations can be selected by a client. For a compact bit vector representation, the example uses the BitVec type from the bitvec crate. BitVec is intended to provide an API similar to Vec<bool> or std::vector<bool>.

#include <string>
#include <vector>

template <typename T>
void push_if_even(int n,
                  std::vector<T> &collection,
                  T item) {
  if (n % 2 == 0) {
    collection.push_back(item);
  }
}

int main() {
  // Operate on the default std::vector
  // implementation
  std::vector<std::string> v{"a", "b"};
  push_if_even(2, v, std::string("c"));

  // Operate on the (likely space-optimized)
  // std::vector implementation
  std::vector<bool> bv{false, true};
  push_if_even(2, bv, false);
}
// The Extend trait is for types that support
// appending values to the collection.
fn push_if_even<T, I: Extend<T>>(
    n: u32,
    collection: &mut I,
    item: T,
) {
    if n % 2 == 0 {
        collection.extend([item]);
    }
}

use bitvec::prelude::*;

fn main() {
    // Operate on Vec
    let mut v =
        vec!["a".to_string(), "b".to_string()];
    push_if_even(2, &mut v, "c".to_string());

    // Operate on BitVec
    let mut bv = bitvec![0, 1];
    push_if_even(2, &mut bv, false);
}

Trade-offs between generics and templates

Because generic functions can only interact with generic values in ways defined by the trait bounds, it is easier to test generic implementations. In particular, code testing a generic implementation only has to consider the possible behaviors of the given trait.

For a comparison, consider the following programs.

#include <concepts>

template <std::totally_ordered T>
T max(const T &x, const T &y) {
  return (x > y) ? x : y;
}

template <>
int max(const int &x, const int &y) {
  return (x > y) ? x + 1 : y + 1;
}
#![allow(unused)]
fn main() {
fn max<'a, T: Ord>(x: &'a T, y: &'a T) -> &'a T {
    if x > y {
        x
    } else {
        y
    }
}
}

In the Rust program, parametricity means that (assuming safe Rust) from the type alone one can tell that if the function returns, it must return exactly one of x or y. This is because the trait bound Ord doesn't give any way to construct new values of type T, and the use of references doesn't give any way for the function to store one of x or y from an earlier call to return in a later call.

In the C++ program, a call to max with int as the template parameter will give a distinctly different result than with any other parameter because of the template specialization enabling the behavior of the function to vary based on the type.

The trade-off is that in Rust specialized implementations are harder to use because they must have different names, but generic implementations are easier to write because one can be more confident about their correctness.

Niche optimization

There are several cases where the Rust compiler will perform optimizations to achieve more efficient representations. Those situations are all ones where the efficiency gains do not otherwise change the observable behavior of the code.

The most common case is with the Option type. When Option is used with a type where the compiler can tell that there are unused values, one of those unused values will be used to represent the None case, so that Option<T> will not require an extra word of memory to indicate the discriminant of the enum.

This optimization is applied to reference types (& and &mut), since references cannot be null. It is also applied to NonNull<T>, which represents a non-null pointer to a value of type T, and to NonZeroU8 and other non-zero integral types. The optimization for the reference case is what makes Option<&T> and Option<&mut T> safer equivalents to using non-owning observation pointers in C++.
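
The effect of the optimization can be observed with std::mem::size_of. The following sketch wraps the comparisons in two hypothetical helper functions, niche_sizes_match and option_u8_is_larger.

```rust
use std::mem::size_of;
use std::num::NonZeroU8;
use std::ptr::NonNull;

// Each Option here is the same size as the value it
// wraps, because the null or zero value encodes None.
fn niche_sizes_match() -> bool {
    size_of::<Option<&u32>>() == size_of::<&u32>()
        && size_of::<Option<&mut u32>>()
            == size_of::<&mut u32>()
        && size_of::<Option<NonNull<u32>>>()
            == size_of::<NonNull<u32>>()
        && size_of::<Option<NonZeroU8>>()
            == size_of::<u8>()
}

// A plain u8 uses all 256 bit patterns, so there is
// no spare value and Option<u8> needs extra space.
fn option_u8_is_larger() -> bool {
    size_of::<Option<u8>>() > size_of::<u8>()
}
```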

Null (nullptr)

This section covers idiomatic uses of nullptr in C++ and how to achieve the same results in Rust.

Some uses of nullptr in C++ don't arise in the first place in Rust because of other language differences. For example, moved objects don't leave anything behind that needs to be destroyed. Therefore there is no need to use nullptr as a placeholder for a moved pointer that can have delete or free called on it.

Other uses are replaced by Option, which in safe Rust requires checking for the empty case before accessing the contained value. This use is common enough that Rust optimizes the representation of Option when it is used with a reference (& or &mut), a Box (the equivalent of unique_ptr), or a NonNull (a non-null raw pointer).

Sentinel values

A sentinel value is an in-band value that indicates a special situation, such as having reached the end of valid data in an iterator.

nullptr

Many designs in C++ borrow the convention from C of using a null pointer as a sentinel value for a method that returns owned pointers. For example, a method that parses a large structure may produce nullptr in the case of failure.

A similar situation in Rust would make use of the type Option<Box<LargeStructure>>.

#include <memory>

class LargeStructure {
  int field;
  // many fields ...
};

std::unique_ptr<LargeStructure>
parse(char *data, size_t len) {
  // ...

  // on failure
  return nullptr;
}
#![allow(unused)]
fn main() {
struct LargeStructure {
    field: i32,
    // many fields ...
}

fn parse(
    data: &[u8],
) -> Option<Box<LargeStructure>> {
    // ...

    // on failure
    None
}
}

The Box<T> type has the same meaning as std::unique_ptr<T> in terms of being a uniquely owned pointer to some T on the heap, but unlike std::unique_ptr, it cannot be null. Rust's Option<T> is like std::optional<T> in C++, except that it can also be used with references. When Option is used with Box or with references (and in some other cases), the compiler optimizes the representation to be the same size as the wrapped pointer by leveraging the fact that the pointer cannot be null.

In Rust it is also common to pay the cost for the extra byte to use a return type of Result<T, E> (which is akin to std::expected in C++23) in order to make the reason for the failure available at runtime.
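
A sketch of the Result-returning variant is shown below. The ParseError type and the rule that valid input starts with the byte 'L' are hypothetical, invented only so that the example has something to check.

```rust
// Hypothetical error type describing why parsing failed.
#[derive(Debug, PartialEq)]
enum ParseError {
    Empty,
    BadHeader(u8),
}

#[derive(Debug, PartialEq)]
struct LargeStructure {
    field: i32,
    // many fields ...
}

fn parse(
    data: &[u8],
) -> Result<Box<LargeStructure>, ParseError> {
    match data.first() {
        None => Err(ParseError::Empty),
        // Assumed format: input must start with 'L'.
        Some(&b'L') => Ok(Box::new(LargeStructure {
            field: data.len() as i32,
        })),
        Some(&other) => {
            Err(ParseError::BadHeader(other))
        }
    }
}
```

The caller can then match on the error to distinguish the failure cases, or use the ? operator to propagate it.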

Integer sentinels

When a possibly-failing function produces an integer, it is also common to use an otherwise unused or unlikely integer value as a sentinel value, such as 0 or INT_MAX.

In Rust, the Option type is used for this purpose. In cases where the zero value really is not possible to produce, as with the gcd algorithm below, the type NonZero<T> can be used to indicate that fact. As with Option<Box<T>>, the compiler optimizes the representation to make use of the unused value (in this case 0) to represent the None case, so that Option<NonZero<T>> is the same size as T.

#include <cstdlib>

int gcd(int a, int b) {
  if (b == 0 || a == 0) {
    // returns 0 to indicate invalid input
    return 0;
  }

  while (b != 0) {
    int temp = b;
    b = a % b;
    a = temp;
  }
  return std::abs(a);
}
use std::num::NonZero;

fn gcd(
    mut a: i32,
    mut b: i32,
) -> Option<NonZero<i32>> {
    if a == 0 || b == 0 {
        return None;
    }

    while b != 0 {
        let temp = b;
        b = a % b;
        a = temp;
    }
    // At this point, a is guaranteed to not be
    // zero. The `Some` case from `NonZero::new`
    // has a different meaning than the `Some`
    // returned from this function, but here it
    // happens to coincide.
    NonZero::new(a.abs())
}

fn main() {
    assert!(gcd(5, 0) == None);
    assert!(gcd(0, 5) == None);
    assert!(gcd(5, 1) == NonZero::new(1));
    assert!(gcd(1, 5) == NonZero::new(1));
    assert!(gcd(2 * 2 * 3 * 5 * 7, 2 * 2 * 7 * 11) == NonZero::new(2 * 2 * 7));
    assert!(gcd(2 * 2 * 7 * 11, 2 * 2 * 3 * 5 * 7) == NonZero::new(2 * 2 * 7));
}

As an aside, it is also possible to avoid the redundant check for zero at the end, and without using unsafe Rust, by preserving the non-zeroness property throughout the algorithm.

use std::num::NonZero;

fn gcd(x: i32, mut b: i32) -> Option<NonZero<i32>> {
    if b == 0 {
        return None;
    }

    // a is guaranteed to be non-zero, so we record the fact in the type of a.
    let mut a = NonZero::new(x)?;

    while let Some(temp) = NonZero::new(b) {
        b = a.get() % b;
        a = temp;
    }
    Some(a.abs())
}

fn main() {
    assert!(gcd(5, 0) == None);
    assert!(gcd(0, 5) == None);
    assert!(gcd(5, 1) == NonZero::new(1));
    assert!(gcd(1, 5) == NonZero::new(1));
    assert!(gcd(2 * 2 * 3 * 5 * 7, 2 * 2 * 7 * 11) == NonZero::new(2 * 2 * 7));
    assert!(gcd(2 * 2 * 7 * 11, 2 * 2 * 3 * 5 * 7) == NonZero::new(2 * 2 * 7));
}

std::optional

In situations where std::optional would be used as a sentinel value in C++, Option can be used for the same purpose in Rust. The main difference between the two is that safe Rust requires explicitly handling the None case before the contained value can be accessed, while in C++ one can attempt to access the value without checking (at the risk of undefined behavior).
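
The following sketch shows two common ways of handling the None case. The function names first_even, describe, and first_even_or_zero are hypothetical.

```rust
fn first_even(values: &[i32]) -> Option<i32> {
    values.iter().copied().find(|v| v % 2 == 0)
}

fn describe(values: &[i32]) -> String {
    // The contained value is only reachable in the
    // `Some` arm, so the check cannot be forgotten.
    match first_even(values) {
        Some(v) => format!("first even value: {v}"),
        None => String::from("no even values"),
    }
}

// Alternatively, supply a fallback for the None case.
fn first_even_or_zero(values: &[i32]) -> i32 {
    first_even(values).unwrap_or(0)
}
```

Methods such as Option::unwrap skip the explicit match, but they panic on None rather than causing undefined behavior.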

Moved members

Moving values out of variables or fields in Rust is more explicit than it is in C++. A value that might be moved with nothing left behind needs to be represented using an Option<Box<T>> type in Rust, while in C++ it would just be a std::unique_ptr<T>.

#include <memory>
#include <mutex>

// std::mutex is not copyable, so it is passed by reference.
void readMailbox(std::unique_ptr<int> &mailbox,
                 std::mutex &mailboxMutex) {
  std::lock_guard<std::mutex> guard(mailboxMutex);

  if (!mailbox) {
    return;
  }
  int x = *mailbox;
  mailbox = nullptr;
  // use x
}
#![allow(unused)]
fn main() {
use std::sync::Arc;
use std::sync::Mutex;

fn read(mailbox: Arc<Mutex<Option<i32>>>) {
    let Ok(mut guard) = mailbox.lock() else {
        return;
    };
    // Leaves None behind in the mailbox.
    let x = guard.take();
    // use x
}
}

Additionally, when taking ownership of a value from within a mutable reference, something has to be left in its place. This can be done using std::mem::swap, and many container-like types have methods for making common ownership-swapping more ergonomic, like Option::take as seen in the earlier example, Option::replace or Vec::swap.
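
These helpers can be sketched together as follows. The Mailbox struct and its methods are hypothetical; std::mem::replace, which is not mentioned above, is a standard library function closely related to std::mem::swap.

```rust
use std::mem;

struct Mailbox {
    letter: Option<String>,
    label: String,
}

impl Mailbox {
    // Takes the letter, leaving None in its place.
    fn collect(&mut self) -> Option<String> {
        self.letter.take()
    }

    // Stores a new letter, returning the old one.
    fn deliver(&mut self, new: String) -> Option<String> {
        self.letter.replace(new)
    }

    // Takes the label, leaving the new label behind.
    fn relabel(&mut self, new: String) -> String {
        mem::replace(&mut self.label, new)
    }
}

// Exchanges the labels without cloning either one.
fn swap_labels(a: &mut Mailbox, b: &mut Mailbox) {
    mem::swap(&mut a.label, &mut b.label);
}
```

In each case the field is never left uninitialized: ownership of the old value moves out at the same moment a valid value moves in.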

Deleting moved objects

Another common use of null pointers in modern C++ is as values for the members of moved objects so that the destructor can still safely be called. E.g.,

#include <cstdlib>
#include <cstring>

// widget.h
struct widget_t;
widget_t *alloc_widget();
void free_widget(widget_t*);
void copy_widget(widget_t* dst, widget_t* src);

// widget.cc
class Widget {
    widget_t* widget;
public:
    Widget() : widget(alloc_widget()) {}

    Widget(const Widget &other) : widget(alloc_widget()) {
        copy_widget(widget, other.widget);
    }

    Widget(Widget &&other) : widget(other.widget) {
        other.widget = nullptr;
    }

    ~Widget() {
        free_widget(widget);
    }
};

Rust's notion of moving objects does not involve leaving behind an object on which a destructor will be called, and so this use of null does not have a corresponding idiom. See the chapter on copy and move constructors for more details.
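
To illustrate, the sketch below uses a hypothetical Widget type that counts destructor calls instead of freeing a C resource. After the move, only the destination is dropped, so no null-like placeholder is needed in the source.

```rust
use std::cell::Cell;

// Hypothetical stand-in for the widget wrapper. It counts how many times
// its destructor runs instead of freeing a C resource.
struct Widget<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Widget<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Returns the number of destructor calls observed after one move.
fn drops_after_move() -> u32 {
    let drops = Cell::new(0);
    {
        let a = Widget { drops: &drops };
        // Ownership moves to _b. The variable a can no longer be used,
        // and no destructor will run for it.
        let _b = a;
    } // Only _b is dropped here.
    drops.get()
}

fn main() {
    assert_eq!(drops_after_move(), 1);
}
```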

Zero-length arrays

In C++ codebases that are written in a C style or that make use of C libraries, null pointers may be used to represent empty arrays.

In Rust, arrays of arbitrary size are represented as slices. These slices can have zero length. Since Rust vectors are convertible to slices, defining functions that work with slices enables them to be used with vectors as well.

#include <cstddef>
#include <cassert>

int c_style_sum(std::size_t len, int arr[]) {
    int sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += arr[i];
    }
    return sum;
}

int main() {
    int sum = c_style_sum(0, nullptr);
    assert(sum == 0);
}
fn sum_slice(arr: &[i32]) -> i32 {
    let mut sum = 0;
    for x in arr {
        sum += x;
    }
    sum
}

fn main() {
    let sum = sum_slice(&[]);
    assert!(sum == 0);

    let sum2 = sum_slice(&vec![]);
    assert!(sum2 == 0);
}

Encapsulation

In C++ the encapsulation boundary is the class. In Rust the encapsulation boundary is the module, which may contain several types along with standalone functions. In larger projects, the crate may also act as an encapsulation boundary.

This difference means that in Rust one is more likely to have multiple, tightly coupled types that work together which are defined in one module and encapsulated as a whole.

This section provides ways to translate between C++ and Rust's notions of encapsulation both mechanically and conceptually.

Header files

One use of header files in C++ is to expose declarations that are defined in one translation unit to other translation units without requiring the duplication of the declarations in multiple files. By convention, declarations that are not included in the header are considered to be private to the defining translation unit (though, to enforce this convention, other mechanisms, such as anonymous namespaces, are required).

In contrast, Rust uses neither textually-included header files nor forward declarations. Instead, Rust modules control visibility and linkage simultaneously and expose public definitions for use by other modules.

// person.h
class Person {
  std::string name;

public:
  Person(std::string name) : name(name) {}
  const std::string &getName();
};

// person.cc
#include <string>
#include "person.h"

const std::string &Person::getName() {
  return this->name;
}

// client.cc
#include <string>
#include "person.h"

int main() {
  Person p("Alice");
  const std::string &name = p.getName();

  // ...
}
// person.rs
pub struct Person {
    name: String,
}

impl Person {
    pub fn new(name: String) -> Person {
        Person { name }
    }

    pub fn name(&self) -> &String {
        &self.name
    }
}

// client.rs
mod person;

use person::*;

fn main() {
    let p = Person::new("Alice".to_string());
    // doesn't compile, private field
    // let name = p.name;
    let name = p.name();

    //...
}

In person.rs, the Person type is public but the name field is not. This prevents both direct construction of values of the type (similar to private members preventing aggregate initialization in C++) and direct field access. The static method Person::new(String) and the method Person::name() are exposed to clients of the module by the pub visibility declarations.

In the client module, the mod declaration defines the content of person.rs as a submodule named person. The use declaration brings the contents of the person module into scope.

The essence of the difference

A C++ program is a collection of translation units. Header files are required to make providing forward declarations of definitions from other translation units manageable.

A Rust program is a tree of modules. Definitions in one module may access items from other modules based on visibility declarations given in the definitions of the modules themselves.

Submodules and additional visibility features

Modules and visibility declarations are more powerful than shown in the above example. More details on how to use modules, pub, and use to achieve encapsulation goals are described in the chapter on private members and friends.
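
As a brief preview, the sketch below uses a hypothetical library module with a nested catalog module to show two of the restricted visibility forms, pub(crate) and pub(super). All names are made up for illustration.

```rust
mod library {
    pub mod catalog {
        pub struct Book {
            pub title: String,
            // Visible anywhere in this crate, but not to other crates.
            pub(crate) shelf: u32,
            // Visible only within the parent module (library) and its
            // descendants.
            pub(super) checked_out: bool,
        }

        pub fn new_book(title: &str, shelf: u32) -> Book {
            Book {
                title: title.to_string(),
                shelf,
                checked_out: false,
            }
        }
    }

    // Returns true if the book was available before this call.
    pub fn check_out(book: &mut catalog::Book) -> bool {
        // Allowed: checked_out is pub(super), and this is the parent module.
        let was_available = !book.checked_out;
        book.checked_out = true;
        was_available
    }
}

fn main() {
    let mut book = library::catalog::new_book("Rust", 3);
    assert!(library::check_out(&mut book));
    assert!(!library::check_out(&mut book));

    // Allowed: title is pub and shelf is pub(crate).
    println!("{} is on shelf {}", book.title, book.shelf);

    // Does not compile: checked_out is only visible inside library.
    // let _ = book.checked_out;
}
```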

Anonymous namespaces and static

Anonymous namespaces in C++ are used to avoid symbol collisions between different translation units. Such collisions violate the one definition rule and result in undefined behavior (which at best manifests as linking errors).

For example, without the use of anonymous namespaces, the following would result in undefined behavior (and no linking error, due to the use of inline producing weak symbols in the object files).

/// a.cc
namespace {
    inline void common_function_name() {
        // ...
    }
}

/// b.cc
namespace {
    inline void common_function_name() {
        // ...
    }
}

C++ static declarations are also used to achieve the same goal by making it so that a declaration has internal linkage (and so is not visible outside of the translation unit).

Rust avoids the linkage problem by controlling linkage and visibility simultaneously, with declarations always also being definitions. Instead of translation units, programs are structured in terms of modules, which provide both namespaces and visibility controls over definitions, enabling the Rust compiler to guarantee that symbol collision issues cannot happen.

The following Rust program achieves the same goal as the C++ program above, in terms of avoiding the collision of the two functions while making them available for use within the defining files.

#![allow(unused)]
fn main() {
// a.rs
mod a {
fn common_function_name() {
    // ...
}
}

// b.rs
mod b {
fn common_function_name() {
    // ...
}
}
}

Additionally,

  1. Unlike C++ namespaces, Rust modules (which provide namespacing as well as visibility controls) can only be defined once, and this is checked by the compiler.
  2. Each file defines a module which has to be explicitly included in the module hierarchy.
  3. Modules from Rust crates (libraries) are always qualified with some root module name, so they cannot conflict. If they would conflict, the root module name must be replaced with some user-chosen name.

Caveats about C interoperability

When using libraries not managed by Rust, the usual problems can occur if there are symbol collisions in the object files. This can arise when using C or C++ static or dynamic libraries. It can also arise when using Rust static or dynamic libraries built for use in C or C++ programs.

Rust provides #[unsafe(no_mangle)] to bypass name mangling in order to produce functions that can be easily referred to from C or C++. This can also cause undefined behavior due to name collision.
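
The sketch below contrasts an exported function with an ordinary one of the same name. The name phrasebook_add is hypothetical, and the example assumes a toolchain recent enough to accept the #[unsafe(no_mangle)] spelling.

```rust
// Exported under the exact symbol name phrasebook_add. If another object
// file linked into the program defines the same symbol, the usual C-style
// collision problems apply.
#[unsafe(no_mangle)]
pub extern "C" fn phrasebook_add(a: i32, b: i32) -> i32 {
    a.wrapping_add(b)
}

mod internal {
    // Not exported with a fixed name. The mangled symbol includes the crate
    // and module path, so it does not collide with the function above.
    pub fn phrasebook_add(a: i32, b: i32) -> i32 {
        a.wrapping_add(b)
    }
}

fn main() {
    assert_eq!(phrasebook_add(2, 3), 5);
    assert_eq!(internal::phrasebook_add(2, 3), 5);
}
```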

Private members and friends

Private members

In C++ the unit of encapsulation is the class. Access specifiers (private, protected, and public) that control access to members are enforced at the class boundary.

In Rust the module is the unit of encapsulation. Item visibility (Rust's analog to access specifiers) controls access to items at the module boundary.

#include <iostream>
#include <string>

class Person {
  int age;

public:
  std::string name;

  // Because age is private, a public constructor
  // method is needed to create instances.
  Person(std::string name, int age)
      : name(name), age(age) {}

  // Free functions cannot access private members,
  // so this has to be a member function.
  static void example() {
    Person alice{"Alice", 42};
    std::cout << alice.name << std::endl;
    // The private field is visible here, within
    // the class.
    std::cout << alice.age << std::endl;
  }
};

int main() {
  Person alice("Alice", 42);
  std::cout << alice.name << std::endl;
  // compilation error
  // std::cout << alice.age << std::endl;
}
mod person {
    pub struct Person {
        pub name: String,
        // this field is private
        age: i32,
    }

    impl Person {
        // Because age is private, a public
        // constructor method is needed to create
        // values outside of the person module.
        pub fn new(
            name: String,
            age: i32,
        ) -> Person {
            Person { name, age }
        }
    }

    // Free functions in the same module can
    // access private fields because the unit of
    // encapsulation is the module, not the
    // struct.
    fn example() {
        let alice =
            Person::new("Alice".to_string(), 42);
        println!("{}", alice.name);
        // The private field is visible here,
        // within the module.
        println!("{}", alice.age);
    }
}

use person::Person;

fn main() {
    let alice =
        Person::new("Alice".to_string(), 42);
    println!("{}", alice.name);
    // compilation error
    // println!("{}", alice.age);
}

In the Rust example, the struct-literal constructor for Person is effectively private to the person module because one of the fields is private.

Friends

Because encapsulation is at the module level in Rust, associated methods for types can access internals of other types defined in the same module. This subsumes most uses of the C++ friend declaration.

For example, defining a binary tree in C++ requires that the class representing the nodes of the tree declare the main binary tree class as a friend in order for it to access internal methods while keeping them private from other uses. This would be required even if the TreeNode class were defined as an inner class of BinaryTree.

In Rust, however, both types can be defined in the same module, and so have access to each other's private fields and methods. The module as a whole provides a collection of types, methods, and functions that together define an encapsulated concept.

#include <memory>

class BinaryTree {
  // This needs to be an inner class in order for
  // it to be private.
  class TreeNode {
    friend class BinaryTree;

    int value;
    std::unique_ptr<TreeNode> left;
    std::unique_ptr<TreeNode> right;

  public:
    TreeNode(int value)
        : value(value), left(nullptr),
          right(nullptr) {}

  private:
    static void
    insert(std::unique_ptr<TreeNode> &node,
           int value) {
      if (node) {
        node->insert(value);
      } else {
        node = std::make_unique<TreeNode>(value);
      }
    }

    void insert(int value) {
      if (value < this->value) {
        insert(this->left, value);
      } else {
        insert(this->right, value);
      }
    }
  };

  std::unique_ptr<TreeNode> root;

public:
  BinaryTree() : root(nullptr) {}

  void insert(int value) {
    TreeNode::insert(root, value);
  }
};

int main() {
  BinaryTree b;
  b.insert(42);

  return 0;
}
mod binary_tree {
    pub struct BinaryTree {
        // This field is not visible outside of
        // the module.
        root: Option<Box<TreeNode>>,
    }

    impl BinaryTree {
        pub fn new() -> BinaryTree {
            BinaryTree { root: None }
        }

        pub fn insert(&mut self, value: i32) {
            insert(&mut self.root, value);
        }
    }

    // This struct and all its fields are not
    // visible outside of the module.
    struct TreeNode {
        value: i32,
        left: Option<Box<TreeNode>>,
        right: Option<Box<TreeNode>>,
    }

    impl TreeNode {
        fn new(value: i32) -> TreeNode {
            TreeNode {
                value,
                left: None,
                right: None,
            }
        }

        fn insert(&mut self, value: i32) {
            if value < self.value {
                insert(&mut self.left, value);
            } else {
                insert(&mut self.right, value);
            }
        }
    }

    // This free function is not visible outside
    // of the module.
    fn insert(
        node: &mut Option<Box<TreeNode>>,
        value: i32,
    ) {
        match node {
            None => {
                *node = Some(Box::new(
                    TreeNode::new(value),
                ));
            }
            Some(child) => {
                child.insert(value);
            }
        }
    }
}

// This brings the (public) type into scope.
use binary_tree::BinaryTree;

fn main() {
    let mut b = BinaryTree::new();
    b.insert(42);
}

Passkey idiom

In the previous C++ example, the TreeNode constructor has to be public in order to be used with make_unique. Fortunately, the constructor is still inaccessible outside of the containing class, but it is not always the case that such helper classes can be inner classes.

To make the constructor effectively private when that is not possible, one might need to use a programming pattern like the passkey idiom.

The passkey idiom is also sometimes used to provide finer-grained control over access to members than is possible with friend declarations. In either case, the effect is achieved by modeling a capability-like system.

In Rust, it is possible to express the same idiom in order to achieve the same effect.

#include <iostream>
#include <memory>
#include <string>

class Person {
  int age;

  class Passkey {};

public:
  std::string name;

  Person(Passkey, std::string name, int age)
      : name(name), age(age) {}

  static std::unique_ptr<Person>
  createPerson(std::string name, int age) {
    // Other uses of make_unique are not possible
    // because the Passkey type cannot be
    // constructed.
    return std::make_unique<Person>(Passkey(),
                                    name, age);
  }
};
pub trait Maker<K, B> {
    fn make(passkey: K, args: B) -> Self;
}

// Generic helper that we want to be able to call
// an otherwise private function or method.
fn alloc_thing<K, B, T: Maker<K, B>>(
    passkey: K,
    args: B,
) -> Box<T> {
    Box::new(Maker::<K, B>::make(passkey, args))
}

mod person {
    use super::*;
    use std::marker::PhantomData;

    pub struct Person {
        pub name: String,
        age: u32,
    }

    // A zero-sized type to act as the passkey.
    pub struct Passkey {
        // This field is zero-sized. It is also
        // private, which prevents construction
        // of Passkey outside of the person
        // module.
        _phantom: PhantomData<()>,
    }

    impl Person {
        // Private method that will be exposed
        // with a passkey wrapper.
        fn new(name: String, age: u32) -> Person {
            Person { name, age }
        }

        // Method that uses external helper that
        // requires access to another
        // otherwise-private method.
        fn alloc(
            name: String,
            age: u32,
        ) -> Box<Person> {
            alloc_thing(
                Passkey {
                    _phantom: PhantomData {},
                },
                MakePersonArgs { name, age },
            )
        }
    }

    // Helper structure needed to make the trait
    // providing the interface generic.
    pub struct MakePersonArgs {
        pub name: String,
        pub age: u32,
    }

    // Implementation of the trait that exposes
    // the method requiring a passkey.
    impl Maker<Passkey, MakePersonArgs> for Person {
        fn make(
            _passkey: Passkey,
            args: MakePersonArgs,
        ) -> Person {
            Person::new(args.name, args.age)
        }
    }
}

fn main() {}

However, the passkey idiom is unlikely to be used in Rust because

  • coupled types are usually defined in the same module (or a pub(in path) declaration can be used), making it unnecessary, and
  • it requires cooperation from the interface by which the calling function will use a type.

The second point contrasts with the use above involving std::make_unique, which is able to forward to the underlying constructor without knowing about it at the point of the definition of std::make_unique. While the Rust example above is not useful (because alloc_thing is not a useful helper), it does demonstrate what types would have to be defined in order to achieve the same effect as when using the idiom in C++.

Friends and testing

Another common use of friend declarations is to make the internals of a class available for unit testing. Though this practice is often discouraged in C++, it is sometimes necessary in order to test otherwise private helper inner classes or helper methods.

In Rust, tests are usually defined in the same module as the code being tested. Because the content of modules is visible to submodules, this makes it so that all of the content of the module is available for testing.

// Using Boost.Test
// https://www.boost.org/doc/libs/1_84_0/libs/test/doc/html/index.html
#include <string>

class Person {
public:
  std::string name;

private:
  int age;

  friend class PersonTest;

public:
  Person(std::string name, int age)
      : name(name), age(age) {}

  void have_birthday() {
    this->age = this->age + 1;
  }
};

#define BOOST_TEST_MODULE PersonTestModule
#include <boost/test/included/unit_test.hpp>

class PersonTest {
public:
  static void test_have_birthday() {
    Person alice("Alice", 42);
    BOOST_CHECK_EQUAL(alice.age, 42);

    alice.have_birthday();
    BOOST_CHECK_EQUAL(alice.age, 43);
  }
};

BOOST_AUTO_TEST_CASE(have_birthday_test) {
  PersonTest::test_have_birthday();
}
#![allow(unused)]
fn main() {
pub struct Person {
    pub name: String,
    age: u32,
}

impl Person {
    pub fn new(name: String, age: u32) -> Person {
        Person { name, age }
    }

    pub fn have_birthday(&mut self) {
        self.age = self.age + 1;
    }
}

#[cfg(test)]
mod test {
    use super::Person;

    #[test]
    fn test_have_birthday() {
        let mut alice =
            Person::new("alice".to_string(), 42);

        assert_eq!(alice.age, 42);
        alice.have_birthday();
        assert_eq!(alice.age, 43);
    }
}
}

Visibility of methods on Rust traits

Because traits in Rust are intended for the definition of interfaces, the methods for some type that are declared by a trait are visible whenever both the trait and the type are visible. In other words, it is not possible to have private trait methods.

The default visibility for trait methods differs from Rust structs where the default visibility is private to the defining module.
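
The contrast can be sketched with a hypothetical shapes module. The inherent method side is private by default, while the trait method area is callable wherever both the Area trait and the Square type are visible.

```rust
mod shapes {
    pub struct Square {
        // Private field: only visible inside the shapes module.
        side: f64,
    }

    impl Square {
        pub fn new(side: f64) -> Square {
            Square { side }
        }

        // Inherent methods are private to the module by default.
        fn side(&self) -> f64 {
            self.side
        }
    }

    pub trait Area {
        // No visibility modifier is allowed here. The method is as
        // visible as the trait itself.
        fn area(&self) -> f64;
    }

    impl Area for Square {
        fn area(&self) -> f64 {
            self.side() * self.side()
        }
    }
}

// The trait has to be in scope for its methods to be called.
use shapes::{Area, Square};

fn main() {
    let s = Square::new(2.0);
    assert_eq!(s.area(), 4.0);

    // Does not compile: side is a private inherent method.
    // let _ = s.side();
}
```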

Private constructors and friends

In C++ one can control which classes can derive from a specific class by making all of the constructors private and then declaring classes which may derive from it as friends.

In Rust, one can achieve the similar goal of controlling which types can implement a trait by using the sealed trait pattern.
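
A minimal sketch of the sealed trait pattern follows, using hypothetical names (shape, Sealed, Circle). The public trait requires a supertrait that lives in a private module, so code outside of the shape module cannot name the supertrait and therefore cannot implement the public trait.

```rust
mod shape {
    // The private module hides the Sealed trait from other modules.
    mod private {
        pub trait Sealed {}
    }

    // Implementing Shape requires implementing Sealed, which cannot be
    // named outside of the shape module.
    pub trait Shape: private::Sealed {
        fn area(&self) -> f64;
    }

    pub struct Circle {
        pub radius: f64,
    }

    impl private::Sealed for Circle {}

    impl Shape for Circle {
        fn area(&self) -> f64 {
            std::f64::consts::PI * self.radius * self.radius
        }
    }
}

use shape::{Circle, Shape};

// Does not compile: Square does not implement Sealed, and Sealed cannot
// be implemented here because its module is private.
// struct Square;
// impl Shape for Square {
//     fn area(&self) -> f64 { 1.0 }
// }

fn main() {
    let c = Circle { radius: 1.0 };
    assert!((c.area() - std::f64::consts::PI).abs() < 1e-12);
}
```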

Private constructors

In C++ constructors for classes can be made private by declaring them private, or by defining a class using class and using the default private visibility.

In Rust, constructors (the actual constructors, not "constructor methods") for structs are visible from wherever the type and all fields are visible. To achieve similar visibility restrictions as in the C++ example, an additional private field needs to be added to the struct in Rust. Because Rust supports zero-sized types, the additional field can have no performance cost. The unit type has zero size and can be used for this purpose.

#include <string>

struct Person {
  std::string name;
  int age;

private:
  Person() = default;
};

int main() {
  // fails to compile, Person::Person() private
  // Person nobody;

  // fails to compile since C++20
  // Person alice{"Alice", 42};
  return 0;
}
mod person {
    pub struct Person {
        pub name: String,
        pub age: i32,
        _private: (),
    }

    impl Person {
        pub fn new(
            name: String,
            age: i32,
        ) -> Person {
            Person {
                name,
                age,
                _private: (),
            }
        }
    }
}

use person::*;

fn main() {
    // field `_private` of struct `person::Person`
    // is private
    // let alice = Person {
    //     name: "Alice".to_string(),
    //     age: 42,
    //     _private: (),
    // };

    // cannot construct `person::Person` with
    // struct literal syntax due to private fields
    // let bob = Person {
    //     name: "Bob".to_string(),
    //     age: 55,
    // };

    let carol =
        Person::new("Carol".to_string(), 20);
    // Can match on the public fields, and then
    // use .. to ignore the remaining ones.
    let Person { name, age, .. } = carol;
}

Enums

Unlike C++ unions, but like std::variant, Rust enums do not have direct control over the visibility of their variants or the fields of their variants. In the following example, the circle variant of the Shape union is not public, so it can only be accessed from within the definition of Shape, as it is by the make_circle static method.

#include <iostream>

struct Triangle {
  double base;
  double height;
};

struct Circle {
  double radius;
};

union Shape {
  Triangle triangle;

private:
  Circle circle;

public:
  static Shape make_circle(double radius) {
    Shape s;
    s.circle = Circle{radius};
    return s;
  }
};

int main() {
  Shape triangle;
  triangle.triangle = Triangle{1.0, 2.0};
  Shape circle = Shape::make_circle(1.0);

  // fails to compile
  // circle.circle = Circle{1.0};

  // fails to compile
  // std::cout << circle.circle.radius;
}

In Rust visibility modifiers cannot be applied to individual enum variants or their fields.

mod shape {
    pub enum Shape {
        Triangle { base: f64, height: f64 },
        Circle { radius: f64 },
    }
}

use shape::*;

fn main() {
    // Variant constructor is accessible despite not being marked pub.
    let triangle = Shape::Triangle {
        base: 1.0,
        height: 2.0,
    };

    let circle = Shape::Circle { radius: 1.0 };

    // Fields are accessible despite not being marked pub.
    match circle {
        Shape::Triangle { base, height } => {
            println!("Triangle: {}, {}", base, height);
        }
        Shape::Circle { radius } => {
            println!("Circle {}", radius);
        }
    }
}

Instead, to control construction of and pattern matching on the enum implementation, one of two approaches can be taken. The first controls construction of and access to the fields, but not inspection of which variant is active.

mod shape {
    pub struct Triangle {
        pub base: f64,
        pub height: f64,
        _private: (),
    }
    pub struct Circle {
        pub radius: f64,
        _private: (),
    }

    pub enum Shape {
        Triangle(Triangle),
        Circle(Circle),
    }

    impl Shape {
        pub fn new_triangle(base: f64, height: f64) -> Shape {
            Shape::Triangle(Triangle {
                base,
                height,
                _private: (),
            })
        }

        pub fn new_circle(radius: f64) -> Shape {
            Shape::Circle(Circle {
                radius,
                _private: (),
            })
        }
    }
}

use shape::*;

fn main() {
    let triangle = Shape::new_triangle(1.0, 2.0);
    let circle = Shape::new_circle(1.0);

    match circle {
        Shape::Triangle(Triangle { base, height, .. }) => {
            println!("Triangle: {}, {}", base, height);
        }
        Shape::Circle(Circle { radius, .. }) => {
            println!("Circle: {}", radius);
        }
    }
}

The second places the enum in a struct with a private field, preventing both construction and inspection from outside of the module.

mod shape {
    enum ShapeKind {
        Triangle { base: f64, height: f64 },
        Circle { radius: f64 },
    }

    pub struct Shape(ShapeKind);

    impl Shape {
        pub fn new_circle(radius: f64) -> Shape {
            Shape(ShapeKind::Circle { radius })
        }

        pub fn new_triangle(base: f64, height: f64) -> Shape {
            Shape(ShapeKind::Triangle { base, height })
        }

        pub fn print(&self) {
            match self.0 {
                ShapeKind::Triangle { base, height } => {
                    println!("Triangle: {}, {}", base, height);
                }
                ShapeKind::Circle { radius } => {
                    println!("Circle: {}", radius);
                }
            }
        }
    }
}

use shape::*;

fn main() {
    let triangle = Shape::new_triangle(1.0, 2.0);
    let circle = Shape::new_circle(1.0);

    // Does not compile because Shape has private fields.
    // match circle {
    //   Shape(_) => {}
    // }

    circle.print();
}

If the purpose of making the variants private is to ensure that invariants are met, then it can be useful to expose the implementing enum (ShapeKind) but not the field of the wrapping struct (Shape), with the invariants only being guaranteed when the wrapping struct is used. In this case, it is necessary to make the field private and define a getter function, since otherwise the field would be modifiable, possibly violating the invariant that the wrapping struct represents.

mod shape {
    pub enum ShapeKind {
        Triangle { base: f64, height: f64 },
        Circle { radius: f64 },
    }

    // The field of Shape is private.
    pub struct Shape(ShapeKind);

    impl Shape {
        pub fn new(kind: ShapeKind) -> Option<Shape> {
            // ... check invariants ...
            Some(Shape(kind))
        }

        pub fn get_kind(&self) -> &ShapeKind {
            &self.0
        }
    }
}

use shape::*;

fn main() {
    let triangle = Shape::new(ShapeKind::Triangle {
        base: 1.0,
        height: 2.0,
    });
    let Some(circle) = Shape::new(ShapeKind::Circle { radius: 1.0 }) else {
        return;
    };

    // Does not compile because Shape has private fields.
    // match circle {
    //   Shape(c) => {}
    // };

    match circle.get_kind() {
        ShapeKind::Triangle { base, height } => {
            println!("Triangle: {}, {}", base, height);
        }
        ShapeKind::Circle { radius } => {
            println!("Circle: {}", radius);
        }
    }
}

The situation in Rust resembles the situation in C++ when using std::variant, for which it is not possible to make the variants themselves private. Instead either the constructors for the types that form the variants can be made private or the variant can be wrapped in a class with appropriate visibility controls.

Rust's #[non_exhaustive] annotation

If a struct or enum is intended to be public within a crate, but should not be constructed outside of the crate, then the #[non_exhaustive] attribute can be used to constrain construction. The attribute can be applied to both structs and to individual enum variants with the same effect as adding a private field.

However, the attribute applies the constraint at the level of the crate, not at the level of a module.

#![allow(unused)]
fn main() {
#[non_exhaustive]
pub struct Person {
    pub name: String,
    pub age: i32,
}

pub enum Shape {
    #[non_exhaustive]
    Triangle { base: f64, height: f64 },
    #[non_exhaustive]
    Circle { radius: f64 },
}
}

The attribute is more typically used to force clients of a library to include the wildcard when matching on the struct fields, so that adding fields to a struct is not a breaking change (i.e., it does not require increasing the major version component when using semantic versioning).

Applying the #[non_exhaustive] attribute to the enum itself makes it as if one of the variants were private, requiring a wildcard when matching on the variant itself. This has the same effect in terms of versioning as when used on a struct but is less advantageous. In most cases, code failing to compile when a new enum variant is added is desirable, since that indicates a new case that requires handling logic.

Setter and getter methods

Setters and getters work similarly in C++ and Rust, but are used less frequently in Rust.

It would not be unusual to see the following representation of a two-dimensional vector in C++, which hides its implementation and provides setters and getters to access the fields. This choice would typically be made in case a representation change (such as using polar instead of rectangular coordinates) needed to be made later without breaking clients.

On the other hand, in Rust such a type would almost always be defined with public fields.

class Vec2 {
  double x;
  double y;

public:
  Vec2(double x, double y) : x(x), y(y) {}
  double getX() const { return x; }
  double getY() const { return y; }

  // ... vector operations ...
};
#![allow(unused)]
fn main() {
pub struct Vec2 {
    // public fields instead of getters
    pub x: f64,
    pub y: f64,
}

impl Vec2 {
    // ... vector operations ...
}
}

One major reason for the difference is a limitation of the borrow checker. With a getter function the entire structure is borrowed, preventing mutable use of other fields of the structure.

The following program will not compile because get_name() borrows all of alice.

struct Person {
    name: String,
    age: u32,
}

impl Person {
    fn get_name(&self) -> &String {
        &self.name
    }
}

fn main() {
    let mut alice = Person { name: "Alice".to_string(), age: 42 };
    let name = alice.get_name();

    alice.age = 43;

    println!("{}", name);
}
error[E0506]: cannot assign to `alice.age` because it is borrowed
  --> example.rs:16:5
   |
14 |     let name = alice.get_name();
   |                ----- `alice.age` is borrowed here
15 |
16 |     alice.age = 43;
   |     ^^^^^^^^^^^^^^ `alice.age` is assigned to here but it was already borrowed
17 |
18 |     println!("{}", name);
   |                    ---- borrow later used here

error: aborting due to 1 previous error
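
For contrast, here is a sketch of the same logic using direct field access, wrapped in a hypothetical birthday_report function. Borrowing a single field leaves the other fields available for mutation, so this version compiles.

```rust
struct Person {
    name: String,
    age: u32,
}

// Hypothetical helper: reads the name while updating the age.
fn birthday_report(person: &mut Person) -> String {
    // Borrows only the name field.
    let name = &person.name;

    // Allowed: age is disjoint from the borrowed field.
    person.age += 1;

    format!("{} is now {}", name, person.age)
}

fn main() {
    let mut alice = Person { name: "Alice".to_string(), age: 42 };
    println!("{}", birthday_report(&mut alice));
}
```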

Some additional reasons for the difference in approach are:

  • Ergonomics: Public members make it possible to use pattern matching.
  • Transparency of performance: A change in representation would dramatically change the costs involved with the getters. Exposing the representation makes the cost change visible.
  • Control over mutability: Static lifetime checking of mutable references removes concerns of unintended mutation of the value through Rust's equivalent of observation pointers.
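
The ergonomics point can be sketched with the Vec2 type from above. The dot and describe functions are hypothetical and exist only to show destructuring of public fields in parameter lists and in match arms.

```rust
pub struct Vec2 {
    pub x: f64,
    pub y: f64,
}

// Destructures both arguments directly in the parameter list.
fn dot(Vec2 { x: ax, y: ay }: &Vec2, Vec2 { x: bx, y: by }: &Vec2) -> f64 {
    ax * bx + ay * by
}

// Matches on the fields, ignoring the ones that are not needed with `..`.
fn describe(v: &Vec2) -> &'static str {
    match v {
        Vec2 { x, y } if *x == 0.0 && *y == 0.0 => "origin",
        Vec2 { y, .. } if *y == 0.0 => "on the x axis",
        Vec2 { x, .. } if *x == 0.0 => "on the y axis",
        _ => "elsewhere",
    }
}

fn main() {
    let a = Vec2 { x: 1.0, y: 2.0 };
    let b = Vec2 { x: 3.0, y: 4.0 };
    println!("{}", dot(&a, &b));
    println!("{}", describe(&Vec2 { x: 0.0, y: 5.0 }));
}
```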

Types with invariants and newtypes

When types need to preserve invariants but the benefits of exposing fields are desired, a newtype pattern can be used. A wrapping "newtype" struct that represents the data with an invariant is defined and access to the fields of the underlying struct is provided via a non-mut reference.

#![allow(unused)]
fn main() {
pub struct Vec2 {
    pub x: f64,
    pub y: f64,
}

/// Represents a 2-vector that has magnitude 1.
pub struct Normalized(Vec2); // note the private field

// Checks whether x is within a small tolerance of zero.
fn sqrt_approx_zero(x: f64) -> bool {
    x.abs() < 0.001
}

impl Normalized {
    pub fn from_vec2(v: Vec2) -> Option<Self> {
        if sqrt_approx_zero(v.x * v.x + v.y * v.y - 1.0) {
            Some(Self(v))
        } else {
            None
        }
    }

    // The getter provides a reference to the underlying Vec2 value
    // without permitting mutation.
    pub fn get(&self) -> &Vec2 {
        &self.0
    }
}
}

Borrowing from indexed structures

A significant limitation that arises from the way that getter methods interact with the borrow checker is that it isn't possible to mutably borrow multiple elements from an indexed structure like a vector using methods like Vec::get_mut.

The built-in indexed types have several methods for creating split views onto a structure. These can be used to create helper functions that match the requirements of a specific application.

The Rustonomicon has examples of implementing this pattern, using both safe and unsafe Rust.
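As a minimal sketch of the split-view approach, the following helper uses the standard split_at_mut method to hand out mutable references to two distinct elements of a slice. The name two_mut and the choice to return None for equal or out-of-bounds indices are assumptions made for this illustration.

```rust
/// Hypothetical helper: mutably borrow two distinct elements of a slice.
/// Returns None if the indices are equal or out of bounds.
fn two_mut<T>(xs: &mut [T], i: usize, j: usize) -> Option<(&mut T, &mut T)> {
    if i == j || i >= xs.len() || j >= xs.len() {
        return None;
    }
    if i < j {
        // left covers xs[..j] (which contains i); right starts at j.
        let (left, right) = xs.split_at_mut(j);
        Some((&mut left[i], &mut right[0]))
    } else {
        // left covers xs[..i] (which contains j); right starts at i.
        let (left, right) = xs.split_at_mut(i);
        Some((&mut right[0], &mut left[j]))
    }
}

fn main() {
    let mut v = vec![1, 2, 3];
    if let Some((a, b)) = two_mut(&mut v, 0, 2) {
        std::mem::swap(a, b);
    }
    println!("{:?}", v);
}
```

Because the two halves returned by split_at_mut never overlap, the borrow checker accepts both mutable references at once without any unsafe code.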

Setter methods

Setter methods also borrow the entire value, which causes the same problems as getters that return mutable references. As with getter methods, setter methods are mainly used when needed to preserve invariants.

Exceptions and error handling

In C++ errors that are to be handled by the caller are sometimes indicated by sentinel values (e.g., std::map::find returning the end iterator), sometimes indicated by exceptions (e.g., std::vector::at throwing std::out_of_range), and sometimes indicated by setting an error bit (e.g., std::fstream::fail). Errors that are not intended to be handled by the caller are usually indicated by exceptions (e.g., std::bad_cast). Errors that are due to programming bugs often just result in undefined behavior (e.g., std::vector::operator[] when the index is out-of-bounds).

In contrast, safe Rust has two mechanisms for indicating errors. When the error is expected to be handled by the caller (because it is due to, e.g., user input), the function returns a Result or Option. When the error is due to a programming bug, the function panics. Undefined behavior can only occur if unchecked variants of functions are used with unsafe Rust.

Many libraries in Rust will offer two versions of an API, one that returns a Result or Option type and one that panics, so that the interpretation of the error (expected exceptional case or programmer bug) can be chosen by the caller.
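Slices in the standard library are one instance of this: the get method returns an Option, while indexing with [] panics on an out-of-bounds index. The two wrapper functions below are hypothetical and exist only to put the two behaviors side by side.

```rust
/// Returns None when the index is out of bounds, leaving the
/// decision about what to do to the caller.
fn checked_lookup(v: &[i32], i: usize) -> Option<i32> {
    v.get(i).copied()
}

/// Panics when the index is out of bounds, treating that
/// situation as a programming bug.
fn panicking_lookup(v: &[i32], i: usize) -> i32 {
    v[i]
}

fn main() {
    let v = vec![10, 20, 30];
    println!("{:?}", checked_lookup(&v, 7));
    println!("{}", panicking_lookup(&v, 1));
    // panicking_lookup(&v, 7) would panic at runtime
}
```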

The major differences between using Result or Option and using exceptions are that

  1. Result and Option force explicit handling of the error case in order to access the contained value. This also differs from std::expected in C++23.
  2. When propagating errors with Result, the types of the errors must match. There are libraries for making this easier to handle.
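The second point can be sketched without any third-party library. In the following example, the ConfigError type and the parse_percent function are hypothetical. Implementing From for the error type lets the ? operator convert the standard library's ParseIntError into the function's own error type.

```rust
use std::num::ParseIntError;

/// Hypothetical error type that unifies two failure causes.
#[derive(Debug)]
enum ConfigError {
    Parse(ParseIntError),
    OutOfRange(i64),
}

// This conversion is what allows `?` to be used on a
// Result<_, ParseIntError> inside a function that returns
// Result<_, ConfigError>.
impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::Parse(e)
    }
}

fn parse_percent(s: &str) -> Result<u8, ConfigError> {
    // ? converts the ParseIntError via the From implementation
    let n: i64 = s.trim().parse::<i64>()?;
    if !(0..=100).contains(&n) {
        return Err(ConfigError::OutOfRange(n));
    }
    // the range check above guarantees that the value fits
    Ok(n as u8)
}

fn main() {
    println!("{:?}", parse_percent("42"));
    println!("{:?}", parse_percent("abc"));
    println!("{:?}", parse_percent("250"));
}
```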

Result vs Option

The approaches demonstrated in the Rust examples in this chapter apply to both Result and Option. When the type is Option it indicates that there is no additional information to provide in the error case: Option::None does not contain a value, but Result::Err does. When there is no additional information, it is usually because there is exactly one circumstance which can cause the error case.

It is possible to convert between the two types.

fn main() {
    let r: Result<i32, &'static str> =
        None.ok_or("my error message");
    let r2: Result<i32, &'static str> =
        None.ok_or_else(|| "expensive error message");
    let o: Option<i32> = r.ok();
}

Expected errors

In C++, throw both produces an error (the thrown exception) and initiates non-local control flow (unwinding to the nearest catch block). In Rust, error values (Option::None or Result::Err) are returned as normal values from a function. Rust's return statement can be used to return early from a function.

#include <stdexcept>

double divide(double dividend, double divisor) {
  if (divisor == 0.0) {
    throw std::domain_error("zero divisor");
  }

  return dividend / divisor;
}
#![allow(unused)]
fn main() {
fn divide(
    dividend: f64,
    divisor: f64,
) -> Option<f64> {
    if divisor == 0.0 {
        return None;
    }

    Some(dividend / divisor)
}
}

The requirement to have the return type indicate that an error is possible means that callbacks that are permitted to have errors need to be given an Option or Result return type. Omitting that is like requiring callbacks to be noexcept in C++. Functions that do not need to indicate errors but that will be used as callbacks where errors are permitted will need to wrap their results in Option::Some or Result::Ok.

#include <stdexcept>

int produce_42() {
  return 42;
}

int fail() {
  throw std::runtime_error("oops");
}

int useCallback(int (*func)(void)) {
  return func();
}

int main() {
  try {
    int x = useCallback(produce_42);
    int y = useCallback(fail);

    // use x and y
  } catch (std::runtime_error &e) {
    // handle error
  }
}
fn produce_42() -> i32 {
    42
}

fn fail() -> Option<i32> {
    None
}

fn use_callback(
    f: impl Fn() -> Option<i32>,
) -> Option<i32> {
    f()
}

fn main() {
    // need to wrap produce_42 to match the
    // expected type
    let Some(x) =
        use_callback(|| Some(produce_42()))
    else {
        // handle error
        return;
    };
    let Some(y) = use_callback(fail) else {
        // handle error
        return;
    };
    // use x and y
}

Handling errors

In C++, the only way to handle exceptions is catch. In Rust, all of the features for dealing with tagged unions can be used with Result and Option. The most appropriate approach depends on the intention of the program.

The basic way of handling an error indicated by a Result in Rust is by using match.

Using match is the most general approach, because it enables handling additional cases explicitly and can be used as an expression. match connotes equal importance of all branches.

#include <vector>
#include <stdexcept>

int main() {
    std::vector<int> v;
    // ... populate v ...
    try {
        auto x = v.at(0);
        // use x
    } catch (std::out_of_range &e) {
        // handle error
    }
}
fn main() {
    let mut v = Vec::<i32>::new();
    // ... populate v ...
    match v.get(0) {
        Some(x) => {
            // use x
        }
        None => {
            // handle error
        }
    }
}

Because handling only a single variant of a Rust enum is so common, the if let syntax supports that use case. The syntax both makes it clear that only the one case is important and reduces the levels of indentation.

if let is less general than match. It can also be used as an expression, but can only distinguish one case from the rest. if let connotes that the else case is not the normal case, but that some default handling will occur or some default value will be produced.

Note that with Result, an if let that matches on Ok does not give the else branch access to the error value.

fn main() {
    let mut v = Vec::<i32>::new();
    // ... populate v ...
    if let Some(x) = v.get(0) {
        // use x
    } else {
        // handle error
    }
}
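The following sketch contrasts the two forms when the value is a Result. The describe function is hypothetical. It uses if let where the error is irrelevant and match where the error value is needed.

```rust
fn describe(input: &str) -> String {
    // if let: only the success value is bound, and there is no
    // way to name the error in an else branch
    if let Ok(n) = input.parse::<i32>() {
        return format!("number {}", n);
    }

    // match: both the success value and the error value are bound
    match input.parse::<bool>() {
        Ok(b) => format!("bool {}", b),
        Err(e) => format!("unrecognized: {}", e),
    }
}

fn main() {
    println!("{}", describe("7"));
    println!("{}", describe("true"));
    println!("{}", describe("seven"));
}
```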

When the error handling involves some kind of control flow operation, like break or return, the let else syntax is even more concise.

Much like normal let statements, let else statements can only be used where statements are expected. let else statements also connote that the else case is not the normal case, and that no further (normal) processing will occur.

fn main() {
    let mut v = Vec::<i32>::new();
    // ... populate v ...
    let Some(x) = v.get(0) else {
        // handle error
        return;
    };
    // use x
}

Result and Option also have some helper methods for handling errors. These methods resemble the methods on std::expected in C++.

#include <expected>
#include <string>

int main() {
  std::expected<int, std::string> res(42);
  auto x(res.transform([](int n) { return n * 2; }));
}
fn main() {
    let res: Result<i32, String> = Ok(42);
    let x = res.map(|n| n * 2);
}

These helper methods and others are described in detail in the documentation for Option and Result.
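A few more of these helpers are sketched below. The function names are hypothetical, but unwrap_or and map_err are standard methods on Option and Result.

```rust
/// Supplies a default for the None case.
fn port_or_default(port: Option<u16>) -> u16 {
    port.unwrap_or(8080)
}

/// Supplies a default for the Err case, discarding the error.
fn parse_or_zero(s: &str) -> i32 {
    s.parse::<i32>().unwrap_or(0)
}

/// Transforms only the error, leaving a success value untouched.
fn parse_with_message(s: &str) -> Result<i32, String> {
    s.parse::<i32>()
        .map_err(|e| format!("bad number {:?}: {}", s, e))
}

fn main() {
    println!("{}", port_or_default(None));
    println!("{}", parse_or_zero("12x"));
    println!("{:?}", parse_with_message("12"));
}
```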

Borrowed results

In the above examples, the successful results are borrowed from the vector. It is common to need to clone or copy the result into an owned copy, and to want to do so without having to match on and reconstruct the value. Result and Option have helper methods for these purposes.

fn main() {
    let mut v = Vec::<i32>::new();
    v.push(42);
    let x: Option<&i32> = v.get(0);
    let y: Option<i32> = v.get(0).copied();

    let mut w = Vec::<String>::new();
    w.push("hello".to_string());
    let s: Option<&String> = w.get(0);
    let r: Option<String> = w.get(0).cloned();
}

Propagating errors

In C++, exceptions propagate automatically. In Rust, errors indicated by Result or Option must be explicitly propagated. The ? operator is a convenience for this. There are also several methods for manipulating Result and Option that have a similar effect to propagating the error.

#include <cstddef>
#include <vector>

int accessValue(std::vector<std::size_t> indices,
                 std::vector<int> values,
                 std::size_t i) {
  // vector::at throws
  std::size_t idx(indices.at(i));
  // vector::at throws
  return values.at(idx);
}
#![allow(unused)]
fn main() {
fn access_value(
    indices: Vec<usize>,
    values: Vec<i32>,
    i: usize,
) -> Option<i32> {
    // * dereferences the &i32 to copy it
    // ? propagates the None
    let idx = *indices.get(i)?;
    // returns the Option directly
    values.get(idx).copied()
}
}

The above Rust example is equivalent to the following, which does not use the ? operator. The version using ? is more idiomatic.

#![allow(unused)]
fn main() {
fn access_value(
    indices: Vec<usize>,
    values: Vec<i32>,
    i: usize,
) -> Option<i32> {
    // matching through the & makes a copy of the i32
    let Some(&idx) = indices.get(i) else {
        return None;
    };
    // still returns the Option directly
    values.get(idx).copied()
}
}

The following example is also equivalent. It is not idiomatic (using ? here is more readable), but does demonstrate one of the helper methods. Option::and_then is similar to std::optional::and_then in C++23.

#![allow(unused)]
fn main() {
fn access_value(
    indices: Vec<usize>,
    values: Vec<i32>,
    i: usize,
) -> Option<i32> {
    // and_then runs the second lookup only if the first
    // succeeded; copied turns Option<&i32> into Option<i32>
    indices
        .get(i)
        .and_then(|idx| values.get(*idx))
        .copied()
}
}

These helper methods and others are described in detail in the documentation for Option and Result.

Uncaught exceptions in main

In C++ when an exception is uncaught, it terminates the program with a non-zero exit code and an error message. To achieve a similar result using Result in Rust, main can be given a return type of Result.

#include <stdexcept>

int main() {
  throw std::runtime_error("oops");
}
fn main() -> Result<(), &'static str> {
    Err("oops")
}

The success type is usually unit () (more generally, it can be any type that implements std::process::Termination), and the error type can be any type that implements the Debug trait.

#[derive(Debug)]
struct InterestingError {
    message: &'static str,
    other_interesting_value: i32,
}

fn main() -> Result<(), InterestingError> {
    Err(InterestingError {
        message: "oops",
        other_interesting_value: 9001,
    })
}

Running this program produces the output Error: InterestingError { message: "oops", other_interesting_value: 9001 } with an exit code of 1.

Limitations to forcing error handling with Result

Returning Result or Option does not give the usual benefits when used with APIs that pass pre-allocated buffers by mutable reference. This is because the buffer is accessible outside of the Result or Option, and so the compiler cannot force handling of the error case.

In the following example, the result of read_line can be ignored, resulting in logic errors in the program. However, since the buffer is required to be initialized, it will not result in memory safety violations or undefined behavior.

fn main() {
    let mut buffer = String::with_capacity(1024);
    std::io::stdin().read_line(&mut buffer);
    // use buffer
}

Rust will produce a warning in this case, because of the #[must_use] attribute on Result.

warning: unused `Result` that must be used
 --> example.rs:3:5
  |
3 |     std::io::stdin().read_line(&mut buffer);
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = note: this `Result` may be an `Err` variant, which should be handled
  = note: `#[warn(unused_must_use)]` on by default
help: use `let _ = ...` to ignore the resulting value
  |
3 |     let _ = std::io::stdin().read_line(&mut buffer);
  |     +++++++

Option does not have a #[must_use] attribute, so functions that return an Option that must be handled (due to the None case indicating an error) should be annotated with the #[must_use] attribute. For example, the get method on slices returns Option and is annotated as #[must_use].
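As a sketch of that recommendation, the hypothetical function below returns an Option whose None case indicates an error, and is annotated so that ignoring the result produces a warning. The function name and the message text are illustrative only.

```rust
/// Hypothetical lookup whose None case signals an error.
#[must_use = "a missing user should be handled"]
fn find_user_id(names: &[&str], name: &str) -> Option<usize> {
    names.iter().position(|n| *n == name)
}

fn main() {
    let names = ["alice", "bob"];

    // Writing the call as a bare statement would produce an
    // unused_must_use warning:
    // find_user_id(&names, "carol");

    match find_user_id(&names, "bob") {
        Some(id) => println!("found at index {}", id),
        None => println!("no such user"),
    }
}
```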

Type equivalents

The type equivalents listed in this document are equivalent for the purposes of programming in Rust as one would program in C++. They are not necessarily equivalent in terms of being useful for interacting with C or C++ programs via an FFI. For types that are useful for interoperability with C or C++, see the Rust std::ffi module documentation and the FFI documentation in the Rustonomicon.

Primitive types

Integer types

In C++, many of the integer types (like int and long) have implementation defined widths. In Rust, integer types are always specified with their widths, much like the types in <cstdint> in C++. When it isn't clear what integer type to use, it is common to default to i32, which is the type that Rust defaults to for integer literals.

C++ type    Rust type
uint8_t     u8
uint16_t    u16
uint32_t    u32
uint64_t    u64
int8_t      i8
int16_t     i16
int32_t     i32
int64_t     i64
size_t      usize
(none)      isize

In C++ size_t is conventionally used only for sizes and offsets. The same is true in Rust for usize, which is the pointer-sized integer type. The isize type is the signed equivalent of usize and has no direct equivalent in C++. The isize type is typically only used to represent pointer offsets.

Floating point types

As with integer types in C++, the floating point types float, double, and long double have implementation defined widths. C++23 introduced types guaranteed to be IEEE 754 floats of specific widths. Of those, float32_t and float64_t correspond to what is usually expected from float and double. Rust's floating point types are analogous to these.

C++ type      Rust type
float16_t     (none in stable Rust)
float32_t     f32
float64_t     f64
float128_t    (none in stable Rust)

The Rust types analogous to float16_t and float128_t (f16 and f128) are not yet available in stable Rust.

Raw memory types

In C++ pointers to or arrays of char, unsigned char, or std::byte are used to represent raw memory. In Rust, arrays ([u8; N]), vectors (Vec<u8>), or slices (&[u8]) of u8 are used to accomplish the same goal. However, accessing the underlying memory of another Rust value in that way requires unsafe Rust. There are libraries for creating safe wrappers around that kind of access for purposes such as serialization or interacting with hardware.
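For simple serialization, reinterpreting memory is often unnecessary. The following sketch swaps in a different, safe technique: the standard to_le_bytes and from_le_bytes methods on integer types. The encode and decode functions and the five-byte layout are assumptions made for this illustration.

```rust
use std::convert::TryFrom;

/// Hypothetical wire format: a little-endian u32 followed by one flag byte.
fn encode(id: u32, flag: u8) -> Vec<u8> {
    let mut buf = Vec::with_capacity(5);
    buf.extend_from_slice(&id.to_le_bytes());
    buf.push(flag);
    buf
}

/// Returns None if the input is too short.
fn decode(bytes: &[u8]) -> Option<(u32, u8)> {
    // convert the first four bytes into a fixed-size array
    let id_bytes = <[u8; 4]>::try_from(bytes.get(0..4)?).ok()?;
    let flag = *bytes.get(4)?;
    Some((u32::from_le_bytes(id_bytes), flag))
}

fn main() {
    let bytes = encode(1, 7);
    println!("{:?}", bytes);
    println!("{:?}", decode(&bytes));
}
```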

Character and string types

The C++ char or wchar_t types have implementation defined widths. Rust does not have an equivalent to these types. When working with string encodings in Rust one would use unsigned integer types where one would use the fixed width character types in C++.

C++ type    Rust type
char8_t     u8
char16_t    u16

The Rust char type represents a Unicode scalar value. Thus, a Rust char is the same size as a u32. For working with characters in Rust strings (which are guaranteed to be valid UTF-8), the char type is appropriate. For representing a byte, one should instead use u8.

The Rust standard library includes a type for UTF-8 strings and string slices: String and &str, respectively. Both types guarantee that represented strings are valid UTF-8. The Rust char type is appropriate for representing elements of a String.

Because str (without the reference) is a slice, it is unsized and therefore must be used behind a pointer-like construct, such as a reference or box. For this reason, string slices are often described as &str instead of str in documentation, even though they can also be used as Box<str>, Rc<str>, etc.

Rust also includes types for platform-specific string representations and slices of those strings: std::ffi::OsString and &std::ffi::OsStr. While these strings use the OS-specific representation, to use one with the Rust FFI, it must still be converted to a CString.

Unlike C++, which has std::u16string, Rust has no specific representation for UTF-16 strings. Something like Vec<u16> can be used, but the type will not guarantee that its contents are a valid UTF-16 string. Rust does provide mechanisms for converting String to and from a UTF-16 encoding (String::encode_utf16 and String::from_utf16, among others) as well as similar mechanisms for accessing the underlying UTF-8 encoding (such as String::from_utf8; see https://doc.rust-lang.org/std/string/struct.String.html#method.from_utf8).

Purpose              Rust type
representing text    String and &str
representing bytes   vectors, arrays, or slices of u8
interacting with OS  OsString and &OsStr
representing UTF-8   String
representing UTF-16  use a library
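The conversions mentioned above can be sketched as follows. The helper names to_utf16 and from_utf16_checked are hypothetical, while encode_utf16 and String::from_utf16 are the standard library methods.

```rust
/// Encodes a string slice as UTF-16 code units.
fn to_utf16(s: &str) -> Vec<u16> {
    s.encode_utf16().collect()
}

/// Decodes UTF-16 code units, returning None if they are not valid UTF-16.
fn from_utf16_checked(units: &[u16]) -> Option<String> {
    String::from_utf16(units).ok()
}

fn main() {
    let s = "héllo";

    // len counts UTF-8 bytes, and é takes two of them
    println!("{} bytes, {} chars", s.len(), s.chars().count());

    let units = to_utf16(s);
    println!("{} UTF-16 code units", units.len());
    println!("{:?}", from_utf16_checked(&units));

    // a lone surrogate is not valid UTF-16
    println!("{:?}", from_utf16_checked(&[0xD800]));
}
```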

Boolean types

The bool type in Rust is analogous to the bool type in C++. Unlike C++, Rust makes guarantees about the size, alignment, and bit pattern used to represent values of the bool type.
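Those guarantees can be checked directly. The short sketch below relies only on the standard size_of and align_of functions and on the as operator. The bool_to_bit helper is a hypothetical name.

```rust
use std::mem::{align_of, size_of};

/// Converts a bool to its guaranteed numeric representation.
fn bool_to_bit(b: bool) -> u8 {
    b as u8
}

fn main() {
    // bool occupies one byte with one-byte alignment
    assert_eq!(size_of::<bool>(), 1);
    assert_eq!(align_of::<bool>(), 1);

    // false is represented as 0 and true as 1
    assert_eq!(bool_to_bit(false), 0);
    assert_eq!(bool_to_bit(true), 1);
}
```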

void

In C++ void indicates that a function does not return a value. Because Rust is expression-oriented, all functions return values. In the place of void, Rust uses the unit type (). When a function does not have a return type declared, () is the return type.

#include <iostream>

void process() {
    std::cout
        << "Does something, but returns nothing."
        << std::endl;
}
#![allow(unused)]
fn main() {
fn process() {
    println!("Does something but returns nothing.");
}
}

Since the unit type has only one value (also written ()), values of the type provide no information. This also means that the return value can be left implicit, as in the above example. The following example makes the unit type usage explicit.

#![allow(unused)]
fn main() {
fn process() -> () {
    let () = println!("Does something but returns nothing.");
    ()
}
}

The syntax of the unit type and syntax of the unit value resemble that of an empty tuple. Essentially, that is what the type is. The following example shows some equivalent types, though without the special syntax or language integration.

struct Pair<T1, T2>(T1, T2); // the same as (T1, T2)
struct Single<T>(T); // a tuple with just one value, like (T,)
struct Unit; // the same as ()
// can also be written as
// struct Unit();

fn main() {
    let pair = Pair(1,2.0);
    let single = Single(1);
    let unit = Unit;
    // can also be written as
    // let unit = Unit();
}

Using a unit type instead of void enables expressions with unit type (such as function calls that would return void in C++) to be used in contexts that expect a value. This is especially helpful with defining and using generic functions, instead of needing something like std::is_void to special-case the handling when a type is void.
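As a small illustration of that point, the hypothetical generic helper below works unchanged whether the closure returns a value or returns unit. No special case for the void-like situation is required.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper: runs f and returns its result together
/// with the elapsed time.
fn timed<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let result = f();
    (result, start.elapsed())
}

fn main() {
    // T is i32
    let (sum, _elapsed) = timed(|| 1 + 2);
    println!("{}", sum);

    // T is (), and the same definition of timed applies
    let ((), _elapsed) = timed(|| println!("side effect only"));
}
```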

Pointers

The following table maps the ownership-managing classes from C++ to equivalent types in Rust.

Use                                                        C++ type                              Rust type
Owned                                                      T                                     T
Single owner, dynamic storage                              std::unique_ptr<T>                    Box<T>
Shared owner, dynamic storage, immutable, not thread-safe  std::shared_ptr<T>                    std::rc::Rc<T>
Shared owner, dynamic storage, immutable, thread-safe      std::shared_ptr<T>                    std::sync::Arc<T>
Shared owner, dynamic storage, mutable, not thread-safe    std::shared_ptr<T>                    std::rc::Rc<std::cell::RefCell<T>>
Shared owner, dynamic storage, mutable, thread-safe        std::shared_ptr<T> plus a std::mutex  std::sync::Arc<std::sync::Mutex<T>>
Const reference                                            const T&                              &T
Mutable reference                                          T&                                    &mut T
Const observer pointer                                     const T*                              &T
Mutable observer pointer                                   T*                                    &mut T

In C++, the thread safety of std::shared_ptr is more nuanced than it appears in this table (e.g., some uses may require std::atomic). However, in safe Rust the compiler will prevent the incorrect use of the shared owner types.
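The two shared, mutable rows of the table can be sketched as follows. The function names are hypothetical. The first uses Rc with RefCell within one thread, and the second uses Arc with Mutex across several threads.

```rust
use std::cell::RefCell;
use std::rc::Rc;
use std::sync::{Arc, Mutex};
use std::thread;

/// Shared, mutable, not thread-safe: Rc<RefCell<T>>.
fn shared_single_threaded() -> i32 {
    let counter = Rc::new(RefCell::new(0));
    let alias = Rc::clone(&counter);
    *alias.borrow_mut() += 1;
    *counter.borrow_mut() += 1;
    let n = *counter.borrow();
    n
}

/// Shared, mutable, thread-safe: Arc<Mutex<T>>.
fn shared_across_threads() -> i32 {
    let counter = Arc::new(Mutex::new(0));
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let c = Arc::clone(&counter);
            thread::spawn(move || {
                *c.lock().unwrap() += 1;
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let n = *counter.lock().unwrap();
    n
}

fn main() {
    println!("{}", shared_single_threaded());
    println!("{}", shared_across_threads());
}
```

Attempting to move the Rc<RefCell<i32>> into thread::spawn would be rejected at compile time, which is the protection the paragraph above describes.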

Unlike with C++ references, Rust can have references-to-references. Rust references are more like observer pointers than they are like C++ references.

void*

Rust does not have anything directly analogous to void* in C++. The upcoming chapter on RTTI will cover some use cases where the goal is dynamic typing. The FFI chapter of the Rustonomicon covers some use cases where the goal is interoperability with C programs that use void*.

Containers

Both C++ and Rust containers own their elements. However, in both the element type may be a non-owning type, such as a pointer in C++ or a reference in Rust.

C++ type                 Rust type
std::vector<T>           Vec<T>
std::array<T, N>         [T; N]
std::list<T>             std::collections::LinkedList<T>
std::queue<T>            std::collections::VecDeque<T>
std::deque<T>            std::collections::VecDeque<T>
std::stack<T>            Vec<T>
std::map<K,V>            std::collections::BTreeMap<K,V>
std::unordered_map<K,V>  std::collections::HashMap<K,V>
std::set<K>              std::collections::BTreeSet<K>
std::unordered_set<K>    std::collections::HashSet<K>
std::priority_queue<T>   std::collections::BinaryHeap<T>
std::span<T>             &[T]

For maps and sets, instead of the container being parameterized over the hash or comparison function used, the types require that the key types implement the std::hash::Hash and Eq (unordered) or std::cmp::Ord (ordered) traits. To use the containers with different hash or comparison functions, one must use a wrapper type with a different implementation of the required trait.
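A minimal sketch of such a wrapper type follows. The ByLength type and the sorted_by_length function are hypothetical. The wrapper orders strings by length first and alphabetically second, where std::set in C++ would take a custom comparator.

```rust
use std::cmp::Ordering;
use std::collections::BTreeSet;

/// Hypothetical wrapper: orders strings by length, then alphabetically.
#[derive(Debug, PartialEq, Eq)]
struct ByLength(String);

impl Ord for ByLength {
    fn cmp(&self, other: &Self) -> Ordering {
        self.0
            .len()
            .cmp(&other.0.len())
            .then_with(|| self.0.cmp(&other.0))
    }
}

impl PartialOrd for ByLength {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

fn sorted_by_length(words: &[&str]) -> Vec<String> {
    // the set keeps its elements in the order defined by Ord
    let set: BTreeSet<ByLength> =
        words.iter().map(|w| ByLength(w.to_string())).collect();
    set.into_iter().map(|w| w.0).collect()
}

fn main() {
    let words = ["pear", "fig", "banana", "kiwi"];
    println!("{:?}", sorted_by_length(&words));
}
```

For the common case of simply reversing an existing ordering, the standard library already provides such a wrapper in std::cmp::Reverse.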

Some C++ container types provided by the STL have no equivalent in Rust. Many of those have equivalents available in third-party libraries.

One significant difference in the use of these types between C++ and Rust is with the Vec<T> and array [T; N] types, from which slice references &[T] or &mut [T] to part or all of the data can be cheaply created. For this reason, when defining a function that does not modify the length of a vector and does not need to statically know the number of elements in an array, it is more idiomatic to take a parameter as &[T] or &mut [T] than as a reference to the owned type.

In C++ it is better to take begin and end iterators than a span when possible, since iterators are more general. The same is true in Rust of taking a generic type that implements IntoIterator<Item = &T> or IntoIterator<Item = &mut T> instead of taking &[T].

#include <iterator>
#include <vector>

template <typename InputIter>
void go(InputIter first, InputIter last) {
  for (auto it = first; it != last; ++it) {
    // ...
  }
}

int main() {
  std::vector<int> v = {1, 2, 3};
  go(v.begin(), v.end());
}
use std::iter::IntoIterator;

fn go<'a>(iter: impl IntoIterator<Item = &'a mut i32>) {
    for x in iter {
        // ...
    }
}

fn main() {
    let mut v = vec![1, 2, 3];
    go(&mut v);
}

Type promotions and conversions

lvalue to rvalue

In C++ lvalues are automatically converted to rvalues when needed.

In Rust the equivalent of lvalues are "place expressions" (expressions that represent memory locations) and the equivalent of rvalues are "value expressions". Place expressions are automatically converted to value expressions when needed.

int main() {
  // Local variables are lvalues,
  int x(0);
  // and therefore may be assigned to.
  x = 42;

  // x is converted to an rvalue when needed.
  int y = x + 1;
}
fn main() {
    // Local variables are place expressions,
    let mut x = 0;
    // and therefore may be assigned to.
    x = 42;

    // x is converted to a value expression when
    // needed.
    let y = x + 1;
}

Array to pointer

In C++, arrays are automatically converted to pointers as required.

The equivalent to this in Rust is the automatic conversion of vector and array references to slice references.

#include <cstring>

int main() {
  char example[6] = "hello";
  char other[6];

  // strncpy takes arguments of type char*
  strncpy(other, example, 6);
}
fn third(ts: &[char]) -> Option<&char> {
    ts.get(2)
}

fn main() {
    let vec: Vec<char> = vec!['a', 'b', 'c'];
    let arr: [char; 3] = ['a', 'b', 'c'];

    third(&vec);
    third(&arr);
}

Because slice references can be easily used in a memory-safe way, it is generally recommended in Rust to define functions in terms of slice references instead of in terms of references to vectors or arrays, unless vector-specific or array-specific functionality is needed.

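Unlike in C++, where the conversion from arrays to pointers is built into the language, the conversion from a vector reference to a slice reference is an instance of a general mechanism provided by the Deref trait, which provides one kind of user-defined conversion. (The conversion from an array reference to a slice reference is a built-in unsized coercion.)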
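To illustrate the Deref mechanism with a user-defined type, the sketch below defines a hypothetical NonEmpty wrapper around a vector. Implementing Deref with a slice target lets a reference to the wrapper be passed wherever a slice reference is expected.

```rust
use std::ops::Deref;

/// Hypothetical wrapper around a Vec that always has at least one element.
struct NonEmpty(Vec<i32>);

impl NonEmpty {
    fn new(first: i32) -> Self {
        NonEmpty(vec![first])
    }

    fn push(&mut self, x: i32) {
        self.0.push(x);
    }
}

impl Deref for NonEmpty {
    type Target = [i32];

    fn deref(&self) -> &[i32] {
        &self.0
    }
}

fn total(xs: &[i32]) -> i32 {
    xs.iter().sum()
}

fn main() {
    let mut ne = NonEmpty::new(1);
    ne.push(2);

    // &NonEmpty is converted to &[i32] through Deref
    println!("{}", total(&ne));
    // slice methods are also available on the wrapper
    println!("{}", ne.len());
}
```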

Function to pointer

In C++ functions and static member functions are automatically converted to function pointers.

Rust performs the same conversion. In addition to functions and members that do not take self as an argument, constructors (proper constructors) also have function type and can be converted to function pointers. Non-capturing closures do not have function type, but can also be converted to function pointers.

int twice(int n) {
  return n * 2;
}

struct MyPair {
  int x;
  int y;

  MyPair(int x, int y) : x(x), y(y) {}

  static MyPair make() {
    return MyPair{0, 0};
  }
};

int main() {
  // convert a function to a function pointer
  int (*twicePtr)(int) = twice;
  int result = twicePtr(5);

  // Per C++23 11.4.5.1.6, can't take the address
  // of a constructor.
  // MyPair (*ctor)(int, int) = MyPair::MyPair;
  // MyPair pair = ctor(10, 20);

  // convert a static method to a function
  // pointer
  MyPair (*methodPtr)() = MyPair::make;
  MyPair pair2 = methodPtr();

  // convert a non-capturing closure to a
  // function pointer
  int (*closure)(int) = [](int x) -> int {
    return x * 5;
  };
  int closureRes = closure(2);
}
fn twice(x: i32) -> i32 {
    x * 2
}

struct MyPair(i32, i32);

impl MyPair {
    fn new() -> MyPair {
        MyPair(0, 0)
    }
}

fn main() {
    // convert a function to a function pointer
    let twice_ptr: fn(i32) -> i32 = twice;
    let res = twice_ptr(5);

    // convert a constructor to a function pointer
    let ctor_ptr: fn(i32, i32) -> MyPair = MyPair;
    let pair = ctor_ptr(10, 20);

    // convert a static method to a function
    // pointer
    let method_ptr: fn() -> MyPair = MyPair::new;
    let pair2 = method_ptr();

    // convert a non-capturing closure to a
    // function pointer
    let closure: fn(i32) -> i32 = |x: i32| x * 5;
    let closure_res = closure(2);
}

Numeric promotion and numeric conversion

In C++ there are several kinds of implicit conversions that occur between numeric types. The most commonly encountered are numeric promotions, which convert numeric types to larger types.

These lossless conversions are not implicit in Rust. Instead, they must be performed explicitly using the Into::into() method. These conversions are provided by implementations of the From and Into traits. The conversions provided by the Rust standard library are listed on the documentation page for the From trait.

int main() {
  int x(42);
  long y = x;

  float a(1.0);
  double b = a;
}
fn main() {
    let x: i32 = 42;
    let y: i64 = x.into();

    let a: f32 = 1.0;
    let b: f64 = a.into();
}

There are several implicit conversions that occur in C++ that are not lossless. For example, integers can be implicitly converted to unsigned integers in C++.

In Rust, these conversions are also required to be explicit and are provided by the TryFrom and TryInto traits which require handling the cases where the value does not map to the other type.

int main() {
  int x(42);
  unsigned int y(x);

  float a(1.0);
  double b(a);
}
use std::convert::TryInto;

fn main() {
    let x: i32 = 42;
    let y: u32 = match x.try_into() {
        Ok(x) => x,
        Err(err) => {
            panic!("Can't convert! {:?}", err);
        }
    };
}

Some conversions that occur in C++ are supported by neither From nor TryFrom because there is not a clear choice of conversion or because they are not value-preserving. For example, in C++ int32_t can implicitly be converted to float despite float not being able to represent all 32 bit integers precisely, but in Rust there is no TryFrom<i32> implementation for f32.

In Rust the only way to convert from an i32 to an f32 is with the as operator. The operator can actually be used to convert between other primitive types as well and does not panic or produce undefined behavior, but may not convert in the desired way (e.g., it may use a different rounding mode than desired or it may truncate rather than saturate as desired).

#include <cstdint>

int main() {
  int32_t x(42);
  float a = x;
}
fn main() {
    let x: i32 = 42;
    let a: f32 = x as f32;
}
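The behaviors of the as operator mentioned above can be made concrete. The following sketch shows a few representative casts between primitive types. Each assertion reflects how as is defined for that pair of types.

```rust
fn main() {
    // integer to narrower integer: the high bits are truncated
    assert_eq!(300i32 as u8, 44);

    // signed to unsigned of the same width: the bits are reinterpreted
    assert_eq!(-1i32 as u32, u32::MAX);

    // float to integer: rounds toward zero and saturates at the bounds
    assert_eq!(-1.9f32 as i32, -1);
    assert_eq!(1e10f32 as i32, i32::MAX);
    assert_eq!(f32::NAN as i32, 0);

    // integer to float: rounds to a representable value, so
    // 2^24 + 1 becomes 2^24
    assert_eq!(16_777_217i32 as f32, 16_777_216.0);
}
```

None of these casts panic, which is why the result has to be checked against the intended semantics by the programmer.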

isize and usize

In the Rust standard library the isize and usize types are used for values intended to be used as indices (much like size_t in C++). However, their use for other purposes is usually discouraged in favor of using explicitly sized types such as u32. This results in a situation where values of type u32 have to be converted to usize for use in indexing, but Into<usize> is not implemented for u32.

In these cases, best practice is to use TryInto, and if further error handling of the failure case is not desired, to call unwrap, creating a panic at the point of conversion.

This is preferred because it prevents the possibility of moving forward with an incorrect value. E.g., consider converting a u64 to a usize that has a 32-bit representation with as, which truncates the result. A value that is one greater than u32::MAX will truncate to 0, which would probably result in successfully retrieving the wrong value from a data structure, thus masking a bug and producing unexpected behavior.
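A minimal sketch of that practice follows. The element_at and to_index functions are hypothetical. The first panics at the point of conversion, and the second leaves the failure case to the caller.

```rust
use std::convert::TryFrom;

/// Indexes with a u32, panicking at the conversion if the index
/// does not fit in usize.
fn element_at(items: &[&'static str], index: u32) -> &'static str {
    let i = usize::try_from(index).unwrap();
    items[i]
}

/// Converts a u64 to an index, returning None if it does not fit.
fn to_index(n: u64) -> Option<usize> {
    usize::try_from(n).ok()
}

fn main() {
    let items = ["a", "b", "c"];
    println!("{}", element_at(&items, 2));
    println!("{:?}", to_index(5));
}
```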

Enums

In C++ enums can be implicitly converted to integer types.

In Rust the conversion requires the use of the as operator, and providing From and TryFrom implementations to move back and forth between the enum and its representation type is recommended. Examples and additional details are given in the chapter on enums.
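As a brief sketch of that recommendation, the hypothetical Color enum below uses as in one direction and a TryFrom implementation in the other. Using the rejected u8 value as the error type is a simplification made for this example.

```rust
use std::convert::TryFrom;

/// Hypothetical enum with an explicit u8 representation.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
enum Color {
    Red = 0,
    Green = 1,
    Blue = 2,
}

// Enum to integer always succeeds, so From is appropriate.
impl From<Color> for u8 {
    fn from(c: Color) -> u8 {
        c as u8
    }
}

// Integer to enum can fail, so TryFrom is appropriate.
impl TryFrom<u8> for Color {
    // For simplicity, the error is the value that was rejected.
    type Error = u8;

    fn try_from(n: u8) -> Result<Self, Self::Error> {
        match n {
            0 => Ok(Color::Red),
            1 => Ok(Color::Green),
            2 => Ok(Color::Blue),
            other => Err(other),
        }
    }
}

fn main() {
    println!("{}", u8::from(Color::Blue));
    println!("{:?}", Color::try_from(1u8));
    println!("{:?}", Color::try_from(9u8));
}
```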

Qualification conversion

In C++ qualification conversions enable the use of values without a const (or volatile) qualifier where a const (or volatile) qualified value is expected.

In Rust the equivalent conversion enables mut variables and mut references to be used where non-mut variables or references are expected.

#include <iostream>
#include <string>

void display(const std::string &msg) {
  std::cout << "Displaying: " << msg << std::endl;
}

int main() {
  // no const qualifier
  std::string message("hello world");

  // used where const expected
  display(message);
}
fn display(msg: &str) {
    println!("{}", msg);
}

fn main() {
    let mut s: String = "hello world".to_string();
    let message: &mut str = s.as_mut();
    display(message);
}

Integer literals

In C++ integer literals with no suffix indicating type have the smallest type in which they can fit from int, long int, or long long int. When the literal is then assigned to a variable of a different type, an implicit conversion is performed.

In Rust, integer literals have their type inferred from context. When there is insufficient information to infer a type, either i32 is assumed or a type annotation is required.

#include <cstdint>
#include <iostream>

int main() {
  // Compiles without error (but with a warning).
  uint32_t x = 4294967296;

  // assumes int
  auto y = 1;

  // literal is given a larger type, so it prints
  // correctly
  std::cout << 4294967296 << std::endl;

  // these work as expected
  std::cout << INT64_C(4294967296) << std::endl;

  uint64_t z = INT64_C(4294967296);
  std::cout << z << std::endl;
}
fn main() {
    // error: literal out of range for `u32`
    // let x: u32 = 4294967296;

    // assumes i32
    let y = 1;

    // fails to compile because it is inferred as i32
    // print!("{}", 4294967296);

    // These work, though.
    println!("{}", 4294967296u64);

    let z: u64 = 4294967296;
    println!("{}", z);
}

Safe bools

The safe bool idiom exists to make it possible to use types as conditions. Since C++11 this idiom is straightforward to implement.

In Rust instead of converting the value to a boolean, the normal idiom matches on the value instead. Depending on the situation, the mechanism used for matching might be match, if let, or let else.

struct Wire {
  bool ready;
  unsigned int value;

  explicit operator bool() const { return ready; }
};

int main() {
  Wire w{false, 0};
  // ...

  if (w) {
    // use w.value
  } else {
    // do something else
  }
}
enum Wire {
    Ready(u32),
    NotReady,
}

fn main() {
    let wire = Wire::NotReady;
    // ...

    // match
    match wire {
        Wire::Ready(v) => {
            // use value v
        }
        Wire::NotReady => {
            // do something else
        }
    }

    // if let
    if let Wire::Ready(v) = wire {
        // use value v
    }

    // let else
    let Wire::Ready(v) = wire else {
        // do something that doesn't continue,
        // like early return
        return;
    };
}
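
When only a yes-or-no answer is needed and the payload is not, a boolean query can still be written on top of the enum. The following is a minimal sketch; the method name is_ready is an assumption for illustration, and the standard matches! macro does the pattern test.

```rust
enum Wire {
    Ready(u32),
    NotReady,
}

impl Wire {
    // Hypothetical helper, closest in spirit to `explicit operator bool`.
    fn is_ready(&self) -> bool {
        matches!(self, Wire::Ready(_))
    }
}

fn main() {
    let wire = Wire::Ready(3);
    if wire.is_ready() {
        println!("ready");
    }
    assert!(!Wire::NotReady.is_ready());
}
```

Unlike the matching forms above, this does not give access to the value, so it is probably best reserved for cases where the value is genuinely not needed.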

User-defined conversions

User-defined conversions are covered in a separate chapter.

User-defined conversions

In C++ user-defined conversions are created using converting constructors or conversion functions. Because converting constructors are opt-out (via the explicit specifier), implicit conversions occur with regularity in C++ code. In the following example both the assignments and the function calls make use of implicit conversions as provided by a converting constructor.

Rust makes significantly less use of implicit conversions. Instead most conversions are explicit. The std::convert module provides several traits for working with user-defined conversions. In Rust, the below example makes use of explicit conversions by implementing the From trait.

struct Widget {
  Widget(int) {}
  Widget(int, int) {}
};

void process(Widget w) {}

int main() {
  Widget w1 = 1;
  Widget w2 = {4, 5};
  process(1);
  process({4, 5});

  return 0;
}
struct Widget;

impl From<i32> for Widget {
    fn from(_x: i32) -> Widget {
        Widget
    }
}

impl From<(i32, i32)> for Widget {
    fn from(_x: (i32, i32)) -> Widget {
        Widget
    }
}

fn process(w: Widget) {}

fn main() {
    let w1: Widget = 1.into();
    // For construction this is more idiomatic:
    let w1b = Widget::from(1);

    let w2: Widget = (4, 5).into();
    // For construction this is more idiomatic:
    let w2b = Widget::from((4, 5));

    process(1.into());
    process((4, 5).into());
}

The into method used above is provided via a blanket implementation of the Into trait for types that implement the From trait. Because of this blanket implementation, it is generally preferred to implement the From trait instead of the Into trait, and let the Into trait be provided by that blanket implementation.
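
One place where the blanket implementation shows up is in trait bounds. The following sketch assumes a Widget with a hypothetical size field (not part of the example above) so that the conversion has an observable result; only From is implemented, yet the function can be bounded on Into.

```rust
struct Widget {
    size: i32,
}

impl From<i32> for Widget {
    fn from(size: i32) -> Widget {
        Widget { size }
    }
}

// Accepts anything convertible into a Widget. Only From was implemented
// above; the blanket implementation supplies Into.
fn process(w: impl Into<Widget>) -> i32 {
    let widget: Widget = w.into();
    widget.size
}

fn main() {
    assert_eq!(process(7_i32), 7);
    // A Widget is trivially convertible into itself.
    assert_eq!(process(Widget { size: 2 }), 2);
}
```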

Conversion functions

C++ conversion functions enable conversions in the other direction, from the defined class to another type.

To achieve the same in Rust, the From trait can be implemented in the other direction. At least one of the source type or the target type must be defined in the same crate as the trait implementation.

#include <utility>

struct Point {
  int x;
  int y;

  operator std::pair<int, int>() const {
    return std::pair(x, y);
  }
};

void process(std::pair<int, int>) {}

int main() {
  Point p1{1, 2};
  Point p2{3, 4};

  std::pair<int, int> xy = p1;
  process(p2);

  return 0;
}
struct Point {
    x: i32,
    y: i32,
}

impl From<Point> for (i32, i32) {
    fn from(p: Point) -> (i32, i32) {
        (p.x, p.y)
    }
}

fn process(x: (i32, i32)) {}

fn main() {
    let p1 = Point { x: 1, y: 2 };
    let p2 = Point { x: 3, y: 4 };

    let xy: (i32, i32) = p1.into();
    process(p2.into());
}

Conversion functions are often used to implement the safe bool pattern in C++, which is addressed in a different way in Rust.

Borrowing conversions

The methods in the From and Into traits take ownership of the values to be converted. When this is not desired in C++, the conversion function can just take and return references.

To achieve the same in Rust, the AsRef trait or AsMut trait is used.

#include <iostream>
#include <string>

struct Person {
  std::string name;

  operator std::string &() {
    return this->name;
  }
};

void process(const std::string &name) {
  std::cout << name << std::endl;
}

int main() {
  Person alice{"Alice"};

  process(alice);

  return 0;
}
struct Person {
    name: String,
}

impl AsRef<str> for Person {
    fn as_ref(&self) -> &str {
        &self.name
    }
}

fn process(name: &str) {
    println!("{}", name);
}

fn main() {
    let alice = Person {
        name: "Alice".to_string(),
    };

    process(alice.as_ref());
}

It is common to use AsRef or AsMut as a trait bound in function definitions. Using generics with an AsRef or AsMut bound allows clients to call the functions with anything that can be cheaply viewed as the type that the function wants to work with. Using this technique, the process function above would be defined as in the following example.

struct Person {
    name: String,
}

impl AsRef<str> for Person {
    fn as_ref(&self) -> &str {
        &self.name
    }
}

fn process<T: AsRef<str>>(name: T) {
    println!("{}", name.as_ref());
}

fn main() {
    let alice = Person {
        name: "Alice".to_string(),
    };

    process(alice);
}

This technique is often used with functions that take file system paths, so that literal strings can more easily be used as paths.
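
A minimal sketch of the path case follows. The function extension_of is hypothetical; the AsRef<Path> implementations for str, String, and PathBuf come from the standard library.

```rust
use std::path::{Path, PathBuf};

// Hypothetical helper: returns the extension of any path-like argument.
fn extension_of<P: AsRef<Path>>(path: P) -> Option<String> {
    path.as_ref()
        .extension()
        .map(|ext| ext.to_string_lossy().into_owned())
}

fn main() {
    // A string literal, a String, and a PathBuf are all accepted.
    assert_eq!(extension_of("notes.txt"), Some("txt".to_string()));
    assert_eq!(extension_of(String::from("a/b.rs")), Some("rs".to_string()));
    assert_eq!(extension_of(PathBuf::from("no_extension")), None);
}
```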

Fallible conversions

In C++, when conversions might fail, it is possible (though usually discouraged) to throw an exception from the converting constructor or conversion function.

Error handling in Rust does not use exceptions. Instead the TryFrom trait and TryInto trait are used for fallible conversions. These traits differ from From and Into in that they return a Result, which may indicate a failing case. When a conversion may fail one should implement TryFrom and rely on the client to call unwrap on the result, rather than panic in a From implementation.

#include <stdexcept>
#include <string>

class NonEmpty {
  std::string s;

public:
  NonEmpty(std::string s) : s(s) {
    if (this->s.empty()) {
      throw std::domain_error("empty string");
    }
  }
};

int main() {
  std::string s("");
  NonEmpty x = s; // throws

  return 0;
}
use std::convert::TryFrom;
use std::convert::TryInto;

struct NonEmpty {
    s: String,
}

#[derive(Clone, Copy, Debug)]
struct NonEmptyStringError;

impl TryFrom<String> for NonEmpty {
    type Error = NonEmptyStringError;

    fn try_from(
        s: String,
    ) -> Result<NonEmpty, NonEmptyStringError>
    {
        if s.is_empty() {
            Err(NonEmptyStringError)
        } else {
            Ok(NonEmpty { s })
        }
    }
}

fn main() {
    let res: Result<
        NonEmpty,
        NonEmptyStringError,
    > = "".to_string().try_into();
    match res {
        Ok(ne) => {
            println!("Converted!");
        }
        Err(err) => {
            println!("Couldn't convert");
        }
    }
}

Just like with From and Into, there is a blanket implementation for TryInto for everything that implements TryFrom.
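
The standard library uses the same pair of traits for narrowing integer conversions, which gives a compact way to see both spellings side by side. In this sketch the function to_byte is hypothetical; TryFromIntError is the error type the standard library uses for these conversions.

```rust
use std::convert::{TryFrom, TryInto};
use std::num::TryFromIntError;

// Hypothetical helper: narrows an i32 to a u8, reporting failure to the caller.
fn to_byte(x: i32) -> Result<u8, TryFromIntError> {
    // try_into is available because u8 implements TryFrom<i32>.
    let b: u8 = x.try_into()?;
    Ok(b)
}

fn main() {
    assert_eq!(to_byte(200), Ok(200u8));
    assert!(to_byte(300).is_err());

    // The same conversion, spelled with TryFrom directly.
    assert!(u8::try_from(-1i32).is_err());
}
```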

Implicit conversions

Rust does have one kind of user-defined implicit conversion, called deref coercion, provided by the Deref trait and DerefMut trait. These coercions exist to make pointer-like types more ergonomic to use.

An example of implementing the traits for a custom pointer-like type is given in the Rust book.
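
For orientation, a minimal sketch of the idea is shown here. The Handle type and the shout function are hypothetical and are not taken from the Rust book; the sketch only illustrates that a reference to the wrapper is accepted where a reference to the target is expected.

```rust
use std::ops::Deref;

// Hypothetical pointer-like wrapper around a heap allocation.
struct Handle<T> {
    inner: Box<T>,
}

impl<T> Deref for Handle<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.inner
    }
}

fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn main() {
    let h = Handle {
        inner: Box::new("alice".to_string()),
    };

    // &Handle<String> coerces to &String and then to &str,
    // with no explicit conversion call at the call site.
    assert_eq!(shout(&h), "ALICE");

    // Method calls also look through Deref.
    assert_eq!(h.len(), 5);
}
```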

Summary

A summary of when to use which kind of conversion interface is given in the documentation for the std::convert module.

Overloading

C++ supports overloading of functions, so long as the invocations of the functions can be distinguished by the number or types of their arguments.

Rust does not support this kind of function overloading. Instead, Rust has a few different mechanisms (some of which C++ also has) for achieving the effects of overloading in a way that interacts better with type inference. The mechanisms usually involve making the commonalities between the overloaded functions apparent in the code.

#include <string>

double twice(double x) {
  return x + x;
}

int twice(int x) {
  return x + x;
}
#![allow(unused)]
fn main() {
fn twice(x: f64) -> f64 {
    x + x
}

// error[E0428]: the name `twice` is defined multiple times
// fn twice(x: i32) -> i32 {
//     x + x
// }
}

In practice, an example like the above would also likely be implemented in a more structured way even in C++, using templates.

When phrased this way, the example can be translated to Rust, with the notable addition of requiring a trait bound on the type.

template <typename T>
T twice(T x) {
  return x + x;
}
#![allow(unused)]
fn main() {
fn twice<T>(x: T) -> T::Output
where
    T: std::ops::Add<T>,
    T: Copy,
{
    x + x
}
}

Overloaded methods

In C++ it is possible to have methods with the same name but different signatures on the same type. In Rust there can be at most one method with the same name for each trait implementation and at most one inherent method with the same name for a type.

In cases where there are multiple methods with the same names because the method is defined for multiple traits, the desired method must be distinguished at the call site by specifying the trait.

trait TraitA {
    fn go(&self) -> String;
}

trait TraitB {
    fn go(&self) -> String;
}

struct MyStruct;

impl MyStruct {
    fn go(&self) -> String {
        "Called inherent method".to_string()
    }
}

impl TraitA for MyStruct {
    fn go(&self) -> String {
        "Called Trait A method".to_string()
    }
}

impl TraitB for MyStruct {
    fn go(&self) -> String {
        "Called Trait B method".to_string()
    }
}

fn main() {
    let my_struct = MyStruct;

    // Calling the inherent method
    println!("{}", my_struct.go());

    // Calling the method from TraitA
    println!("{}", TraitA::go(&my_struct));

    // Calling the method from TraitB
    println!("{}", TraitB::go(&my_struct));
}

One exception to this is when the methods are all from the same generic trait with different type parameters for the implementations. In that case, if the signature is sufficient to determine which implementation to use, the trait does not need to be specified to resolve the method. This is common when using the From trait.

struct Widget;

impl From<i32> for Widget {
    fn from(x: i32) -> Widget {
        Widget
    }
}

impl From<f32> for Widget {
    fn from(x: f32) -> Widget {
        Widget
    }
}

fn main() {
    // Calls <Widget as From<i32>>::from
    let w1 = Widget::from(5);
    // Calls <Widget as From<f32>>::from
    let w2 = Widget::from(1.0);
}

Overloaded operators

In C++ most operators can be overloaded either with a free-standing function or by providing a method defining the operator on a class.

Rust provides operator overloading via implementation of specific traits. Implementing a method with the same name as the one required by the trait will not make a type usable with the operator if the trait is not implemented.

struct Vec2 {
  double x;
  double y;

  Vec2 operator+(const Vec2 &other) const {
    return Vec2{x + other.x, y + other.y};
  }
};

int main() {
  Vec2 a{1.0, 2.0};
  Vec2 b{3.0, 4.0};
  Vec2 c = a + b;
}
#[derive(Clone, Copy)]
struct Vec2 {
    x: f64,
    y: f64,
}

impl std::ops::Add for &Vec2 {
    type Output = Vec2;

    // Note that the type of self here is &Vec2.
    fn add(self, other: Self) -> Vec2 {
        Vec2 {
            x: self.x + other.x,
            y: self.y + other.y,
        }
    }
}

fn main() {
    let a = Vec2 { x: 1.0, y: 2.0 };
    let b = Vec2 { x: 3.0, y: 4.0 };
    let c = &a + &b;
}

Additionally, it is sometimes best to provide trait implementations for various combinations of reference types, especially for types that implement the Copy trait, since such types are likely to be used both with and without taking a reference. For the example above, that involves defining four implementations.

#[derive(Clone, Copy)]
struct Vec2 {
    x: f64,
    y: f64,
}

impl std::ops::Add<&Vec2> for &Vec2 {
    type Output = Vec2;

    fn add(self, other: &Vec2) -> Vec2 {
        Vec2 {
            x: self.x + other.x,
            y: self.y + other.y,
        }
    }
}

// If Vec2 weren't so small, it might be desirable to re-use space in the below
// implementations, since they take ownership.

impl std::ops::Add<Vec2> for &Vec2 {
    type Output = Vec2;

    fn add(self, other: Vec2) -> Vec2 {
        Vec2 {
            x: self.x + other.x,
            y: self.y + other.y,
        }
    }
}

impl std::ops::Add<&Vec2> for Vec2 {
    type Output = Vec2;

    fn add(self, other: &Vec2) -> Vec2 {
        Vec2 {
            x: self.x + other.x,
            y: self.y + other.y,
        }
    }
}

impl std::ops::Add<Vec2> for Vec2 {
    type Output = Vec2;

    fn add(self, other: Vec2) -> Vec2 {
        Vec2 {
            x: self.x + other.x,
            y: self.y + other.y,
        }
    }
}

fn main() {
    let a = Vec2 { x: 1.0, y: 2.0 };
    let b = Vec2 { x: 3.0, y: 4.0 };
    let c = a + b;
}

The repetition can be addressed by defining a macro.

#[derive(Clone, Copy)]
struct Vec2 {
    x: f64,
    y: f64,
}

macro_rules! impl_add_vec2 {
    ($lhs:ty, $rhs:ty) => {
        impl std::ops::Add<$rhs> for $lhs {
            type Output = Vec2;

            fn add(self, other: $rhs) -> Vec2 {
                Vec2 {
                    x: self.x + other.x,
                    y: self.y + other.y,
                }
            }
        }
    };
}

impl_add_vec2!(&Vec2, &Vec2);
impl_add_vec2!(&Vec2, Vec2);
impl_add_vec2!(Vec2, &Vec2);
impl_add_vec2!(Vec2, Vec2);

fn main() {
    let a = Vec2 { x: 1.0, y: 2.0 };
    let b = Vec2 { x: 3.0, y: 4.0 };
    let c = a + b;
}

Default arguments

Default arguments in C++ are sometimes implemented in terms of function overloading.

Rust does not have default arguments. Instead, arguments with Option type can be used to provide a similar effect.

unsigned int shift(unsigned int x,
                   unsigned int shiftAmount) {
  return x << shiftAmount;
}

unsigned int shift(unsigned int x) {
  return shift(x, 2);
}

int main() {
  unsigned int a = shift(7); // shifts by 2
}
use std::ops::Shl;

fn shift(
    x: u32,
    shift_amount: Option<u32>,
) -> u32 {
    let a = shift_amount.unwrap_or(2);
    x.shl(a)
}

fn main() {
    let res = shift(7, None); // shifts by 2
}
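
When a function has several optional parameters, another approach that is sometimes used (a different technique from the Option parameter above) is an options struct that implements Default. The following is only a sketch: ShiftOptions and its wrap field are hypothetical additions made so that there is more than one default to show.

```rust
// Hypothetical options struct; the field defaults play the role of
// default arguments.
struct ShiftOptions {
    amount: u32,
    wrap: bool,
}

impl Default for ShiftOptions {
    fn default() -> Self {
        ShiftOptions {
            amount: 2,
            wrap: false,
        }
    }
}

fn shift(x: u32, opts: ShiftOptions) -> u32 {
    if opts.wrap {
        x.rotate_left(opts.amount)
    } else {
        x << opts.amount
    }
}

fn main() {
    // All defaults: shifts by 2.
    assert_eq!(shift(7, ShiftOptions::default()), 28);

    // Override one field and take the rest from Default.
    let by_one = ShiftOptions {
        amount: 1,
        ..Default::default()
    };
    assert_eq!(shift(7, by_one), 14);
}
```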

Unrelated overloads

The lack of completely ad hoc overloading in Rust encourages the definition of traits that capture essential commonalities between types, so that functions can be implemented in terms of those interfaces and used generally. However, it also sometimes encourages the anti-pattern of defining traits that only capture incidental commonalities (such as having methods of the same name).

It is better programming practice in those cases to simply define separate functions, rather than to shoehorn in a trait where no real commonality exists.

This is commonly seen in Rust in the naming conventions for constructor static methods. Instead of them all being named new with different arguments, they are usually given names of the form from_something, where the something varies based on what the value is being constructed from, or a more specific name if appropriate.

#![allow(unused)]
fn main() {
struct Vec3 {
    x: f64,
    y: f64,
    z: f64,
}

impl Vec3 {
    fn from_x(x: f64) -> Vec3 {
        Vec3 { x, y: 0.0, z: 0.0 }
    }

    fn from_y(y: f64) -> Vec3 {
        Vec3 { x: 0.0, y, z: 0.0 }
    }

    fn diagonal(d: f64) -> Vec3 {
        Vec3 { x: d, y: d, z: d }
    }
}
}

This differs from the conversion methods supported by the From and Into traits, which have the additional purpose of supporting trait bounds on generic functions which should take any type convertible to a specific type.

Object identity

In C++ the pointer to an object is sometimes used to represent its identity in terms of the logic of a program.

In some cases, this is a standard optimization, such as when implementing the copy assignment operator.

In other cases the pointer value is used as a logical identity to distinguish between specific instances of an object that otherwise have the same properties. For example, representing a labeled graph where there may be distinct nodes that have the same label.

In Rust, some of these cases are not applicable, and other cases are typically handled instead by implementing a synthetic notion of identity for the values.

Overloading copy assignment and equality comparison operators

For example, when implementing the copy-assignment operator, one might short-circuit when the copied object and the assignee are the same. Note that in this use the pointer values are not stored.

This kind of optimization is unnecessary when implementing Rust's equivalent to the copy assignment operator, Clone::clone_from. The type of Clone::clone_from prevents the same object from being passed as both arguments, because one of the arguments is a mutable reference, which is exclusive, and so prevents the other reference argument from referring to the same object.

#include <string>

struct Person
{
    std::string name;
    // many other expensive-to-copy fields

    Person& operator=(const Person& other) {
        // compare object identity first
        if (this != &other) {
            this->name = other.name;
            // copy the other expensive-to-copy fields
        }

        return *this;
    }
};
#![allow(unused)]
fn main() {
struct Person {
    name: String,
}

impl Clone for Person {
    fn clone(&self) -> Self {
        Self { name: self.name.clone() }
    }

    fn clone_from(&mut self, source: &Self) {
        // self and source cannot be the same here,
        // because that would mean there are a
        // mutable and an immutable reference to
        // the same memory location. Therefore, a
        // check for assignment to self is not
        // needed, even for the purpose of
        // optimization.

        self.name.clone_from(&source.name);
    }
}
}

In cases in C++ where most comparisons are between an object and itself (e.g., the object's primary use is to be stored in a hash set), and comparison of unequal objects is expensive, comparing object identity might be used as optimization for the equality comparison operator overload.

For supporting similar operations in Rust, std::ptr::eq can be used.

#include <string>

struct Person
{
    std::string name;
    // many other expensive-to-compare fields
};


bool operator==(const Person& lhs, const Person& rhs) {
    // compare object identity first
    if (&lhs == &rhs) {
        return true;
    }

    // compare the other expensive-to-compare fields

    return true;
}
#![allow(unused)]
fn main() {
struct Person {
    name: String,
    // many other expensive-to-compare fields
}

impl PartialEq for Person {
    fn eq(&self, other: &Self) -> bool {
        if std::ptr::eq(self, other) {
            return true;
        }
        // compare other expensive-to-compare fields

        true
    }
}

impl Eq for Person {}
}
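
To make the difference between value equality and identity concrete, here is a small sketch. The helper same_object is hypothetical, and PartialEq is derived here rather than hand-written as above. Note that the address of a value can change when the value is moved, so this kind of comparison is only meaningful between references that are live at the same time.

```rust
use std::rc::Rc;

#[derive(PartialEq)]
struct Person {
    name: String,
}

// Hypothetical helper: true only if both references point at the same object.
fn same_object(a: &Person, b: &Person) -> bool {
    std::ptr::eq(a, b)
}

fn main() {
    let a = Person { name: "Alice".to_string() };
    let b = Person { name: "Alice".to_string() };

    // Equal by value, but two distinct objects.
    assert!(a == b);
    assert!(!same_object(&a, &b));
    assert!(same_object(&a, &a));

    // For reference-counted values, Rc::ptr_eq compares the allocations.
    let shared = Rc::new(Person { name: "Bob".to_string() });
    let alias = Rc::clone(&shared);
    assert!(Rc::ptr_eq(&shared, &alias));
}
```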

Distinguishing between values in a relational structure

The other use is when relationships between values are represented using a data structure external to the values, such as when representing a labeled graph in which multiple nodes might share the same label, but have edges between different sets of other nodes. This differs from the earlier case because the pointer value is preserved.

One real-world example of this is in the LLVM codebase, where occurrences of declarations, statements, and expressions in the AST are distinguished by object identity. For example, variable expressions (class DeclRefExpr) contain the pointer to the occurrence of the declaration to which the variable refers.

Similarly, when comparing whether two variable declarations represent declarations of the same variable, a pointer to some canonical VarDecl is used:

VarDecl *VarDecl::getCanonicalDecl();

bool CapturedStmt::capturesVariable(const VarDecl *Var) const {
  for (const auto &I : captures()) {
    if (!I.capturesVariable() && !I.capturesVariableByCopy())
      continue;
    if (I.getCapturedVar()->getCanonicalDecl() == Var->getCanonicalDecl())
      return true;
  }

  return false;
}

This kind of use is often discouraged in C++ because of the risk of use-after-free bugs, but might be used in performance sensitive applications where either storing the memory to represent the mapping or the additional indirection to resolve an entity's value from its identity is cost prohibitive.

In Rust it is generally preferred to represent the identity of the objects with synthetic identifiers. This is in part a technique for modeling self-referential data structures.

As an example, one popular Rust graph library, petgraph, uses u32 as its default node identity type. This incurs the cost of an extra lookup to resolve the synthetic identifier to the label of the represented node, as well as the extra memory required to store the mapping from nodes to labels.

A simplified graph representation using the same synthetic identifier technique would look like the following, which represents the node identities by their index in the vectors that represent the labels and the edges.

#![allow(unused)]
fn main() {
enum Color {
    Red,
    Blue
}

struct Graph {
    /// Maps from node id to node labels, which here are colors.
    nodes_labels: Vec<Color>,

    /// Maps from node id to adjacent nodes ids.
    edges: Vec<Vec<usize>>,
}
}
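
A possible way to use such a representation is sketched below. The methods new, add_node, and add_edge are hypothetical additions; the point is that two nodes with the same label still receive distinct identifiers and can have different edges.

```rust
#[derive(Debug, PartialEq)]
enum Color {
    Red,
    Blue,
}

struct Graph {
    /// Maps from node id to node labels, which here are colors.
    nodes_labels: Vec<Color>,

    /// Maps from node id to adjacent node ids.
    edges: Vec<Vec<usize>>,
}

impl Graph {
    fn new() -> Graph {
        Graph {
            nodes_labels: Vec::new(),
            edges: Vec::new(),
        }
    }

    // Returns the synthetic id of the new node: its index in both vectors.
    fn add_node(&mut self, label: Color) -> usize {
        self.nodes_labels.push(label);
        self.edges.push(Vec::new());
        self.nodes_labels.len() - 1
    }

    fn add_edge(&mut self, from: usize, to: usize) {
        self.edges[from].push(to);
    }
}

fn main() {
    let mut g = Graph::new();
    let a = g.add_node(Color::Red);
    let b = g.add_node(Color::Red);
    let c = g.add_node(Color::Blue);
    g.add_edge(a, c);

    // Same label, different identity, different edges.
    assert_eq!(g.nodes_labels[a], g.nodes_labels[b]);
    assert_ne!(a, b);
    assert_eq!(g.edges[a], vec![c]);
    assert!(g.edges[b].is_empty());
}
```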

If performance requirements make the use of synthetic identifiers unacceptable, then it may be necessary to prevent the value from being moved. The Pin and PhantomPinned structs can be used to achieve an effect similar to deleting the move constructor in C++.
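
The following is a minimal sketch of that approach, not a complete design. The Node type and the helpers new_node and address_of are hypothetical. The handle (a Pin<Box<Node>>) can still be moved around, but the Node it points to stays at one address, and because Node contains PhantomPinned, safe code cannot obtain a plain &mut Node with which to move it out.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

// Hypothetical node type whose address is used as its identity.
struct Node {
    label: String,
    // Opts the type out of Unpin, so a pinned Node cannot be moved
    // by safe code.
    _pin: PhantomPinned,
}

fn new_node(label: &str) -> Pin<Box<Node>> {
    Box::pin(Node {
        label: label.to_string(),
        _pin: PhantomPinned,
    })
}

fn address_of(node: &Pin<Box<Node>>) -> *const Node {
    let node_ref: &Node = node;
    node_ref as *const Node
}

fn main() {
    let node = new_node("a");
    let before = address_of(&node);

    // This moves the handle, not the Node on the heap.
    let moved = node;

    assert_eq!(before, address_of(&moved));
    assert_eq!(moved.label, "a");
}
```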

Out parameters

There are several idioms in C++ that involve the use of out parameters: passing pointers or references to functions for the function to mutate to provide its results.

The chapters in this section address idiomatic ways to achieve the same goals that out parameters are used for in C++. Many of the Rust idioms resemble the recommended alternatives to out parameters when programming against newer C++ standards.

Multiple return values

One idiom for returning multiple values from a function or method in C++ is to pass in references to which the values can be assigned.

There are several reasons why this idiom might be used:

  • compatibility with versions of C++ earlier than C++11,
  • working in a codebase that uses a C style of C++, or
  • performance concerns.

The idiomatic translation of this program into Rust makes use of either tuples or a named structure for the return type.

void get_point(int &x, int &y) {
  x = 5;
  y = 6;
}

int main() {
  int x, y;
  get_point(x, y);
  // ...
}
fn get_point() -> (i32, i32) {
    (5, 6)
}

fn main() {
    let (x, y) = get_point();
    // ...
}

Rust has a dedicated tuple syntax and supports pattern matching with let bindings in part to support use cases like this one.
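
The named-structure alternative mentioned earlier can be sketched in the same way. The Point struct here is an assumption for illustration; a struct pattern in the let binding plays the same role as the tuple pattern.

```rust
#[derive(Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

fn get_point() -> Point {
    Point { x: 5, y: 6 }
}

fn main() {
    // Destructure the fields directly into local variables.
    let Point { x, y } = get_point();
    assert_eq!((x, y), (5, 6));
}
```

A named structure is likely the better choice when the values have the same type and could be confused by position.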

Problems with the direct transliteration

It is possible to transliterate the original example that uses out parameters to Rust, but Rust requires the initialization of the variables before they can be passed to a function. The resulting program is not idiomatic Rust.

// NOT IDIOMATIC RUST
fn get_point(x: &mut i32, y: &mut i32) {
    *x = 5;
    *y = 6;
}

fn main() {
    let mut x = 0; // initialized to arbitrary values
    let mut y = 0;
    get_point(&mut x, &mut y);
    // ...
}

This approach requires assigning arbitrary initial values to the variables and making the variables mutable, both of which make it harder for the compiler to help with avoiding programming errors.

Additionally, the Rust compiler is tuned for optimizing the idiomatic version of the program, and produces a significantly faster binary for that version.

In situations where the performance of memory allocation is a concern (such as when it is necessary to reuse entire buffers in memory), the trade-offs may be different. That situation is discussed in the chapter on pre-allocated buffers.

Similarities with idiomatic C++ since C++11

In C++11 and later, std::pair and std::tuple are available for returning multiple values instead of assigning to reference parameters.

#include <tuple>
#include <utility>

std::pair<int, int> get_point() {
  return std::make_pair(5, 6);
}

int main() {
  int x, y;
  std::tie(x, y) = get_point();
  // ...
}

This more closely aligns with the normal Rust idiom for returning multiple values.

Optional return values

One idiom in C++ for optionally producing a result from a method or function is to use a reference parameter along with a boolean or integer return value to indicate whether the result was produced. This might be done for the same reasons as for using out parameters for multiple return values:

  • compatibility with versions of C++ earlier than C++11,
  • working in a codebase that uses a C style of C++, and
  • performance concerns.

The idiomatic Rust approach for optionally returning a value is to return a value of type Option.

#include <iostream>

bool safe_divide(unsigned int dividend,
                 unsigned int divisor,
                 unsigned int &quotient) {
  if (divisor != 0) {
    quotient = dividend / divisor;
    return true;
  } else {
    return false;
  }
}

void go(unsigned int dividend,
        unsigned int divisor) {
  unsigned int quotient;
  if (safe_divide(dividend, divisor, quotient)) {
    std::cout << quotient << std::endl;
  } else {
    std::cout << "Division failed!" << std::endl;
  }
}

int main() {
  go(10, 2);
  go(10, 0);
}
fn safe_divide(
    dividend: u32,
    divisor: u32,
) -> Option<u32> {
    if divisor != 0 {
        Some(dividend / divisor)
    } else {
        None
    }
}

fn go(dividend: u32, divisor: u32) {
    match safe_divide(dividend, divisor) {
        Some(quotient) => {
            println!("{}", quotient);
        }
        None => {
            println!("Division failed!");
        }
    }
}

fn main() {
    go(10, 2);
    go(10, 0);
}

When there is useful information to provide in the failing case, the Result type can be used instead. The chapter on error handling describes the use of Result.
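
As a brief sketch of what that looks like for the division example, assuming a hypothetical DivideError type and function name checked_divide:

```rust
// Hypothetical error type describing why the division failed.
#[derive(Debug, PartialEq)]
enum DivideError {
    DivisionByZero,
}

fn checked_divide(dividend: u32, divisor: u32) -> Result<u32, DivideError> {
    if divisor != 0 {
        Ok(dividend / divisor)
    } else {
        Err(DivideError::DivisionByZero)
    }
}

fn main() {
    assert_eq!(checked_divide(10, 2), Ok(5));
    assert_eq!(checked_divide(10, 0), Err(DivideError::DivisionByZero));
}
```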

Returning a pointer

When the value being returned is a pointer, another common idiom in C++ is to use nullptr to represent the optional case. In the Rust translation of that idiom, Option is also used, along with a reference type, such as & or Box. See the chapter on using nullptr as a sentinel value for more details.
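
A minimal sketch of the reference case, using a hypothetical lookup function first_even:

```rust
// Hypothetical lookup: returns a reference to the first even element, if any.
fn first_even(values: &[u32]) -> Option<&u32> {
    values.iter().find(|v| **v % 2 == 0)
}

fn main() {
    let values = [1u32, 3, 4, 5];
    let odds = [1u32, 3];

    assert_eq!(first_even(&values), Some(&4));
    assert_eq!(first_even(&odds), None);

    // Option<&T> is represented like a nullable pointer, so the Option
    // wrapper adds no space overhead compared to a raw pointer.
    assert_eq!(
        std::mem::size_of::<Option<&u32>>(),
        std::mem::size_of::<*const u32>()
    );
}
```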

Problems with the direct transliteration

It is possible to transliterate the original example that uses out parameters to Rust, but the resulting code is not idiomatic.

// NOT IDIOMATIC RUST
fn safe_divide(dividend: u32, divisor: u32, quotient: &mut u32) -> bool {
    if divisor != 0 {
        *quotient = dividend / divisor;
        true
    } else {
        false
    }
}

fn go(dividend: u32, divisor: u32) {
    let mut quotient: u32 = 0; // initialized to arbitrary value
    if safe_divide(dividend, divisor, &mut quotient) {
        println!("{}", quotient);
    } else {
        println!("Division failed!");
    }
}

fn main() {
    go(10, 2);
    go(10, 0);
}

This has the same problems as using out parameters for multiple return values.

Similarities with C++ since C++17

C++17 and later offer std::optional, which can be used to express optional return values in a way similar to the idiomatic Rust example.

#include <iostream>
#include <optional>

std::optional<unsigned int> safe_divide(unsigned int dividend,
                                        unsigned int divisor) {
  if (divisor != 0) {
    return std::optional<unsigned int>(dividend / divisor);
  } else {
    return std::nullopt;
  }
}

void go(unsigned int dividend, unsigned int divisor) {
  if (auto quotient = safe_divide(dividend, divisor)) {
    std::cout << *quotient << std::endl;
  } else {
    std::cout << "Division failed!" << std::endl;
  }
}

int main() {
  go(10, 2);
  go(10, 0);
}

Helpful Option utilities

Rust provides several syntactic sugars for simplifying use of functions that return Option. If a failure should be propagated to the caller, then use the ? operator:

#![allow(unused)]
fn main() {
fn safe_divide(dividend: u32, divisor: u32) -> Option<u32> {
    if divisor != 0 {
        Some(dividend / divisor)
    } else {
        None
    }
}

fn go(dividend: u32, divisor: u32) -> Option<()> {
    let quotient = safe_divide(dividend, divisor)?;
    println!("{}", quotient);
    Some(())
}
}

If None should not be propagated, it is sometimes clearer to use let-else syntax:

fn safe_divide(dividend: u32, divisor: u32) -> Option<u32> {
    if divisor != 0 {
        Some(dividend / divisor)
    } else {
        None
    }
}

fn go(dividend: u32, divisor: u32) {
    let Some(quotient) = safe_divide(dividend, divisor) else {
        println!("Division failed!");
        return;
    };
    println!("{}", quotient);
}

fn main() {
    go(10, 2);
    go(10, 0);
}

If there is a default value that should be used in the None case, the Option::unwrap_or, Option::unwrap_or_else, Option::unwrap_or_default, or Option::unwrap methods can be used:

fn safe_divide(dividend: u32, divisor: u32) -> Option<u32> {
    if divisor != 0 {
        Some(dividend / divisor)
    } else {
        None
    }
}

fn expensive_computation() -> u32 {
    // ...
    0
}

fn go(dividend: u32, divisor: u32) {
    // If None, returns the given value.
    let result = safe_divide(dividend, divisor).unwrap_or(0);

    // If None, returns the result of calling the given function.
    let result2 = safe_divide(dividend, divisor).unwrap_or_else(expensive_computation);

    // If None, returns Default::default(), which is 0 for u32.
    let result3 = safe_divide(dividend, divisor).unwrap_or_default();

    // If None, panics. Prefer the other methods!
    // let result4 = safe_divide(dividend, divisor).unwrap();
}

fn main() {
    go(10, 2);
    go(10, 0);
}

In performance-sensitive code where you have manually checked that the result is guaranteed to be Some, Option::unwrap_unchecked can be used, but is an unsafe method.

There are additional utility methods that enable concise handling of Option values, which this book covers in the chapter on exceptions and error handling.

An alternative approach

An alternative approach in Rust to returning optional values is to require that the caller of a function prove that the value with which they call a function will not result in the failing case.

For the above safe division example, this involves the caller guaranteeing that the provided divisor is non-zero. In the following example this is done with a dynamic check. In other contexts the evidence needed may be available statically, provided from callers further upstream, or used more than once. In those cases, this approach reduces both runtime cost and code complexity.

use std::convert::TryFrom;
use std::num::NonZero;

fn safe_divide(dividend: u32, divisor: NonZero<u32>) -> u32 {
    // This is more efficient because the division-by-zero check is skipped.
    dividend / divisor
}

fn go(dividend: u32, divisor: u32) {
    let Ok(safe_divisor) = NonZero::try_from(divisor) else {
        println!("Can't divide!");
        return;
    };

    let quotient = safe_divide(dividend, safe_divisor);
    println!("{}", quotient);
}

fn main() {
    go(10, 2);
    go(10, 0);
}

Pre-allocated buffers

There are situations where large quantities of data need to be returned from a function that will be called repeatedly, so that incurring the copies involved in returning by value or repeated heap allocations would be cost prohibitive. Some of these situations include:

  • performing file or network IO,
  • communicating with graphics hardware,
  • communicating with hardware on embedded systems, or
  • implementing cryptography algorithms.

In these situations, C++ programs tend to pre-allocate buffers that are reused for all calls. This also usually enables allocating the buffer on the stack, rather than having to use dynamic storage.

The following example pre-allocates a buffer and reads a large file into it within a loop.

#include <fstream>

int main() {
  std::ifstream file("/path/to/file");
  if (!file.is_open()) {
    return -1;
  }

  char buf[1024];
  while (file.good()) {
    file.read(buf, sizeof buf);
    std::streamsize count = file.gcount();

    // use data in buf
  }

  return 0;
}
use std::fs::File;
use std::io::{BufReader, Read};

fn main() -> Result<(), std::io::Error> {
    let mut f = BufReader::new(File::open(
        "/path/to/file",
    )?);

    let mut buf = [0u8; 1024];

    loop {
        let count = f.read(&mut buf)?;
        if count == 0 {
            break;
        }

        // use data in buf
    }

    Ok(())
}

The major difference between the C++ program and the Rust program is that in the Rust program the buffer must be initialized before it can be used. In most cases, this one-time initialization cost is not significant. When it is, unsafe Rust is required to avoid the initialization.

The technique for avoiding initialization makes use of std::mem::MaybeUninit. Examples of safe usage of MaybeUninit are given in the API documentation for the type.
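
As a rough illustration of the shape such code takes, the following sketch fills part of an uninitialized buffer and exposes only the initialized prefix. The function fill is hypothetical and stands in for whatever actually produces the data. It is a minimal sketch under those assumptions, not a recommended IO pattern.

```rust
use std::mem::MaybeUninit;

// Hypothetical producer: writes up to four bytes into the buffer and
// returns the prefix that it initialized.
fn fill(buf: &mut [MaybeUninit<u8>]) -> &[u8] {
    let n = buf.len().min(4);
    for (i, slot) in buf.iter_mut().take(n).enumerate() {
        slot.write(i as u8);
    }
    // SAFETY: the first n elements were initialized by the loop above,
    // and MaybeUninit<u8> has the same layout as u8.
    unsafe { std::slice::from_raw_parts(buf.as_ptr().cast::<u8>(), n) }
}

fn main() {
    // No bytes are written when the buffer is created.
    let mut buf = [MaybeUninit::<u8>::uninit(); 1024];

    let data = fill(&mut buf);
    println!("{:?}", data); // prints [0, 1, 2, 3]
}
```

The unsafe block is the part that the compiler cannot check: the programmer is responsible for ensuring that only initialized elements are ever read.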

The IO API in stable Rust does not include support for MaybeUninit. Instead, there is a new safe API being developed that will enable avoiding initialization without requiring unsafe Rust in code that uses the API.

If the callee might need to grow the provided buffer and dynamic allocation is allowed, then a &mut Vec<T> can be used instead of &mut [T]. This is similar to providing a std::vector<T>& in C++. To avoid unnecessary reallocation, the vector can be created using Vec::<T>::with_capacity(n).
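
A minimal sketch of the growable variant follows. The function produce is hypothetical; it clears the caller's vector and refills it, so the allocation made by with_capacity is reused across calls as long as the data fits.

```rust
// Hypothetical producer: replaces the contents of the caller-provided
// buffer with n bytes, growing the buffer only if it is too small.
fn produce(out: &mut Vec<u8>, n: usize) {
    // clear() drops the contents but keeps the allocation.
    out.clear();
    out.extend((0..n).map(|i| i as u8));
}

fn main() {
    // Allocate once, up front.
    let mut buf = Vec::<u8>::with_capacity(1024);

    for n in [4, 8, 2] {
        produce(&mut buf, n);
        // use data in buf
        println!("{} bytes", buf.len());
    }
}
```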

A note on reading files

While the examples here use IO to demonstrate re-using pre-allocated buffers, there are higher-level interfaces available for reading from Files, both from the Read and BufRead traits, and from convenience functions in std::io and in std::fs.

The techniques described here are useful, however, in other situations where a reusable buffer is required, such as when interacting with hardware APIs, when using existing C or C++ libraries, or when implementing algorithms that produce large amounts of data in chunks, such as cryptography algorithms.
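
For comparison, the higher-level interfaces mentioned above can be sketched as follows. The helper names read_all and read_lines are hypothetical. An in-memory byte slice stands in for a file so that the sketch is self-contained; both helpers accept a File or a BufReader<File> in the same way.

```rust
use std::io::{self, BufRead, Read};

// Reads an entire source into a String. For a file on disk,
// std::fs::read_to_string does the same thing in one call.
fn read_all(mut source: impl Read) -> io::Result<String> {
    let mut text = String::new();
    source.read_to_string(&mut text)?;
    Ok(text)
}

// Collects the lines of any buffered reader, stopping at the
// first IO error.
fn read_lines(source: impl BufRead) -> io::Result<Vec<String>> {
    source.lines().collect()
}

fn main() -> io::Result<()> {
    // &[u8] implements both Read and BufRead.
    let data: &[u8] = b"first\nsecond\n";

    println!("{:?}", read_all(data)?);
    println!("{:?}", read_lines(data)?);
    Ok(())
}
```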

Upcoming changes and BorrowedBuf

The Rust community is refining approaches to working with uninitialized buffers. On the nightly branch of Rust, one can use BorrowedBuf to achieve the same results as when using slices of MaybeUninit, but without having to write any unsafe code. The IO APIs for avoiding unnecessary initialization use BorrowedBuf instead of slices of MaybeUninit.

Curiously recurring template pattern (CRTP)

The C++ curiously recurring template pattern is used to make the concrete type of the derived class available in the definition of methods defined in the base class.

Sharing implementations with static polymorphism

The basic use of the CRTP is for reducing redundancy in implementations that make use of static polymorphism. In this use case, the this pointer is cast to the type provided by the template parameter so that methods from the derived class can be called. This enables methods implemented in the base class to call methods in the derived class without having to declare them virtual, avoiding the cost of dynamic dispatch.

In the following example, Triangle and Square have a common implementation of twiceArea without the need for dynamic dispatch. This use case is addressed in Rust using default trait methods.

#include <iostream>

template <typename T>
struct Shape {
  // This implementation is shared and can call
  // the area method from derived classes without
  // declaring it virtual.
  double twiceArea() {
    return 2.0 * static_cast<T *>(this)->area();
  }
};

struct Triangle : public Shape<Triangle> {
  double base;
  double height;

  Triangle(double base, double height)
      : base(base), height(height) {}

  double area() {
    return 0.5 * base * height;
  }
};

struct Square : public Shape<Square> {
  double side;

  Square(double side) : side(side) {}

  double area() {
    return side * side;
  }
};

int main() {
  Triangle triangle{2.0, 1.0};
  Square square{2.0};

  std::cout << triangle.twiceArea() << std::endl;
  std::cout << square.twiceArea() << std::endl;
}

trait Shape {
    fn area(&self) -> f64;

    fn twice_area(&self) -> f64 {
        2.0 * self.area()
    }
}

struct Triangle {
    base: f64,
    height: f64,
}

impl Shape for Triangle {
    fn area(&self) -> f64 {
        0.5 * self.base * self.height
    }
}

struct Square {
    side: f64,
}

impl Shape for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
}

fn main() {
    let triangle = Triangle {
        base: 2.0,
        height: 1.0,
    };
    let square = Square { side: 2.0 };
    println!("{}", triangle.twice_area());
    println!("{}", square.twice_area());
}

Nothing additional needs to be done for the default method to invoke area statically because calls to methods on self are always resolved statically in Rust. This is possible because Rust does not have inheritance between concrete types. Despite being defined in the trait, the default method is actually implemented as part of the implementing struct.
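
The same static resolution applies to code that uses the trait generically. In the following sketch, the function report is hypothetical. Because it takes its argument through a trait bound rather than a trait object, the compiler generates a separate copy for each concrete type, much like a C++ function template, and no dynamic dispatch is involved.

```rust
trait Shape {
    fn area(&self) -> f64;

    fn twice_area(&self) -> f64 {
        2.0 * self.area()
    }
}

struct Square {
    side: f64,
}

impl Shape for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
}

// Hypothetical generic function: S is known at compile time, so the
// call to twice_area (and its call to area) is resolved statically.
fn report<S: Shape>(shape: &S) -> f64 {
    shape.twice_area()
}

fn main() {
    let square = Square { side: 2.0 };
    println!("{}", report(&square)); // prints 8
}
```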

Method chaining

Another common use for the CRTP is for implementing method chaining when an implementation of a method to be chained is provided by a base class.

In C++ the template parameter is used to ensure that the type returned from the shared function is that of the derived class, so that further methods defined in the derived class can be called on it. The template parameter is also used to call a method on the derived type without declaring the method as virtual.

In Rust the template parameter is not required because the Self type is available in traits to refer to the type of the implementing struct.

#include <iostream>
#include <span>
#include <string>
#include <vector>

// D is the type of the derived class
template <typename D>
struct Combinable {
  D combineWith(D &d);

  // concat is implemented in the base class, but
  // operates on values of the derived class.
  D concat(std::span<D> vec) {
    D acc(*static_cast<D *>(this));

    for (D &v : vec) {
      acc = acc.combineWith(v);
    }

    return acc;
  }
};

struct Sum : Combinable<Sum> {
  int sum;

  Sum(int sum) : sum(sum) {}

  Sum combineWith(Sum s) {
    return Sum(sum + s.sum);
  }

  // Sum includes an additional method that can be
  // chained.
  Sum mult(int n) {
    return Sum(sum * n);
  }
};

int main() {
  Sum s(0);
  std::vector<Sum> v{1, 2, 3, 4};
  Sum x = s.concat(v)
              // Even though concat is part of the
              // base class, it returns a value of
              // the implementing class, making it
              // possible to chain methods
              // specific to that class.
              .mult(2)
              .combineWith(5);
  std::cout << x.sum << std::endl;
}

// No generic type is required: Self already
// refers to the implementing type.
trait Combinable {
    fn combine_with(&self, other: &Self) -> Self;

    // concat has a default implementation in
    // terms of Self.
    fn concat(&self, others: &[Self]) -> Self
    where
        Self: Clone,
    {
        let mut acc = self.clone();

        for v in others {
            acc = acc.combine_with(v);
        }
        acc
    }
}

#[derive(Clone)]
struct Sum(i32);

impl Sum {
    // Sum includes an additional method that can be
    // chained.
    fn mult(&self, n: i32) -> Self {
        Self(self.0 * n)
    }
}

impl Combinable for Sum {
    fn combine_with(&self, other: &Self) -> Self {
        Self(self.0 + other.0)
    }
}

fn main() {
    let s = Sum(0);
    let v = vec![Sum(1), Sum(2), Sum(3), Sum(4)];
    let x = s
        .concat(&v)
        // Even though concat is part of the
        // trait, it returns a value of the
        // implementing type, making it possible
        // to chain methods specific to that type.
        .mult(2)
        .combine_with(&Sum(5));
    println!("{}", x.0)
}

Again, Self can refer to the implementing type because Rust does not have inheritance between concrete types. This contrasts with C++, where a value may be used at any number of concrete types, so it would not be clear which type something like Self should refer to.

Libraries

C++ programs tend to use libraries that either come with operating system distributions or are vendored.

Rust programs tend to rely on a central registry of Rust libraries ("crates") called crates.io (along with a central documentation repository created from the in-code documentation of those crates called docs.rs). Dependencies on crates are managed using the Cargo package manager.

Lib.rs is a good resource for finding popular crates organized by category.

Some specific alternatives

C++ library | Rust alternative
STL UTF-16 and UTF-32 strings | widestring
STL random | rand
STL regex | regex
Boost.Test | cargo test
pybind11 | PyO3
OpenSSL | rustls

If there is a C++ library that you use where you cannot find a Rust alternative, please leave feedback using the link below, letting us know the name and purpose of the library.

Supply chain management

In situations where managing the library supply chain is important, Cargo can be used either with custom self-managed or organization-managed registries or with vendored versions of dependencies fetched from crates.io.

Both approaches provide mechanisms for reviewing dependencies as part of supply chain security.

Solutions for supply chain security that do not involve vendoring or custom registries are in progress.

Attribution notices

This book makes use of the Standard C++ Foundation logo under their posted terms of use.

This book makes use of the Rust logo, including a modified version of the logo, under the Creative Commons CC-BY license, as posted in the rust-artwork repository and under the posted terms of use for the trademark.