Section 9: Safer Systems Programming (featuring Rust)

⚠️ This section is meant to be done in-person with other students. ⚠️

Introduction

By this point in the semester, you've done a significant amount of programming in C/C++ and are likely very familiar with our good friend, GDB. We've debugged issues like unknown segfaults and buffer overwrites and will soon tackle race conditions and deadlocks! A common link between all these errors is that they are a consequence of giving the programmer unfettered access to the underlying memory representation of data, and putting the responsibility on the programmer to manage it properly. This can lead to serious vulnerabilities in widely used software.

In this lab, you will explore some new ways in which programming language designers have tried to make low-level systems programming easier and less error-prone, while maintaining its main advantages: a high degree of control over the memory representation of data and the opportunity for low-level performance optimization.

In particular, we will explore concepts in Rust, a relatively new systems programming language. Rust 1.0 released in 2015, and its language designers had the benefit of 40 years of hindsight compared to C, no need to maintain backwards compatibility, and the opportunity to enshrine in the language some of the ideas that systems programmers had discovered to be useful in the past few decades.

An important disclaimer: We will use Rust to exemplify key concepts in enforcing memory-safety in a systems language, but this lab is not meant to be a Rust tutorial. Some of the ideas you'll encounter in this lab also exist in modern C++ and in other systems languages, such as Go.

If you would like to learn more about Rust, we have linked some resources in the Appendix section at the bottom of this handout!

Assignment

Assignment Installation

First, ensure that your repository has a handout remote. Type:

$ git remote show handout

If this reports an error, run:

$ git remote add handout 
https://github.com/csci0300/cs300-s24-section.git

Then run:

$ git pull
$ git pull handout master

This will merge our Section 7 stencil code with your previous work. If you have any conflicts from previous labs, resolve them before continuing further. Run git push to save your work back to your personal repository.

Part 1: Basic Memory Safety

In these exercises, you will walk through two programs that address classic errors in memory management, for which we have provided the code in C as well as the analogous code in Rust. You might find some of the Rust syntax new, so it will be helpful to compare the two programs side-by-side so that your knowledge of C code can help you make sense of the Rust code! The Rust and C programs do the exact same thing.

Overview of Lifetimes

As you know well by now, any memory that is allocated during a program must at some point be freed. Think back all the way to our discussions about the different segments of memory (data, stack, heap). One of the distinguishing factors between these segments is how long the memory allocated to those regions persists.

Segment	Duration	Method of Freeing
Data	Duration of program	Automatic when program ends
Stack	$\leq$ Duration of function call	Automatic when data goes out of scope
Heap	Until freed	Manual call to `free()`

Looking at the second column of that table lets us introduce the concept of lifetimes. The lifetime of a piece of memory is the span of time after it is initialized but before it is freed. For instance:

Many local variables have a lifetime that's equivalent to the length of a function execution.
A loop index variable has a lifetime equivalent to the duration of the loop.
Global data has the same lifetime as the entire program (this is sometimes called the "static" lifetime).

Task 1.1: Among these three types of memory, it is most difficult to determine the lifetime of heap data. Discuss with your group why this is.

Why do we care about the lifetimes of data? As it turns out, most of the memory bugs we worry about in C and C++ can be framed in terms of lifetime violations! You'll explore a couple of such examples in Part 1 of this section. Before doing so, however, we will introduce one more foundational concept that systems programmers use to reason about memory safety: Ownership.

Ownership

One of the foundational concepts of memory safety in Rust is the notion of ownership of memory. The owner of a piece of memory is responsible for making sure that it gets freed when the memory is no longer needed. Each value is owned by a variable, and can only have one such owner at a time. The variable that owns a value determines its lifetime.

Typical memory errors in low-level languages like C/C++ are often caused by mistaken notions of ownership. Consider the situation where there is no owner, or multiple pieces of code that believe they are the owners:

Situation	Consequence
Two or more owners	Use-after free, double free
No owners	Memory leak (missing free)

Over time, systems programmers realized that careful reasoning about ownership is a good way to avoid memory safety issues like the above. Indeed, Google's most recent C++ style guide says:

It's virtually impossible to manage dynamically allocated memory without some sort of ownership logic.

The C++ language consequently acquired optional notions of ownership, the most common of which are either careful documentation of ownership via comments on APIs (e.g., "the consume function takes ownership of the memory at the data pointer") or the use of smart pointers, which are special classes (e.g., unique_ptr<int> and shared_ptr<int>) that replace "raw" pointers (such as int*) and encode notions of ownership.

In the following exercise, you will explore Rust's notion of ownership, and specifically, how changes of ownership ("moving") work.

Exercise: What is "Moving"?

For the first set of tasks, you will look at move.c. Make sure to read the comments in the code so that you are familiar with what the program is supposed to do.

Task 1.2: Before running any of the code, answer the following questions and record your answers:

What is the memory management error in the C program? Why is it a problem?
What could be a possible way to prevent this type of error? (There's no incorrect answer here, so feel free to use your imagination.)

Now, from within the 1-move directory, compile and run the C program using the commands shown below. The first command will compile the executable move_c, and the second command will run it.

$ make c
$ ./move_c

Now, answer the following question:

What does the program output? Was this expected, or could you imagine another output?

Memory management errors that compile and run without visible errors lead to undefined behavior being introduced into programs. Our example here was small, but if this code was part of a much bigger program where some other functionality depended on a correct output, then much more significant problems could arise.

Rust developers had the advantage of being able to learn from the mistakes made with C, and thus built the Rust compiler to prevent memory-management errors at compile-time.

Now, pull up move.rs and compile the program using the command make rust.

Task 1.3: What happens when you compile the Rust program? Look at the error message and discussion with your group how you think Rust can detect this!

We recommend looking at the "Helpful Background" dropdown for some additional context on this type of memory error.

Helpful Background

This is an example of aliasing. In the C program, we had a pointer to some allocated memory, and then manually set another pointer to point to the same memory. Having multiple references to the same data is dangerous precisely for the reasons you've just seen—if we free some memory while there's still another reference pointing to it, then we now have a seemingly valid reference that's actually pointing to invalid memory. In C, these types of errors can silently compile and run!

⚠️ Stop here until your section TAs have summarized the ownership part! ⚠️

Exercise: Enforcing Lifetimes

You have just explored an example of a use-after-free lifetime violation, where we tried to read from heap-allocated memory whose lifetime had ended upon a call to free().

Looking at the above table, we can spot another situation where lifetime violations can cause memory safety issues: returning references to values on the stack. Since values on the stack live for at most the duration of a function call, returning pointers to stack values (like local variables) can cause segfaults when trying to access that memory elsewhere in the program (it's possible you may have experienced this type of problem this semester!).

In this exercise, you'll explore how Rust's ownership guarantees can make reasoning about lifetimes at compile time (i.e., "statically") easier and even allow the compiler to warn you about lifetime errors at compile time!

Now, read through lifetimes.c until you feel like you have a good sense of what the program is doing. Then, read through lifetimes.rs. Don't be scared by the syntax, lifetimes.rs implements the same functionality as lifetimes.c. If you are struggling to understand what's happening in the Rust program, use the C program as reference. Below, we also explain some of the unfamiliar syntax that you might find in the Rust program.

There are two constructs in the Rust code that may be unfamiliar. They are both in the List struct definition.

Option is a type that specifies that a value could be null. Since references in Rust can't be NULL like they can in C, we use Option instead. The Rust NULL equivalent is None, and its counterpart is Some(...). Therefore, if we have something of type Option<List>, it can either be None or Some(List {...}).
Syntax that looks like 'a is Rust's way of manually annotating lifetimes. You don't need to worry about it for this example, since we'll focus on a case where the Rust compiler can give you lifetime warnings without any help from the developer. Anytime you see 'a, feel free to ignore it.

I want to know more about Rust's lifetime annotation syntax

As we discussed before, sometimes reasoning about lifetimes can be complex, and Rust provides a method for the developer to manually specify lifetimes when they are too complicated for the compiler to infer.

The lifetime annotations on the struct are quite simple: first we introduce a lifetime parameter called 'a ("alpha"). This is what struct List<'a> does. Then, we specify that the reference inside the option needs to have a lifetime of at least 'a (this is what the syntax &'a List means). Finally, we declare that the next element in the list is parameterized by the same lifetime 'a (this is what List<'a> means).

Now, the Rust compiler can make sure we don't have any lifetime violations with the List struct. This happens at compile-time, without running the program. What kind of lifetime violations are we worried about? Any time a struct contains a reference, we have to make sure that the struct doesn't outlive the reference, or else we'll have a dangling reference.

To help us, when we create a new List node, the Rust compiler will examine the lifetime of the reference we use inside and ensure that the List doesn't outlive that reference!

Task 2.1: Before running any of the code, discuss the following questions and record you answer:

What is wrong with lifetimes.c? What do you expect the output to be?

Now, compile and runlifetimes.c and lifetimes.rs.

cd 2-lifetimes
# compile and run the c program
make c
./lifetimes_c
# compile the Rust program
make rust

Task 2.2: Discuss the following questions and record your answer:

Does the output of the C program match what you expect? If not, how can you explain the difference?
What is the output of the Rust compiler? How is the Rust compiler able to give the warning it does?

⚠️ Stop here until your section TAs have summarized the lifetimes part! ⚠️

Part 2: Safer Concurrency

Having seen how systems languages can address basic memory bugs by enforcing ownership and lifetimes, we are now ready to look at how language developers try to make concurrency safer!

Exercise: Locking Discipline

In this exercise, you will look at concurrency.cpp and concurrency.rs. Similar to before, these examples do the exact same thing, but you might find that the Rust code has some unfamiliar terms. We highly recommend you use your knowledge about C++ to make sense of the Rust code, as well as read all the comments that we've annotated the code with!

First, look at concurrency.cpp. This program is longer than the programs you've seen before, but it is not complex: it is a money transfer application! Here is a quick overview of the functions and structs you are given:

account_t: This struct contains two fields, a balance (meant to represent how much money the account contains), and a balance_mtx which is the mutex that protects the balance.
main(): Initializes a single account with a starting balance and a vector of integers that represent deposits/withdrawals that we would like to make to the balance. It initializes a pool of worker threads to handle these requests and then prints out the final balance.
update_account(): Function passed to threads; keeps updating the balance with the next request from the shared vector until there are no requests left.
make_large_vector(): Returns a long vector of requests.

Tip: We highly recommend also reading over the comments in the code to make sure you follow the structure of the program and the purpose of all the variables/functions. This will make the following tasks much easier.

Task 3.1: Before running the program, identify what the error is in the code. Where does it originate, and what do you predict the output to be? Discuss these questions and record your answers.

Compile and run concurrency.cpp as many times as you need to confirm your hypothesis.

cd 3-concurrency
# compile
make cpp
# run
./concurrency_cpp

You might find the following command useful if you are having a hard time finding an example of the error:

while true; do ./concurrency_cpp; done | uniq

Now, take a look at the Rust program (we recommend having it up alongside the C++ version). The overall structure of the code is the same, however there are a couple of notable types and functions which are worth mentioning in order to make more sense of this program.

The Arc type

Think back to the first program you analyzed (move.c/move.rs), where a seemingly-valid reference was left pointing to a block of memory that had already been freed. You discovered that Rust prevents this by only allowing one "owner" for a given block of memory. When the "owner" goes out of scope or is dropped, the compiler knows that it can free that memory.

This gets complicated when we have multiple threads trying to access shared data. In these cases, it is unclear at compile-time when the data can be deallocated and who is responsible for doing so, because the data's lifetime must persist until every shared owner is done with it. In these cases, we wrap the data with an Arc, which stands for Atomic Reference Counter. References to Arc types are given out using a special method, Arc::clone() and are dynamically counted at runtime. (Counting is done atomically to ensure thread-safety.) This guarantees that the memory is freed only when there are zero references—and thus no owner—left in scope.

Arc<Mutex<T>> (where T is some arbitrary type) is a common pattern in Rust code, because most mutex-protected data needs to be shared across threads.

unwrap()

You may have noticed that a method called unwrap() appears in multiple—seemingly unrelated—places in the program.

Certain functions which have the possibility of failing will return either the computed value upon success, or a special error type upon failure. This has the advantage of letting the caller decide how to handle the return value in each case. Calling .unwrap() on the output of one of these functions handles success and failure in the following manner:

If the function ran successfully and returned the computed value, .unwrap() will just return that value.
If the function errored and returned an error type, .unwrap() will panic and halt the program.

Atomics and Ordering

At a high level, the Ordering parameter tells the compiler what order the CPU is allowed perform atomic operations with respect to each other. For this example, the ordering parameter does not matter and you may ignore it. If you'd like to learn more, however, keep reading!

You may notice that when invoking the methods load and store on an AtomicBool, we also pass a parameter called Ordering::Relaxed into the method.

Normally, a compiler can optimize away certain operations or execute certain instructions in different sequences from the code that the programmer wrote in order to improve performance. While atomic types ensure that any operation on them happens as an unbroken sequence, they do not offer any special guarantees on the order in which they are executed in relation to other atomic instructions.

Sometimes however, we need guarantees on when threads can read and write atomic values. For example, we might want to make sure that a thread does not read an atomic value before another thread has written to it. While we are guaranteed that the individual read and write instructions happen atomically, we may not be guaranteed that one happens before another. The Ordering enum tells the compiler how strict it should be with following the sequence of instructions that are specified in the original source code. This practice was actually inherited from C, which established the various different kinds of ordering that are currently used in Rust's memory model.

Task 3.2: Compile the Rust program (make rust) and observe the compilation error. What is the error you get? How do mutexes differ between Rust and C++, and why is Rust's model safer? Discuss these questions and record your answers.

Now, you will fix both programs, starting with the C++ program. If you find yourself stuck on fixing the Rust example, we recommend looking at the code for hints ;). We have also included a description of some relevant methods and types.

Task 3.3: Fix the error in both programs! (Start with C++.) Once you have done so, check your program's correctness with:

make check

Hints for Rust

Mutex<T> has methods lock() and unlock(), both of which can return either a MutexGuard on success or an error on failure. This means that you'll want to use unwrap() to access the mutex guard.

Dereferencing a MutexGuard will give access to the inner data.

⚠️ Stop here until your section TAs have summarized the concurrency part! ⚠️

This is where the section materials end. You might discuss with your TA, e.g., whether you think Rust helps programmers avoid common errors, and what the trade-offs involved might be (i.e., why might some people still want to program in C/C++ instead?).

Congratulations, you have completed the Rust section!

There is obviously much more to learn about Rust and how it changes systems programming. If you have questions, talk to the course staff and take upper-level courses that let you use Rust for projects: in our experience, nothing makes you learn Rust better than having to write a substantial program in it on your own.

Additional Resources on Rust

If you want to learn more about safer systems programming and Rust, check out some of these resources!

Acknowledgements: This section (originally a lab) was developed for CS 300 by Sreshtaa Rajesh, Paul Biberstein, Livia Zhu, Shihang (Vic) Li, Casey Nelson, and Malte Schwarzkopf.