« Back to the main CS 300 website
Section 9: Safer Systems Programming (featuring Rust)
⚠️ This section is meant to be done in-person with other students. ⚠️
Introduction
By this point in the semester, you've done a significant amount of programming in C/C++ and are likely very familiar with our good friend, GDB. We've debugged issues like unknown segfaults and buffer overwrites and will soon tackle race conditions and deadlocks! A common link between all these errors is that they are a consequence of giving the programmer unfettered access to the underlying memory representation of data, and putting the responsibility on the programmer to manage it properly. This can lead to serious vulnerabilities in widely used software.
In this lab, you will explore some new ways in which programming language designers have tried to make low-level systems programming easier and less error-prone, while maintaining its main advantages: a high degree of control over the memory representation of data and the opportunity for low-level performance optimization.
In particular, we will explore concepts in Rust, a relatively new systems programming language. Rust 1.0 released in 2015, and its language designers had the benefit of 40 years of hindsight compared to C, no need to maintain backwards compatibility, and the opportunity to enshrine in the language some of the ideas that systems programmers had discovered to be useful in the past few decades.
An important disclaimer: We will use Rust to exemplify key concepts in enforcing memory-safety in a systems language, but this lab is not meant to be a Rust tutorial. Some of the ideas you'll encounter in this lab also exist in modern C++ and in other systems languages, such as Go.
If you would like to learn more about Rust, we have linked some resources in the Appendix section at the bottom of this handout!
Assignment
Assignment Installation
First, ensure that your repository has a handout
remote. Type:
$ git remote show handout
If this reports an error, run:
$ git remote add handout
https://github.com/csci0300/cs300-s24-section.git
Then run:
$ git pull
$ git pull handout master
This will merge our Section 7 stencil code with your previous work. If you have any conflicts from previous labs, resolve them before continuing further. Run git push
to save your work back to your personal repository.
Part 1: Basic Memory Safety
In these exercises, you will walk through two programs that address classic errors in memory management, for which we have provided the code in C as well as the analogous code in Rust. You might find some of the Rust syntax new, so it will be helpful to compare the two programs side-by-side so that your knowledge of C code can help you make sense of the Rust code! The Rust and C programs do the exact same thing.
Overview of Lifetimes
As you know well by now, any memory that is allocated during a program must at some point be freed. Think back all the way to our discussions about the different segments of memory (data, stack, heap). One of the distinguishing factors between these segments is how long the memory allocated to those regions persists.
Segment |
Duration |
Method of Freeing |
Data |
Duration of program |
Automatic when program ends |
Stack |
Duration of function call |
Automatic when data goes out of scope |
Heap |
Until freed |
Manual call to free() |
Looking at the second column of that table lets us introduce the concept of lifetimes. The lifetime of a piece of memory is the span of time after it is initialized but before it is freed. For instance:
- Many local variables have a lifetime that's equivalent to the length of a function execution.
- A loop index variable has a lifetime equivalent to the duration of the loop.
- Global data has the same lifetime as the entire program (this is sometimes called the "static" lifetime).
Task 1.1: Among these three types of memory, it is most difficult to determine the lifetime of heap data. Discuss with your group why this is.
Why do we care about the lifetimes of data? As it turns out, most of the memory bugs we worry about in C and C++ can be framed in terms of lifetime violations! You'll explore a couple of such examples in Part 1 of this section. Before doing so, however, we will introduce one more foundational concept that systems programmers use to reason about memory safety: Ownership.
Ownership
One of the foundational concepts of memory safety in Rust is the notion of ownership of memory. The owner of a piece of memory is responsible for making sure that it gets freed when the memory is no longer needed. Each value is owned by a variable, and can only have one such owner at a time. The variable that owns a value determines its lifetime.
Typical memory errors in low-level languages like C/C++ are often caused by mistaken notions of ownership. Consider the situation where there is no owner, or multiple pieces of code that believe they are the owners:
Situation |
Consequence |
Two or more owners |
Use-after free, double free |
No owners |
Memory leak (missing free) |
Over time, systems programmers realized that careful reasoning about ownership is a good way to avoid memory safety issues like the above. Indeed, Google's most recent C++ style guide says:
It's virtually impossible to manage dynamically allocated memory without some sort of ownership logic.
The C++ language consequently acquired optional notions of ownership, the most common of which are either careful documentation of ownership via comments on APIs (e.g., "the consume
function takes ownership of the memory at the data
pointer") or the use of smart pointers, which are special classes (e.g., unique_ptr<int>
and shared_ptr<int>
) that replace "raw" pointers (such as int*
) and encode notions of ownership.
In the following exercise, you will explore Rust's notion of ownership, and specifically, how changes of ownership ("moving") work.
Exercise: What is "Moving"?
For the first set of tasks, you will look at move.c
. Make sure to read the comments in the code so that you are familiar with what the program is supposed to do.
Task 1.2: Before running any of the code, answer the following questions and record your answers:
- What is the memory management error in the C program? Why is it a problem?
- What could be a possible way to prevent this type of error? (There's no incorrect answer here, so feel free to use your imagination.)
Now, from within the 1-move
directory, compile and run the C program using the commands shown below. The first command will compile the executable move_c
, and the second command will run it.
$ make c
$ ./move_c
Now, answer the following question:
- What does the program output? Was this expected, or could you imagine another output?
Memory management errors that compile and run without visible errors lead to undefined behavior being introduced into programs. Our example here was small, but if this code was part of a much bigger program where some other functionality depended on a correct output, then much more significant problems could arise.
Rust developers had the advantage of being able to learn from the mistakes made with C, and thus built the Rust compiler to prevent memory-management errors at compile-time.
Now, pull up move.rs
and compile the program using the command make rust
.
Task 1.3: What happens when you compile the Rust program? Look at the error message and discussion with your group how you think Rust can detect this!
We recommend looking at the "Helpful Background" dropdown for some additional context on this type of memory error.
Helpful Background
This is an example of aliasing. In the C program, we had a pointer to some allocated memory, and then manually set another pointer to point to the same memory. Having multiple references to the same data is dangerous precisely for the reasons you've just seen—if we free some memory while there's still another reference pointing to it, then we now have a seemingly valid reference that's actually pointing to invalid memory. In C, these types of errors can silently compile and run!
⚠️ Stop here until your section TAs have summarized the ownership part! ⚠️
Exercise: Enforcing Lifetimes
You have just explored an example of a use-after-free lifetime violation, where we tried to read from heap-allocated memory whose lifetime had ended upon a call to free()
.
Looking at the above table, we can spot another situation where lifetime violations can cause memory safety issues: returning references to values on the stack. Since values on the stack live for at most the duration of a function call, returning pointers to stack values (like local variables) can cause segfaults when trying to access that memory elsewhere in the program (it's possible you may have experienced this type of problem this semester!).
In this exercise, you'll explore how Rust's ownership guarantees can make reasoning about lifetimes at compile time (i.e., "statically") easier and even allow the compiler to warn you about lifetime errors at compile time!
Now, read through lifetimes.c
until you feel like you have a good sense of what the program is doing. Then, read through lifetimes.rs
. Don't be scared by the syntax, lifetimes.rs
implements the same functionality as lifetimes.c
. If you are struggling to understand what's happening in the Rust program, use the C program as reference. Below, we also explain some of the unfamiliar syntax that you might find in the Rust program.
There are two constructs in the Rust code that may be unfamiliar. They are both in the List
struct definition.
Option
is a type that specifies that a value could be null. Since references in Rust can't be NULL
like they can in C, we use Option
instead. The Rust NULL
equivalent is None
, and its counterpart is Some(...)
. Therefore, if we have something of type Option<List>
, it can either be None
or Some(List {...})
.
- Syntax that looks like
'a
is Rust's way of manually annotating lifetimes. You don't need to worry about it for this example, since we'll focus on a case where the Rust compiler can give you lifetime warnings without any help from the developer. Anytime you see 'a
, feel free to ignore it.
I want to know more about Rust's lifetime annotation syntax
As we discussed before, sometimes reasoning about lifetimes can be complex, and Rust provides a method for the developer to manually specify lifetimes when they are too complicated for the compiler to infer.
The lifetime annotations on the struct are quite simple: first we introduce a lifetime parameter called 'a
("alpha"). This is what struct List<'a>
does. Then, we specify that the reference inside the option needs to have a lifetime of at least 'a
(this is what the syntax &'a List
means). Finally, we declare that the next element in the list is parameterized by the same lifetime 'a
(this is what List<'a>
means).
Now, the Rust compiler can make sure we don't have any lifetime violations with the List
struct. This happens at compile-time, without running the program. What kind of lifetime violations are we worried about? Any time a struct contains a reference, we have to make sure that the struct doesn't outlive the reference, or else we'll have a dangling reference.
To help us, when we create a new List
node, the Rust compiler will examine the lifetime of the reference we use inside and ensure that the List
doesn't outlive that reference!
Task 2.1: Before running any of the code, discuss the following questions and record you answer:
- What is wrong with
lifetimes.c
? What do you expect the output to be?
Now, compile and runlifetimes.c
and lifetimes.rs
.
cd 2-lifetimes
make c
./lifetimes_c
make rust
Task 2.2: Discuss the following questions and record your answer:
- Does the output of the C program match what you expect? If not, how can you explain the difference?
- What is the output of the Rust compiler? How is the Rust compiler able to give the warning it does?
⚠️ Stop here until your section TAs have summarized the lifetimes part! ⚠️
Part 2: Safer Concurrency
Having seen how systems languages can address basic memory bugs by enforcing ownership and lifetimes, we are now ready to look at how language developers try to make concurrency safer!
Exercise: Locking Discipline
In this exercise, you will look at concurrency.cpp
and concurrency.rs
. Similar to before, these examples do the exact same thing, but you might find that the Rust code has some unfamiliar terms. We highly recommend you use your knowledge about C++ to make sense of the Rust code, as well as read all the comments that we've annotated the code with!
First, look at concurrency.cpp
. This program is longer than the programs you've seen before, but it is not complex: it is a money transfer application! Here is a quick overview of the functions and structs you are given:
account_t
: This struct contains two fields, a balance
(meant to represent how much money the account contains), and a balance_mtx
which is the mutex that protects the balance.
main()
: Initializes a single account with a starting balance and a vector of integers that represent deposits/withdrawals that we would like to make to the balance. It initializes a pool of worker threads to handle these requests and then prints out the final balance.
update_account()
: Function passed to threads; keeps updating the balance with the next request from the shared vector until there are no requests left.
make_large_vector()
: Returns a long vector of requests.
Tip: We highly recommend also reading over the comments in the code to make sure you follow the structure of the program and the purpose of all the variables/functions. This will make the following tasks much easier.
Task 3.1: Before running the program, identify what the error is in the code. Where does it originate, and what do you predict the output to be? Discuss these questions and record your answers.
Compile and run concurrency.cpp
as many times as you need to confirm your hypothesis.
cd 3-concurrency
make cpp
./concurrency_cpp
You might find the following command useful if you are having a hard time finding an example of the error:
while true; do ./concurrency_cpp; done | uniq
Now, take a look at the Rust program (we recommend having it up alongside the C++ version). The overall structure of the code is the same, however there are a couple of notable types and functions which are worth mentioning in order to make more sense of this program.
The Arc
type
Think back to the first program you analyzed (move.c/move.rs
), where a seemingly-valid reference was left pointing to a block of memory that had already been freed. You discovered that Rust prevents this by only allowing one "owner" for a given block of memory. When the "owner" goes out of scope or is dropped, the compiler knows that it can free that memory.
This gets complicated when we have multiple threads trying to access shared data. In these cases, it is unclear at compile-time when the data can be deallocated and who is responsible for doing so, because the data's lifetime must persist until every shared owner is done with it. In these cases, we wrap the data with an Arc
, which stands for Atomic Reference Counter. References to Arc
types are given out using a special method, Arc::clone()
and are dynamically counted at runtime. (Counting is done atomically to ensure thread-safety.) This guarantees that the memory is freed only when there are zero references—and thus no owner—left in scope.
Arc<Mutex<T>>
(where T
is some arbitrary type) is a common pattern in Rust code, because most mutex-protected data needs to be shared across threads.
unwrap()
You may have noticed that a method called unwrap()
appears in multiple—seemingly unrelated—places in the program.
Certain functions which have the possibility of failing will return either the computed value upon success, or a special error type upon failure. This has the advantage of letting the caller decide how to handle the return value in each case. Calling .unwrap()
on the output of one of these functions handles success and failure in the following manner:
- If the function ran successfully and returned the computed value,
.unwrap()
will just return that value.
- If the function errored and returned an error type,
.unwrap()
will panic and halt the program.
Atomics and Ordering
At a high level, the Ordering
parameter tells the compiler what order the CPU is allowed perform atomic operations with respect to each other. For this example, the ordering parameter does not matter and you may ignore it. If you'd like to learn more, however, keep reading!
You may notice that when invoking the methods load
and store
on an AtomicBool
, we also pass a parameter called Ordering::Relaxed
into the method.
Normally, a compiler can optimize away certain operations or execute certain instructions in different sequences from the code that the programmer wrote in order to improve performance. While atomic types ensure that any operation on them happens as an unbroken sequence, they do not offer any special guarantees on the order in which they are executed in relation to other atomic instructions.
Sometimes however, we need guarantees on when threads can read and write atomic values. For example, we might want to make sure that a thread does not read an atomic value before another thread has written to it. While we are guaranteed that the individual read and write instructions happen atomically, we may not be guaranteed that one happens before another. The Ordering
enum tells the compiler how strict it should be with following the sequence of instructions that are specified in the original source code. This practice was actually inherited from C, which established the various different kinds of ordering that are currently used in Rust's memory model.
Task 3.2: Compile the Rust program (make rust
) and observe the compilation error. What is the error you get? How do mutexes differ between Rust and C++, and why is Rust's model safer? Discuss these questions and record your answers.
Now, you will fix both programs, starting with the C++ program. If you find yourself stuck on fixing the Rust example, we recommend looking at the code for hints ;). We have also included a description of some relevant methods and types.
Task 3.3: Fix the error in both programs! (Start with C++.) Once you have done so, check your program's correctness with:
make check
Hints for Rust
Mutex<T>
has methods lock()
and unlock()
, both of which can return either a MutexGuard
on success or an error on failure. This means that you'll want to use unwrap()
to access the mutex guard.
Dereferencing a MutexGuard
will give access to the inner data.
⚠️ Stop here until your section TAs have summarized the concurrency part! ⚠️
This is where the section materials end. You might discuss with your TA, e.g., whether you think Rust helps programmers avoid common errors, and what the trade-offs involved might be (i.e., why might some people still want to program in C/C++ instead?).
Congratulations, you have completed the Rust section!
There is obviously much more to learn about Rust and how it changes systems programming. If you have questions, talk to the course staff and take upper-level courses that let you use Rust for projects: in our experience, nothing makes you learn Rust better than having to write a substantial program in it on your own.
Additional Resources on Rust
If you want to learn more about safer systems programming and Rust, check out some of these resources!
Acknowledgements: This section (originally a lab) was developed for CS 300 by Sreshtaa Rajesh, Paul Biberstein, Livia Zhu, Shihang (Vic) Li, Casey Nelson, and Malte Schwarzkopf.
« Back to the main CS 300 website
Section 9: Safer Systems Programming (featuring Rust)
⚠️ This section is meant to be done in-person with other students. ⚠️
Introduction
By this point in the semester, you've done a significant amount of programming in C/C++ and are likely very familiar with our good friend, GDB. We've debugged issues like unknown segfaults and buffer overwrites and will soon tackle race conditions and deadlocks! A common link between all these errors is that they are a consequence of giving the programmer unfettered access to the underlying memory representation of data, and putting the responsibility on the programmer to manage it properly. This can lead to serious vulnerabilities in widely used software.
In this lab, you will explore some new ways in which programming language designers have tried to make low-level systems programming easier and less error-prone, while maintaining its main advantages: a high degree of control over the memory representation of data and the opportunity for low-level performance optimization.
In particular, we will explore concepts in Rust, a relatively new systems programming language. Rust 1.0 released in 2015, and its language designers had the benefit of 40 years of hindsight compared to C, no need to maintain backwards compatibility, and the opportunity to enshrine in the language some of the ideas that systems programmers had discovered to be useful in the past few decades.
An important disclaimer: We will use Rust to exemplify key concepts in enforcing memory-safety in a systems language, but this lab is not meant to be a Rust tutorial. Some of the ideas you'll encounter in this lab also exist in modern C++ and in other systems languages, such as Go.
If you would like to learn more about Rust, we have linked some resources in the Appendix section at the bottom of this handout!
Assignment
Assignment Installation
First, ensure that your repository has a
handout
remote. Type:If this reports an error, run:
Then run:
This will merge our Section 7 stencil code with your previous work. If you have any conflicts from previous labs, resolve them before continuing further. Run
git push
to save your work back to your personal repository.Part 1: Basic Memory Safety
In these exercises, you will walk through two programs that address classic errors in memory management, for which we have provided the code in C as well as the analogous code in Rust. You might find some of the Rust syntax new, so it will be helpful to compare the two programs side-by-side so that your knowledge of C code can help you make sense of the Rust code! The Rust and C programs do the exact same thing.
Overview of Lifetimes
As you know well by now, any memory that is allocated during a program must at some point be freed. Think back all the way to our discussions about the different segments of memory (data, stack, heap). One of the distinguishing factors between these segments is how long the memory allocated to those regions persists.
free()
Looking at the second column of that table lets us introduce the concept of lifetimes. The lifetime of a piece of memory is the span of time after it is initialized but before it is freed. For instance:
Task 1.1: Among these three types of memory, it is most difficult to determine the lifetime of heap data. Discuss with your group why this is.
Why do we care about the lifetimes of data? As it turns out, most of the memory bugs we worry about in C and C++ can be framed in terms of lifetime violations! You'll explore a couple of such examples in Part 1 of this section. Before doing so, however, we will introduce one more foundational concept that systems programmers use to reason about memory safety: Ownership.
Ownership
One of the foundational concepts of memory safety in Rust is the notion of ownership of memory. The owner of a piece of memory is responsible for making sure that it gets freed when the memory is no longer needed. Each value is owned by a variable, and can only have one such owner at a time. The variable that owns a value determines its lifetime.
Typical memory errors in low-level languages like C/C++ are often caused by mistaken notions of ownership. Consider the situation where there is no owner, or multiple pieces of code that believe they are the owners:
Over time, systems programmers realized that careful reasoning about ownership is a good way to avoid memory safety issues like the above. Indeed, Google's most recent C++ style guide says:
The C++ language consequently acquired optional notions of ownership, the most common of which are either careful documentation of ownership via comments on APIs (e.g., "the
consume
function takes ownership of the memory at thedata
pointer") or the use of smart pointers, which are special classes (e.g.,unique_ptr<int>
andshared_ptr<int>
) that replace "raw" pointers (such asint*
) and encode notions of ownership.In the following exercise, you will explore Rust's notion of ownership, and specifically, how changes of ownership ("moving") work.
Exercise: What is "Moving"?
For the first set of tasks, you will look at
move.c
. Make sure to read the comments in the code so that you are familiar with what the program is supposed to do.Task 1.2: Before running any of the code, answer the following questions and record your answers:
Now, from within the
1-move
directory, compile and run the C program using the commands shown below. The first command will compile the executablemove_c
, and the second command will run it.$ make c $ ./move_c
Now, answer the following question:
Memory management errors that compile and run without visible errors lead to undefined behavior being introduced into programs. Our example here was small, but if this code was part of a much bigger program where some other functionality depended on a correct output, then much more significant problems could arise.
Rust developers had the advantage of being able to learn from the mistakes made with C, and thus built the Rust compiler to prevent memory-management errors at compile-time.
Now, pull up
move.rs
and compile the program using the commandmake rust
.Task 1.3: What happens when you compile the Rust program? Look at the error message and discussion with your group how you think Rust can detect this!
We recommend looking at the "Helpful Background" dropdown for some additional context on this type of memory error.
Helpful Background
This is an example of aliasing. In the C program, we had a pointer to some allocated memory, and then manually set another pointer to point to the same memory. Having multiple references to the same data is dangerous precisely for the reasons you've just seen—if we free some memory while there's still another reference pointing to it, then we now have a seemingly valid reference that's actually pointing to invalid memory. In C, these types of errors can silently compile and run!
⚠️ Stop here until your section TAs have summarized the ownership part! ⚠️
Exercise: Enforcing Lifetimes
You have just explored an example of a use-after-free lifetime violation, where we tried to read from heap-allocated memory whose lifetime had ended upon a call to
free()
.Looking at the above table, we can spot another situation where lifetime violations can cause memory safety issues: returning references to values on the stack. Since values on the stack live for at most the duration of a function call, returning pointers to stack values (like local variables) can cause segfaults when trying to access that memory elsewhere in the program (it's possible you may have experienced this type of problem this semester!).
In this exercise, you'll explore how Rust's ownership guarantees can make reasoning about lifetimes at compile time (i.e., "statically") easier and even allow the compiler to warn you about lifetime errors at compile time!
Now, read through
lifetimes.c
until you feel like you have a good sense of what the program is doing. Then, read throughlifetimes.rs
. Don't be scared by the syntax,lifetimes.rs
implements the same functionality aslifetimes.c
. If you are struggling to understand what's happening in the Rust program, use the C program as reference. Below, we also explain some of the unfamiliar syntax that you might find in the Rust program.There are two constructs in the Rust code that may be unfamiliar. They are both in the
List
struct definition.Option
is a type that specifies that a value could be null. Since references in Rust can't beNULL
like they can in C, we useOption
instead. The RustNULL
equivalent isNone
, and its counterpart isSome(...)
. Therefore, if we have something of typeOption<List>
, it can either beNone
orSome(List {...})
.'a
is Rust's way of manually annotating lifetimes. You don't need to worry about it for this example, since we'll focus on a case where the Rust compiler can give you lifetime warnings without any help from the developer. Anytime you see'a
, feel free to ignore it.I want to know more about Rust's lifetime annotation syntax
As we discussed before, sometimes reasoning about lifetimes can be complex, and Rust provides a method for the developer to manually specify lifetimes when they are too complicated for the compiler to infer.
The lifetime annotations on the struct are quite simple: first we introduce a lifetime parameter called
'a
("alpha"). This is whatstruct List<'a>
does. Then, we specify that the reference inside the option needs to have a lifetime of at least'a
(this is what the syntax&'a List
means). Finally, we declare that the next element in the list is parameterized by the same lifetime'a
(this is whatList<'a>
means).Now, the Rust compiler can make sure we don't have any lifetime violations with the
List
struct. This happens at compile-time, without running the program. What kind of lifetime violations are we worried about? Any time a struct contains a reference, we have to make sure that the struct doesn't outlive the reference, or else we'll have a dangling reference.To help us, when we create a new
List
node, the Rust compiler will examine the lifetime of the reference we use inside and ensure that theList
doesn't outlive that reference!Task 2.1: Before running any of the code, discuss the following questions and record you answer:
lifetimes.c
? What do you expect the output to be?Now, compile and run
lifetimes.c
andlifetimes.rs
.cd 2-lifetimes # compile and run the c program make c ./lifetimes_c # compile the Rust program make rust
Task 2.2: Discuss the following questions and record your answer:
⚠️ Stop here until your section TAs have summarized the lifetimes part! ⚠️
Part 2: Safer Concurrency
Having seen how systems languages can address basic memory bugs by enforcing ownership and lifetimes, we are now ready to look at how language developers try to make concurrency safer!
Exercise: Locking Discipline
In this exercise, you will look at
concurrency.cpp
andconcurrency.rs
. Similar to before, these examples do the exact same thing, but you might find that the Rust code has some unfamiliar terms. We highly recommend you use your knowledge about C++ to make sense of the Rust code, as well as read all the comments that we've annotated the code with!First, look at
concurrency.cpp
. This program is longer than the programs you've seen before, but it is not complex: it is a money transfer application! Here is a quick overview of the functions and structs you are given:account_t
: This struct contains two fields, abalance
(meant to represent how much money the account contains), and abalance_mtx
which is the mutex that protects the balance.main()
: Initializes a single account with a starting balance and a vector of integers that represent deposits/withdrawals that we would like to make to the balance. It initializes a pool of worker threads to handle these requests and then prints out the final balance.update_account()
: Function passed to threads; keeps updating the balance with the next request from the shared vector until there are no requests left.make_large_vector()
: Returns a long vector of requests.Tip: We highly recommend also reading over the comments in the code to make sure you follow the structure of the program and the purpose of all the variables/functions. This will make the following tasks much easier.
Task 3.1: Before running the program, identify what the error is in the code. Where does it originate, and what do you predict the output to be? Discuss these questions and record your answers.
Compile and run
concurrency.cpp
as many times as you need to confirm your hypothesis.cd 3-concurrency # compile make cpp # run ./concurrency_cpp
You might find the following command useful if you are having a hard time finding an example of the error:
Now, take a look at the Rust program (we recommend having it up alongside the C++ version). The overall structure of the code is the same, however there are a couple of notable types and functions which are worth mentioning in order to make more sense of this program.
The
Arc
typeThink back to the first program you analyzed (
move.c/move.rs
), where a seemingly-valid reference was left pointing to a block of memory that had already been freed. You discovered that Rust prevents this by only allowing one "owner" for a given block of memory. When the "owner" goes out of scope or is dropped, the compiler knows that it can free that memory.This gets complicated when we have multiple threads trying to access shared data. In these cases, it is unclear at compile-time when the data can be deallocated and who is responsible for doing so, because the data's lifetime must persist until every shared owner is done with it. In these cases, we wrap the data with an
Arc
, which stands for Atomic Reference Counter. References toArc
types are given out using a special method,Arc::clone()
and are dynamically counted at runtime. (Counting is done atomically to ensure thread-safety.) This guarantees that the memory is freed only when there are zero references—and thus no owner—left in scope.Arc<Mutex<T>>
(whereT
is some arbitrary type) is a common pattern in Rust code, because most mutex-protected data needs to be shared across threads.unwrap()
You may have noticed that a method called
unwrap()
appears in multiple—seemingly unrelated—places in the program.Certain functions which have the possibility of failing will return either the computed value upon success, or a special error type upon failure. This has the advantage of letting the caller decide how to handle the return value in each case. Calling
.unwrap()
on the output of one of these functions handles success and failure in the following manner:.unwrap()
will just return that value..unwrap()
will panic and halt the program.Atomics and
Ordering
At a high level, the
Ordering
parameter tells the compiler what order the CPU is allowed perform atomic operations with respect to each other. For this example, the ordering parameter does not matter and you may ignore it. If you'd like to learn more, however, keep reading!You may notice that when invoking the methods
load
andstore
on anAtomicBool
, we also pass a parameter calledOrdering::Relaxed
into the method.Normally, a compiler can optimize away certain operations or execute certain instructions in different sequences from the code that the programmer wrote in order to improve performance. While atomic types ensure that any operation on them happens as an unbroken sequence, they do not offer any special guarantees on the order in which they are executed in relation to other atomic instructions.
Sometimes however, we need guarantees on when threads can read and write atomic values. For example, we might want to make sure that a thread does not read an atomic value before another thread has written to it. While we are guaranteed that the individual read and write instructions happen atomically, we may not be guaranteed that one happens before another. The
Ordering
enum tells the compiler how strict it should be with following the sequence of instructions that are specified in the original source code. This practice was actually inherited from C, which established the various different kinds of ordering that are currently used in Rust's memory model.Task 3.2: Compile the Rust program (
make rust
) and observe the compilation error. What is the error you get? How do mutexes differ between Rust and C++, and why is Rust's model safer? Discuss these questions and record your answers.Now, you will fix both programs, starting with the C++ program. If you find yourself stuck on fixing the Rust example, we recommend looking at the code for hints ;). We have also included a description of some relevant methods and types.
Task 3.3: Fix the error in both programs! (Start with C++.) Once you have done so, check your program's correctness with:
Hints for Rust
Mutex<T>
has methodslock()
andunlock()
, both of which can return either aMutexGuard
on success or an error on failure. This means that you'll want to useunwrap()
to access the mutex guard.Dereferencing a
MutexGuard
will give access to the inner data.⚠️ Stop here until your section TAs have summarized the concurrency part! ⚠️
This is where the section materials end. You might discuss with your TA, e.g., whether you think Rust helps programmers avoid common errors, and what the trade-offs involved might be (i.e., why might some people still want to program in C/C++ instead?).
Congratulations, you have completed the Rust section!
There is obviously much more to learn about Rust and how it changes systems programming. If you have questions, talk to the course staff and take upper-level courses that let you use Rust for projects: in our experience, nothing makes you learn Rust better than having to write a substantial program in it on your own.
Additional Resources on Rust
If you want to learn more about safer systems programming and Rust, check out some of these resources!
Acknowledgements: This section (originally a lab) was developed for CS 300 by Sreshtaa Rajesh, Paul Biberstein, Livia Zhu, Shihang (Vic) Li, Casey Nelson, and Malte Schwarzkopf.