« Back to the main CS 300 website
Lab 4: Getting Familiar with WeensyOS
Due Tuesday, April 2nd at 8:00 PM EST
Introduction
In this lab, we will introduce you to kernel programming in WeensyOS. We will the give you a high-level overview of virtual memory, and walk you through "memory iterators", which you'll use in Project 4. Your tasks will involve modifying the WeensyOS kernel to start some processes on bootup, making those processes allocate memory on the heap, and then making a copy of the kernel memory mappings. This stencil is very similar to the stencil you'll use for Project 4.
Assignment
You must run this assignment in your Docker container.
Assignment installation
First, ensure that your repository has a handout
remote. Type:
$ git remote show handout
If this reports an error, run:
$ git remote add handout https://github.com/csci0300/cs300-s24-labs.git
Then run:
$ git pull
$ git pull handout main
This will merge our Lab 4 stencil code with your previous work. If you have any "conflicts" from Lab 3, resolve them before continuing further. Run git push
to save your work back to your personal repository.
Starting up WeensyOS
WeensyOS can, in principle, run on any computer with an x86-64 CPU architecture (like yours, most likely). For this lab, however, you will run the OS in QEMU, a CPU emulator. QEMU makes it possible to easily restart and debug WeensyOS, and you will use it in Project 4 as well.
Task: Run WeensyOS! To do so, run:
make run
in the lab4
directory of your lab repository.
- OR
make run-console
if QEMU’s default display causes accessibility problems.
You should see a screen similar to the one below, which indicates that the WeensyOS kernel has booted on the computer simulated by QEMU.
Nothing much is happening here! To exit, press Ctrl-C.
Now, let's look at some code. Since WeensyOS is a complete operating system, the stencil code for this lab is quite large – but rest assured, you will only need to look at a few files for now.
Open kernel.cc
in your editor. This file contains the implementation of the WeensyOS kernel. Find the function void kernel(const char* command)
. This function is what runs once the bootloader hands over control to the WeensyOS kernel. At the bottom of the function, you will find the message you saw on the screen.
Optional: How does kernel()
get called?
If you're curious how the boot process works, take a look at boot.cc
and read the comment at the top of the file. You don't have to understand the details of the bootup process for either this lab or the WeensyOS project.
In a nutshell:
- When the computer turns on, it runs the code in
bootentry.S
, which eventually calls the void boot()
function in boot.cc
.
- The bootloader reads the kernel binary (in Executable and Linkable (ELF) format) from disk into memory.
- It then jumps to the
kernel_entry
label in the assembly code in k-exception.S
.
- At the end of that short bit of assembly code, the
jmp _Z6kernelPKc
instruction jumps to the location at which the compiler placed the kernel
function (_Z6kernelPKc
is the mangled symbol name for kernel
).
Starting Processes in WeensyOS
Now let's start some processes! Unlike the OS on your computer, the basic WeensyOS we're using for this lab does not allow the user to specify what program they would like to run. Instead, it will run a fixed set of programs at bootup.
Take a look at process_setup
in kernel.cc
:
void process_setup(pid_t pid, const char* program_name)
This function is responsible for setting up kernel state for a new userspace process: for example, it loads the program (via the program_loader
, an ELF binary loader we're providing you with). It then sets the instruction pointer and stack pointers, and finally marks the new process as runnable. Read through it now; you will need to understand and modify this function in Project 4.
Task: In the kernel
function in kernel.cc
, we'll call process_setup
to set up a new process:
- Remove the
console_printf
call in the kernel
function, and instead add a call to process_setup
with these parameters:
pid_t pid
: You will need to pass a process ID (PID). PIDs for user processes will start at 1
, as the kernel is PID 0
.
const char* program_name
: Use "allocator"
as your program_name. The WeensyOS programmer loader recognizes a specific set of program names. For example, when asked to load "allocator"
into memory, the program loader will load compiled source code based on "p-allocator.cc"
into a specific address of physical memory.
- Finally, you need to tell the kernel to actually run a process and jump to user mode. For that, call
run(&ptable[1])
, which starts PID 1.
Once you complete this step and run your WeensyOS, you should see a screen similar to the one below.
Whoa ! WeensyOS is displaying some helpful information to you here. Let's look at what we've got. The output visualizes the contents of the machine's physical memory (i.e., RAM).
- The addresses listed on the left are the physical memory addresses of the leftmost character on each line.
- Each character represents a 4 KB region of memory (a "page" of physical memory), and its color and letter indicate what the memory is used for.
The picture below explains the meaning of different characters. Your current WeensyOS will only have characters for the (red) process 1.
Looking more closely at the regions of memory that are in use here, the second picture explains what they're for. You'll see that each process has code and data pages, which the program loader puts in memory, and a stack (at the far end of its row), which process_setup
put in place.
Now that you understand the output, let's have some more processes.
Task: Create three more processes in the kernel
function. Use program_names "allocator2"
, "allocator3"
, and "allocator4"
when you start them. This makes the program loader load the same compiled source code, p-allocator.cc
, but into different regions of memory.
Note: you can leave the call to run(&ptable[1])
unmodified. It just asks the kernel to enter the kernel process scheduler and try to schedule process 1. If process 1 isn't runnable or yields back, the kernel will automatically pick another process to run.
Once you've completed this step, you should see the following output on the screen.
It looks like the processes are running, but they're not actively using any heap memory. The code for your processes is in p-allocator.cc
:
p-allocator.cc
This code runs in user space (i.e., outside the kernel). Right now, each process figures out where its heap and stack are, and then loops doing nothing. We can make the processes allocate some memory using the sys_page_alloc
system call.
The signature and specification of sys_page_alloc
is in u-lib.hh
. Go and read it.
int sys_page_alloc(void* addr)
The implementation simply invokes make_syscall
, which executes the inline assembly code in the same file. You don't need to understand this code in detail, but note that it invokes the syscall
x86-64 assembly instruction. This instruction transfers control into the kernel.
Optional: How does the WeensyOS kernel handle the system call?
The syscall
instruction raises a trap, which is ultimately handled by uintptr_t syscall(regstate* regs)
in kernel.cc
. The switch
statement in that function figures out what system call was made (you can see several others there!), and then calls syscall_page_alloc
.
syscall_page_alloc
is implemented in kernel.cc
, and it's different from the user-space function sys_page_alloc
(which we saw in u-lib.hh
). syscall_page_alloc
has a very basic implementation in this lab, which you'll modify in the project!
Task: Change p-allocator.cc
to loop allocating successive heap pages via sys_page_alloc
, as specified in process_main
.
Aside: Be sure to not delete the following three lines of code at the end of process_main
:
while(true) {
sys_yield();
}
These lines are critical since they ensure that process_main
returns and exits in a controlled manner.
"Stopping" a program is actually something that needs to be implemented, it doesn't happen automatically. This functionality is normally implemented via the exit
system call, which gets called implicitly whenmain()
returns in normal C programs and lets the kernel take control and clean up the process's resources.
But, in WeensyOS, exit
is not implemented, so when process_main
returns, the processor just executes the next location in memory as a machine instruction. Therefore, we need the continual calls to sys_yield
in order to allow the WeensyOS kernel to take over properly.
:::
If you run WeensyOS after completing this change, you should see your four processes rapidly fill up the memory from left to right, with the end result shown below.
If you instead see WeensyOS crash with a scary error similar to the picture below, think about when the loop allocating pages needs to stop, and what could go wrong if it does not!
Now, once you've got this right, let's figure out why some numbers are white. The processes asked for memory and the kernel allocated it, but they haven't actually done anything with that memory. The white numbers indicate that there is no data in the first four bytes of the page (they are all zeros, as the kernel zeros the page when allocating it).
Task: Make your p-allocator.cc
write the PID of the current process into the first four bytes of each page it allocates.
Hint:
Any physical page is directly accessible to userspace in this version of WeensyOS. Consider dereferencing a pointer to write to a page.
Now your output should be all colored:
So far, the console is only showing you the physical memory. In the lab version of WeensyOS – as well as in the initial stencil for Project 4 – there is no isolation between processes. In other words, process 1 could totally trample all over the memory of the other processes! In Project 4, you'll implement virtual memory to support better process isolation.
Virtual Memory
Virtual memory is the hardware mechanism that allows multiple processes on a computer to safely share physical memory. "Safely" here means that processes should not be able to write all over other processes' memory – and that includes the kernel's memory, as the kernel is also a process (with PID 0 in WeensyOS). You will have learned (or are soon about to learn) about virtual memory in lectures!
Need a refresher on virtual memory? Expand to read the details.
The kernel –- which needs complete privilege over all operations of the machine, and must be protected at all costs –- must run without inappropriate interference from user-level processes, which have no privilege and might have bugs.
The kernel's instructions and data are stored in the computer's physical memory. We need some way to partition the physical memory so that some parts of it are inaccessible to unprivileged processes. Different operating systems have different memory layouts, but for concreteness, let’s look at a specific layout based on the one from WeensyOS.
Here’s how “physical memory” might be laid out:
+----------------------------------------------------------------------------+
| | Kernel data | | Unprivileged data | |
+----------------------------------------------------------------------------+
^ ^
0x40000 0x100000
How can we protect the kernel's data from access by unprivileged processes? One natural solution is segmentation. That is, the kernel program can access any kind of address, whereas user programs can only access a specific range of address. If the user program violates this rule, then it needs to give control back to the kernel using fault (this is where segmentation fault comes from!).
However, the problem with segmentation is its inflexibility. For example, suppose that process P1 wants to allocate more heap memory from the operating system. If this is the current memory map:
+----------------------------------------------------------------------------+
|..........| K |...............| P1 |......................|
+----------------------------------------------------------------------------+
^ ^
0x40000 0x100000
where the dots are free, then it’s safe to allocate more memory for P1:
+----------------------------------------------------------------------------+
|..........| K |...............| P1 |.................|
+----------------------------------------------------------------------------+
^ ^
0x40000 0x100000
But what if another program, P2, was allocated close by?
+----------------------------------------------------------------------------+
|..........| K |...............| P1 |..| P2 |........|
+----------------------------------------------------------------------------+
^ ^
0x40000 0x100000
Of course, P2’s data should be inaccessible to P1, and vice versa. Unfortunately, if we want to give each process a contiguous block of memory, this means that now P1’s ability to grow is limited by the nearby placement of P2.
We therefore need a new layer of indirection, and this layer is called virtual memory. The purpose of the layer is to divorce the addresses visible to process instructions from the addresses in physical memory. We want a flexible mapping between the addresses visible to processes, which we call virtual addresses, and the addresses corresponding to physical memory chips. Since different processes have different permissions to access memory, they must have different virtual address spaces.
For example, here’s a view of memory with two processes P1 and P2 (where P1 contains some data "AB"):
0x90000
v
+--------------------------------------+
|........| P1 "AB" |..................|
+--------------------------------------+
\
\ 0x90000
\ v
\ +-----------------------+
\ |.......| P2 |..........|
\ +-----------------------+
\ /
\ /
\ /
\ /
+-------------------------------------------------------------------+
|....| K |..................| P1 "AB" |....| P2 |...............|
+-------------------------------------------------------------------+
^ ^
0x100000 0x140000
Note that the addresses in these different address spaces can differ from physical addresses, and in fact, the same virtual address, when accessed in different processes, can refer to different physical memory. Now, if P1 wants to borrow more memory for some new data "CD", it doesn’t matter that P2 is restricting its growth in contiguous physical memory. The kernel can allocate P1 new physical memory from anywhere, but make it appear contiguous to P1 by arranging mappings carefully:
0x90000 0x120000
v v
+--------------------------------------+
|........| P1 "AB" "CD" |.............|
+--------------------------------------+
\ /
\ / 0x90000
\ / v
\ / +-----------------------+
\/ |.......| P2 |..........|
/\ +-----------------------+
/ \ /
/ \ /
/ \ /
/ \ /
+-------------------------------------------------------------------+
|....| K |.......|"CD"|......| P1 "AB" |....| P2 |..............|
+-------------------------------------------------------------------+
^ ^ ^
0x70000 0x100000 0x140000
Let's enable displaying virtual memory on the console.
Task: In kernel.cc
, edit the memshow
function and change the value of show_virtual
to 1 to display the virtual address space. Then run your WeensyOS.
You'll now see a picture like the following (though your processes might allocate memory faster).
Our current WeensyOS has no virtual memory at all. The bottom part of the console output cycles through the virtual memory view for your four processes in turn. Evidently, all processes see identical views of memory, and that view is exactly what's in physical memory. Moreover, the characters in the virtual memory display are inverted, which indicates that the process is allowed to access the corresponding address. This implies that memory is not properly isolated: for example, all four processes can access the kernel memory (K), and process 1 can access process 2's data (2)!
In Project 4, you will add memory isolation via virtual memory. To do so, you will give processes their own page tables, which translate virtual addresses (which the user-space processes use) into physical addresses (in RAM).
Page Tables
We know how important and useful virtual memory is –- but how do we actually translate a given virtual address to the underlying physical address? This is where page tables come in. A page table is a tree-like data structure consisting of pages that store mappings between virtual addresses and physical addresses. When a process tries to access a virtual address, the CPU looks up the real, physical address in the page table and translates the address that actually gets accessed.
Many processors use multi-level page tables to perform translations. Specifically, x86-64 and WeensyOS have a four-level page table. Recall that each physical page is 212 (4096) bytes in size (= 4KB), and since each address is 64 bits (= 23 bytes), each page can contain 29 addresses.
In a four-level page table, the top-level (level 4) page table contains addresses of level 3 page tables. Each level 3 page table contains addresses of level 2 page tables, each level 2 page table contains addresses of level 1 page tables, and lastly, each level 1 page table contains actual physical addresses.
We won't go into the specifics of how exactly virtual address translation is performed here, and you do not need to know them for this lab and the project. But if you're curious, definitely refer to lecture materials, research on your own, or ask the TAs!
Memory Iterators
WeensyOS abstracts the details of x86-64 page tables away from you, and instead gives you high-level interfaces for dealing with virtual memory and processing page tables: vmiter
and ptiter
, which are C++ iterators that allow you to conveniently loop over virtual memory mappings and page tables. The vmiter
helps to iterate through pages in virtual memory, and lets you check the underlying, destination physical pages for properties like permissions and physical addresses. On the other hand, the ptiter
helps you iterate through page table pages (yes, page tables are themselves stored in pages – it's pages all the way down!).
We picked out some useful methods for vmiter
to walk through together below (to see the full declaration and implementation of the interfaces, see k-vmiter.hh
and k-vmiter.cc
). There's a couple of methods not mentioned here, but feel free to read about them on your own and use them wherever you see fit.
vmiter
class vmiter {
public:
inline vmiter(x86_64_pagetable* pt, uintptr_t va = 0);
inline vmiter(const proc* p, uintptr_t va = 0);
inline uintptr_t va() const;
inline uint64_t pa() const;
inline uint64_t perm() const;
inline bool present() const;
inline bool writable() const;
inline bool user() const;
inline vmiter& operator+=(intptr_t delta);
void map(uintptr_t pa, int perm);
int try_map(uintptr_t pa, int perm);
};
You can initialize a vmiter
object by passing in either a pointer to a process's kernel metadata (in which case it will figure out the page table for that process), or a pointer to an x86-64 page table. Here is an example that checks if the address 0x040000
is mapped in the kernel page tables (you can try this example somewhere in your kernel.cc
!):
x86_64_pagetable* pt = kernel_pagetable;
uintptr_t pa = vmiter(pt, (uintptr_t)0x040000).pa();
assert(pa != (uintptr_t)-1);
The vmiter(pt, (uintptr_t)0x040000)
part creates a vmiter
object that examines the virtual address 0x040000
in the page table pt
. The pa()
method on this object returns the corresponding physical address pa
.
The vmiter
can also find out the permission associated with the current page it is looking at. In x86-64.h
, the following permission flags (bits) are defined:
PTE_P
should be set if a page is present in a page table
PTE_W
should be set if a page is writable
PTE_U
should be set if a page is user-accessible
The vmiter
can then use the following functions to determine the permissions of the current page:
present()
returns if the current virtual address is present (i.e., if the PTE_P
flag is set)
writable()
returns if the current virtual address is present and writable (i.e., if the PTE_P | PTE_W
flag is set)
user()
returns if the current virtual address is present and user-accessible (i.e., if the PTE_P | PTE_U
flag is set)
For example, this snippet of code checks if the va
is present and writable (PTE_P | PTE_W)
in pt
:
if (vmiter(pt, va).writable()) {
...
}
You can also use vmiter
with a for
loop, calling methods that query its state and methods that change its current virtual address. For example, this loop prints all present mappings in the lower 64KiB of memory:
for (vmiter it(pt, 0); it.va() < 0x10000; it += PAGESIZE) {
if (it.present()) {
log_printf("%p maps to %p\n", it.va(), it.pa());
}
}
Note that this loop iterates one page at a time: the it += PAGESIZE
expression increases it.va()
by PAGESIZE
.
The map()
and try_map()
methods are used to add mappings to a page table. They also allocate actual page table pages on demand (e.g., the second and third level page tables mentioned above). The key difference between try_map
and map
is that try_map
returns -1 on failure, while map
panics on failure. In Project 4, it's up to you to decide which is appropriate to call! The following snippet maps virtual address 0x2000
to the physical address 0x3000
with all permissions in the page table pointed to by pt
:
int r = vmiter(pt, 0x2000).try_map(0x3000, PTE_P | PTE_W | PTE_U);
Task:
Now that you're more familiar with how vmiter
works, use vmiter
to implement a function that copies virtual memory mappings from a source page table to a destination page table.
Write this function inside kernel.cc
in your WeensyOS codebase, and use the global kernel_pagetable
as your source, and global kernel_pagetable_copy
as your destination. Call copy_mappings
somewhere in the kernel
function after the kernel_pagetable
is initialized.
Your copy_mappings
function must also print information about the mapped pages using log_printf
, as shown below. This gets you familiar with log_printf
, which you may use for debugging in WeensyOS (it writes to a file called log.txt
in your lab4
directory), and we will use the output to check off your lab.
void copy_mappings(x86_64_pagetable* dst, x86_64_pagetable* src) {
}
Hint:
In order to access the specific bits of the permissions, recall the PTE_P, PTE_W, and PTE_U flags. Refer to the section "Hint: More info on bitwise operators" in the Project 1: Snake handout for a reminder on bitwise operators.
When is copying mappings from one page table to another useful?
When you start a new process with separate page tables for memory isolation (you'll do this in Project 4!), you will want to give it a copy of the kernel's or parent process's page table by copying that page table into a new one. This is also important when you implement the fork
system call in Project 4 (however, the copying process will be more involved than the above function)!
Handin instructions
Turn in your code by pushing your git repository to github.com/csci0300/cs300-s24-labs-YOURUSERNAME.git
.
Then, head to the grading server. On the "Labs" page, use the "Lab 4 checkoff" button to check off your lab.
Note that the checkoff for this lab runs for almost a minute without producing output, as it runs your WeensyOS in the background. Don't be alarmed if the checkoff script seems "stuck" – it will complete after about a minute and show you the results.
Acknowledgements: WeensyOS was originally developed for Harvard's CS 61 course. This lab was developed for CS 300.
« Back to the main CS 300 website
Lab 4: Getting Familiar with WeensyOS
Due Tuesday, April 2nd at 8:00 PM EST
Introduction
In this lab, we will introduce you to kernel programming in WeensyOS. We will the give you a high-level overview of virtual memory, and walk you through "memory iterators", which you'll use in Project 4. Your tasks will involve modifying the WeensyOS kernel to start some processes on bootup, making those processes allocate memory on the heap, and then making a copy of the kernel memory mappings. This stencil is very similar to the stencil you'll use for Project 4.
Assignment
You must run this assignment in your Docker container.
Assignment installation
First, ensure that your repository has a
handout
remote. Type:If this reports an error, run:
Then run:
This will merge our Lab 4 stencil code with your previous work. If you have any "conflicts" from Lab 3, resolve them before continuing further. Run
git push
to save your work back to your personal repository.Starting up WeensyOS
WeensyOS can, in principle, run on any computer with an x86-64 CPU architecture (like yours, most likely). For this lab, however, you will run the OS in QEMU, a CPU emulator. QEMU makes it possible to easily restart and debug WeensyOS, and you will use it in Project 4 as well.
Task: Run WeensyOS! To do so, run:
make run
in thelab4
directory of your lab repository.make run-console
if QEMU’s default display causes accessibility problems.You should see a screen similar to the one below, which indicates that the WeensyOS kernel has booted on the computer simulated by QEMU.
Nothing much is happening here! To exit, press Ctrl-C.
Now, let's look at some code. Since WeensyOS is a complete operating system, the stencil code for this lab is quite large – but rest assured, you will only need to look at a few files for now.
Open
kernel.cc
in your editor. This file contains the implementation of the WeensyOS kernel. Find the functionvoid kernel(const char* command)
. This function is what runs once the bootloader hands over control to the WeensyOS kernel. At the bottom of the function, you will find the message you saw on the screen.Optional: How does
kernel()
get called?Starting Processes in WeensyOS
Now let's start some processes! Unlike the OS on your computer, the basic WeensyOS we're using for this lab does not allow the user to specify what program they would like to run. Instead, it will run a fixed set of programs at bootup.
Take a look at
process_setup
inkernel.cc
:Task: In the
kernel
function inkernel.cc
, we'll callprocess_setup
to set up a new process:console_printf
call in thekernel
function, and instead add a call toprocess_setup
with these parameters:pid_t pid
: You will need to pass a process ID (PID). PIDs for user processes will start at1
, as the kernel is PID0
.const char* program_name
: Use"allocator"
as your program_name. The WeensyOS programmer loader recognizes a specific set of program names. For example, when asked to load"allocator"
into memory, the program loader will load compiled source code based on"p-allocator.cc"
into a specific address of physical memory.run(&ptable[1])
, which starts PID 1.Once you complete this step and run your WeensyOS, you should see a screen similar to the one below.
Whoa ! WeensyOS is displaying some helpful information to you here. Let's look at what we've got. The output visualizes the contents of the machine's physical memory (i.e., RAM).
The picture below explains the meaning of different characters. Your current WeensyOS will only have characters for the (red) process 1.
Looking more closely at the regions of memory that are in use here, the second picture explains what they're for. You'll see that each process has code and data pages, which the program loader puts in memory, and a stack (at the far end of its row), which
process_setup
put in place.Now that you understand the output, let's have some more processes.
Task: Create three more processes in the
kernel
function. Use program_names"allocator2"
,"allocator3"
, and"allocator4"
when you start them. This makes the program loader load the same compiled source code,p-allocator.cc
, but into different regions of memory.Note: you can leave the call to
run(&ptable[1])
unmodified. It just asks the kernel to enter the kernel process scheduler and try to schedule process 1. If process 1 isn't runnable or yields back, the kernel will automatically pick another process to run.Once you've completed this step, you should see the following output on the screen.
It looks like the processes are running, but they're not actively using any heap memory. The code for your processes is in
p-allocator.cc
:The signature and specification of
sys_page_alloc
is inu-lib.hh
. Go and read it.Optional: How does the WeensyOS kernel handle the system call?
Task: Change
p-allocator.cc
to loop allocating successive heap pages viasys_page_alloc
, as specified inprocess_main
.Aside: Be sure to not delete the following three lines of code at the end of
process_main
:These lines are critical since they ensure that
process_main
returns and exits in a controlled manner."Stopping" a program is actually something that needs to be implemented, it doesn't happen automatically. This functionality is normally implemented via the
exit
system call, which gets called implicitly whenmain()
returns in normal C programs and lets the kernel take control and clean up the process's resources.But, in WeensyOS,
exit
is not implemented, so whenprocess_main
returns, the processor just executes the next location in memory as a machine instruction. Therefore, we need the continual calls tosys_yield
in order to allow the WeensyOS kernel to take over properly.:::
If you run WeensyOS after completing this change, you should see your four processes rapidly fill up the memory from left to right, with the end result shown below.
If you instead see WeensyOS crash with a scary error similar to the picture below, think about when the loop allocating pages needs to stop, and what could go wrong if it does not!
Now, once you've got this right, let's figure out why some numbers are white. The processes asked for memory and the kernel allocated it, but they haven't actually done anything with that memory. The white numbers indicate that there is no data in the first four bytes of the page (they are all zeros, as the kernel zeros the page when allocating it).
Task: Make your
p-allocator.cc
write the PID of the current process into the first four bytes of each page it allocates.Hint:
Any physical page is directly accessible to userspace in this version of WeensyOS. Consider dereferencing a pointer to write to a page.
Now your output should be all colored:
So far, the console is only showing you the physical memory. In the lab version of WeensyOS – as well as in the initial stencil for Project 4 – there is no isolation between processes. In other words, process 1 could totally trample all over the memory of the other processes! In Project 4, you'll implement virtual memory to support better process isolation.
Virtual Memory
Virtual memory is the hardware mechanism that allows multiple processes on a computer to safely share physical memory. "Safely" here means that processes should not be able to write all over other processes' memory – and that includes the kernel's memory, as the kernel is also a process (with PID 0 in WeensyOS). You will have learned (or are soon about to learn) about virtual memory in lectures!
Need a refresher on virtual memory? Expand to read the details.
The kernel –- which needs complete privilege over all operations of the machine, and must be protected at all costs –- must run without inappropriate interference from user-level processes, which have no privilege and might have bugs.
The kernel's instructions and data are stored in the computer's physical memory. We need some way to partition the physical memory so that some parts of it are inaccessible to unprivileged processes. Different operating systems have different memory layouts, but for concreteness, let’s look at a specific layout based on the one from WeensyOS.
Here’s how “physical memory” might be laid out:
How can we protect the kernel's data from access by unprivileged processes? One natural solution is segmentation. That is, the kernel program can access any kind of address, whereas user programs can only access a specific range of address. If the user program violates this rule, then it needs to give control back to the kernel using fault (this is where segmentation fault comes from!).
However, the problem with segmentation is its inflexibility. For example, suppose that process P1 wants to allocate more heap memory from the operating system. If this is the current memory map:
where the dots are free, then it’s safe to allocate more memory for P1:
But what if another program, P2, was allocated close by?
Of course, P2’s data should be inaccessible to P1, and vice versa. Unfortunately, if we want to give each process a contiguous block of memory, this means that now P1’s ability to grow is limited by the nearby placement of P2.
We therefore need a new layer of indirection, and this layer is called virtual memory. The purpose of the layer is to divorce the addresses visible to process instructions from the addresses in physical memory. We want a flexible mapping between the addresses visible to processes, which we call virtual addresses, and the addresses corresponding to physical memory chips. Since different processes have different permissions to access memory, they must have different virtual address spaces.
For example, here’s a view of memory with two processes P1 and P2 (where P1 contains some data "AB"):
Note that the addresses in these different address spaces can differ from physical addresses, and in fact, the same virtual address, when accessed in different processes, can refer to different physical memory. Now, if P1 wants to borrow more memory for some new data "CD", it doesn’t matter that P2 is restricting its growth in contiguous physical memory. The kernel can allocate P1 new physical memory from anywhere, but make it appear contiguous to P1 by arranging mappings carefully:
Let's enable displaying virtual memory on the console.
Task: In
kernel.cc
, edit thememshow
function and change the value ofshow_virtual
to 1 to display the virtual address space. Then run your WeensyOS.You'll now see a picture like the following (though your processes might allocate memory faster).
Our current WeensyOS has no virtual memory at all. The bottom part of the console output cycles through the virtual memory view for your four processes in turn. Evidently, all processes see identical views of memory, and that view is exactly what's in physical memory. Moreover, the characters in the virtual memory display are inverted, which indicates that the process is allowed to access the corresponding address. This implies that memory is not properly isolated: for example, all four processes can access the kernel memory (K), and process 1 can access process 2's data (2)!
In Project 4, you will add memory isolation via virtual memory. To do so, you will give processes their own page tables, which translate virtual addresses (which the user-space processes use) into physical addresses (in RAM).
Page Tables
We know how important and useful virtual memory is –- but how do we actually translate a given virtual address to the underlying physical address? This is where page tables come in. A page table is a tree-like data structure consisting of pages that store mappings between virtual addresses and physical addresses. When a process tries to access a virtual address, the CPU looks up the real, physical address in the page table and translates the address that actually gets accessed.
Many processors use multi-level page tables to perform translations. Specifically, x86-64 and WeensyOS have a four-level page table. Recall that each physical page is 212 (4096) bytes in size (= 4KB), and since each address is 64 bits (= 23 bytes), each page can contain 29 addresses.
In a four-level page table, the top-level (level 4) page table contains addresses of level 3 page tables. Each level 3 page table contains addresses of level 2 page tables, each level 2 page table contains addresses of level 1 page tables, and lastly, each level 1 page table contains actual physical addresses.
We won't go into the specifics of how exactly virtual address translation is performed here, and you do not need to know them for this lab and the project. But if you're curious, definitely refer to lecture materials, research on your own, or ask the TAs!
Memory Iterators
WeensyOS abstracts the details of x86-64 page tables away from you, and instead gives you high-level interfaces for dealing with virtual memory and processing page tables:
vmiter
andptiter
, which are C++ iterators that allow you to conveniently loop over virtual memory mappings and page tables. Thevmiter
helps to iterate through pages in virtual memory, and lets you check the underlying, destination physical pages for properties like permissions and physical addresses. On the other hand, theptiter
helps you iterate through page table pages (yes, page tables are themselves stored in pages – it's pages all the way down!).We picked out some useful methods for
vmiter
to walk through together below (to see the full declaration and implementation of the interfaces, seek-vmiter.hh
andk-vmiter.cc
). There's a couple of methods not mentioned here, but feel free to read about them on your own and use them wherever you see fit.vmiter
You can initialize a
vmiter
object by passing in either a pointer to a process's kernel metadata (in which case it will figure out the page table for that process), or a pointer to an x86-64 page table. Here is an example that checks if the address0x040000
is mapped in the kernel page tables (you can try this example somewhere in yourkernel.cc
!):The
vmiter(pt, (uintptr_t)0x040000)
part creates avmiter
object that examines the virtual address0x040000
in the page tablept
. Thepa()
method on this object returns the corresponding physical addresspa
.The
vmiter
can also find out the permission associated with the current page it is looking at. Inx86-64.h
, the following permission flags (bits) are defined:PTE_P
should be set if a page is present in a page tablePTE_W
should be set if a page is writablePTE_U
should be set if a page is user-accessibleThe
vmiter
can then use the following functions to determine the permissions of the current page:present()
returns if the current virtual address is present (i.e., if thePTE_P
flag is set)writable()
returns if the current virtual address is present and writable (i.e., if thePTE_P | PTE_W
flag is set)user()
returns if the current virtual address is present and user-accessible (i.e., if thePTE_P | PTE_U
flag is set)For example, this snippet of code checks if the
va
is present and writable(PTE_P | PTE_W)
inpt
:You can also use
vmiter
with afor
loop, calling methods that query its state and methods that change its current virtual address. For example, this loop prints all present mappings in the lower 64KiB of memory:Note that this loop iterates one page at a time: the
it += PAGESIZE
expression increasesit.va()
byPAGESIZE
.The
map()
andtry_map()
methods are used to add mappings to a page table. They also allocate actual page table pages on demand (e.g., the second and third level page tables mentioned above). The key difference betweentry_map
andmap
is thattry_map
returns -1 on failure, whilemap
panics on failure. In Project 4, it's up to you to decide which is appropriate to call! The following snippet maps virtual address0x2000
to the physical address0x3000
with all permissions in the page table pointed to bypt
:Task:
Now that you're more familiar with how
vmiter
works, usevmiter
to implement a function that copies virtual memory mappings from a source page table to a destination page table.Write this function inside
kernel.cc
in your WeensyOS codebase, and use the globalkernel_pagetable
as your source, and globalkernel_pagetable_copy
as your destination. Callcopy_mappings
somewhere in thekernel
function after thekernel_pagetable
is initialized.Your
copy_mappings
function must also print information about the mapped pages usinglog_printf
, as shown below. This gets you familiar withlog_printf
, which you may use for debugging in WeensyOS (it writes to a file calledlog.txt
in yourlab4
directory), and we will use the output to check off your lab.Hint:
When is copying mappings from one page table to another useful?
Handin instructions
Turn in your code by pushing your git repository to
github.com/csci0300/cs300-s24-labs-YOURUSERNAME.git
.Then, head to the grading server. On the "Labs" page, use the "Lab 4 checkoff" button to check off your lab.
Note that the checkoff for this lab runs for almost a minute without producing output, as it runs your WeensyOS in the background. Don't be alarmed if the checkoff script seems "stuck" – it will complete after about a minute and show you the results.
Acknowledgements: WeensyOS was originally developed for Harvard's CS 61 course. This lab was developed for CS 300.