4 views
<style> details { margin-top: 0.5em; margin-bottom: 0.5em; } summary { //font-weight: bolder; } summary:hover { text-decoration: underline; } blockquote { font-size: 16px; } .todo { color: #ff00ff; border: 2px dashed #ff00ff; padding: 0em 1em; border-radius: 5px; margin-top: 1em; margin-bottom: 1em; //display: none; // UNCOMMENT TO HIDE TODOs } </style> **[&laquo; Back to the main CS 300 website](https://csci0300.github.io/)** # Lab 4: Getting Familiar with WeensyOS ### Due Tuesday, April 1 at 8:00 PM EST --- # Introduction In this lab, we will introduce you to kernel programming in WeensyOS, which is a small, but fully-functional teaching operating system, that you'll work with in our next project. We will the give you a high-level overview of virtual memory, and walk you through "memory iterators", which you'll use in Project 4. Your tasks will involve modifying the WeensyOS kernel to start some processes on bootup, making those processes allocate memory on the heap, and then making a copy of the kernel memory mappings. This stencil is very similar to the stencil you'll use for Project 4. --- # Assignment ## Assignment Installation First, open a terminal in your container environment and `cd` into your `labs` folder, then ensure that your repository has a `handout` remote. To do this, type: ```bash $ git remote show handout ``` If this reports an error, run: ```bash $ git remote add handout https://github.com/csci0300/cs300-s25-labs.git ``` Then run: ```bash $ git pull $ git pull handout main ``` This will merge our Lab 3 stencil code with your previous work. :::warning **You may get a warning about "divergent branches"** like shown below: ![](https://i.imgur.com/BeYsMKC.png) This means you need to set your Git configuration to merge our code by default. To do so, run (within the same directory in your labs repository workspace): ```console $ git config pull.rebase false $ git pull handout main ``` ::: If you have any "conflicts" from Lab 3 (although this is unlikely), resolve them before continuing further. Run `git push` to save your work back to your personal repository. ## Starting up WeensyOS WeensyOS can, in principle, run on any computer with an x86-64 CPU architecture (like yours, most likely). For this lab, however, you will run the OS in QEMU, a CPU emulator. QEMU makes it possible to easily restart and debug WeensyOS, and you will use it in Project 4 as well. :::success **Task:** Run WeensyOS! To do so, run: * `make run` in the `lab4` directory of your lab repository. * OR `make run-console` if QEMU’s default display causes accessibility problems. ::: You should see a screen similar to the one below, which indicates that the WeensyOS kernel has booted on the computer simulated by QEMU. ![](https://i.imgur.com/hdnEznM.png) Nothing much is happening here! To exit, press Ctrl-C. Now, let's look at some code. Since WeensyOS is a complete operating system, the stencil code for this lab is quite large -- but rest assured, you will only need to look at a few files for now. Open `kernel.cc` in your editor. This file contains the implementation of the WeensyOS kernel. Find the function `void kernel(const char* command)`. This function is what runs once the bootloader hands over control to the WeensyOS kernel. At the bottom of the function, you will find the message you saw on the screen. <details> <summary><i><span style="color:#0000EE">Optional: How does <code>kernel()</code> get called?</span></i> </summary> <br /> >If you're curious how the boot process works, take a look at <code>boot.cc</code> and read the comment at the top of the file. You don't have to understand the details of the bootup process for either this lab or the WeensyOS project. > >In a nutshell: >1. When the computer turns on, it runs the code in <code>bootentry.S</code>, which eventually calls the <code>void boot()</code> function in <code>boot.cc</code>. >2. The bootloader reads the kernel binary (in <i>Executable and Linkable</i> (ELF) format) from disk into memory. >3. It then jumps to the <code>kernel_entry</code> label in the assembly code in <code>k-exception.S</code>. >4. At the end of that short bit of assembly code, the <code>jmp _Z6kernelPKc</code> instruction jumps to the location at which the compiler placed the <code>kernel</code> function (<code>_Z6kernelPKc</code> is the <i>mangled</i> symbol name for <code>kernel</code>). </details> <br /> ## Starting Processes in WeensyOS Now let's start some processes! Unlike the OS on your computer, the basic WeensyOS we're using for this lab does not allow the user to specify what program they would like to run. Instead, it will run a fixed set of programs at bootup. Take a look at `process_setup` in `kernel.cc`: > #### **`void process_setup(pid_t pid, const char* program_name)`** > [color=#4358f2] This function is responsible for setting up kernel state for a new userspace process: for example, it loads the program (via the `program_loader`, an ELF binary loader we're providing you with). It then sets the instruction pointer and stack pointers, and finally marks the new process as runnable. **Read through it now; you will need to understand and modify this function in Project 4.** :::success **Task:** In the `kernel` function in `kernel.cc`, we'll call `process_setup` to set up a new process: 1. Remove the `console_printf` call in the `kernel` function, and instead add a call to `process_setup` with these parameters: * `pid_t pid`: You will need to pass a process ID (PID). PIDs for user processes will start at `1`, as the kernel is PID `0`. * `const char* program_name`: Use `"allocator"` as your program_name. The WeensyOS programmer loader recognizes a specific set of program names. For example, when asked to load `"allocator"` into memory, the program loader will load compiled source code based on `"p-allocator.cc"` into a specific address of physical memory. 2. Finally, you need to tell the kernel to actually run a process and jump to user mode. For that, call `run(&ptable[1])`, which starts PID 1. ::: Once you complete this step and run your WeensyOS, you should see a screen similar to the one below. ![](https://i.imgur.com/augIEV8.png) Whoa :blush:! WeensyOS is displaying some helpful information to you here. Let's look at what we've got. The output visualizes the contents of the machine's physical memory (i.e., RAM). * The addresses listed on the left are the physical memory addresses of the leftmost character on each line. * Each character represents a 4 KB region of memory (a "page" of physical memory), and its color and letter indicate what the memory is used for. The picture below explains the meaning of different characters. Your current WeensyOS will only have characters for the (red) process 1. ![](https://i.imgur.com/eSZro4H.gif) Looking more closely at the regions of memory that are in use here, the second picture explains what they're for. You'll see that each process has code and data pages, which the program loader puts in memory, and a stack (at the far end of its row), which `process_setup` put in place. ![](https://i.imgur.com/DDJGNYo.gif) Now that you understand the output, let's have some more processes. :::success **Task:** Create three more processes in the `kernel` function. Use program_names `"allocator2"`, `"allocator3"`, and `"allocator4"` when you start them. This makes the program loader load the same compiled source code, `p-allocator.cc`, but into different regions of memory. _Note: you can leave the call to `run(&ptable[1])` unmodified. It just asks the kernel to enter the kernel process scheduler and try to schedule process 1. If process 1 isn't runnable or yields back, the kernel will automatically pick another process to run._ ::: Once you've completed this step, you should see the following output on the screen. ![](https://i.imgur.com/64Vgxuz.png) It looks like the processes are running, but they're not actively using any heap memory. The code for your processes is in `p-allocator.cc`: > #### `p-allocator.cc` > This code runs in **user space** (i.e., outside the kernel). Right now, each process figures out where its heap and stack are, and then loops doing nothing. We can make the processes allocate some memory using the `sys_page_alloc` system call. The signature and specification of `sys_page_alloc` is in `u-lib.hh`. **Go and read it.** > #### `int sys_page_alloc(void* addr)` > The implementation simply invokes `make_syscall`, which executes the inline assembly code in the same file. You don't need to understand this code in detail, but note that it invokes the `syscall` x86-64 assembly instruction. This instruction transfers control into the kernel. <details> <summary><i><span style="color:#0000EE">Optional: How does the WeensyOS kernel handle the system call?</span></i></summary> <br /> >The <code>syscall</code> instruction raises a trap, which is ultimately handled by <code>uintptr_t syscall(regstate* regs)</code> in <code>kernel.cc</code>. The <code>switch</code> statement in that function figures out what system call was made (you can see several others there!), and then calls <code>syscall_page_alloc</code>. > ><code>syscall_page_alloc</code> is implemented in <code>kernel.cc</code>, and it's <b>different</b> from the user-space function <code>sys_page_alloc</code> (which we saw in `u-lib.hh`). `syscall_page_alloc` has a very basic implementation in this lab, which you'll modify in the project! </details> <br /> :::success **Task:** Change `p-allocator.cc` to loop allocating successive heap pages via `sys_page_alloc`, as specified in `process_main`. :::warning **Aside:** Be sure to not delete the following three lines of code at the end of `process_main`: ```c= while(true) { sys_yield(); } ``` These lines are critical since they ensure that `process_main` returns and exits in a controlled manner. "Stopping" a program is actually something that needs to be implemented, it doesn't happen automatically. This functionality is normally implemented via the `exit` system call, which gets called implicitly when`main()` returns in normal C programs and lets the kernel take control and clean up the process's resources. But, in WeensyOS, `exit` is not implemented, so when `process_main` returns, the processor just executes the next location in memory as a machine instruction. Therefore, we need the continual calls to `sys_yield` in order to allow the WeensyOS kernel to take over properly. ::: If you run WeensyOS after completing this change, you should see your four processes rapidly fill up the memory from left to right, with the end result being similar to what's shown below. ![](https://i.imgur.com/0ZWSIaQ.png) If you instead see WeensyOS crash with a scary error similar to the picture below, think about when the loop allocating pages needs to stop, and what could go wrong if it does not! ![](https://i.imgur.com/SUCaoMB.png) Now, once you've got this right, let's figure out why some numbers are white. The processes asked for memory and the kernel allocated it, but they haven't actually done anything with that memory. The white numbers indicate that there is no data in the first four bytes of the page (they are all zeros, as the kernel zeros the page when allocating it). :::success **Task:** Make your `p-allocator.cc` write the PID of the current process into the first four bytes of each page it allocates. <details><summary><i>Hint:</i></summary> _Any physical page is directly accessible to userspace in this version of WeensyOS. Consider dereferencing a pointer to write to a page._ </details> ::: Now your output should be all colored: ![](https://i.imgur.com/kaTYpY0.png) So far, the console is only showing you the physical memory. In the lab version of WeensyOS -- as well as in the initial stencil for Project 4 -- there is no isolation between processes. In other words, process 1 could totally trample all over the memory of the other processes! In Project 4, you'll implement virtual memory to support better process isolation. ## Virtual Memory Virtual memory is the hardware mechanism that allows multiple processes on a computer to safely share physical memory. "Safely" here means that processes should not be able to write all over other processes' memory -- and that includes the kernel's memory, as the kernel is also a process (with PID 0 in WeensyOS). You will have learned (or are soon about to learn) about virtual memory in lectures! <details> <summary><i>Need a refresher on virtual memory? Expand to read the details.</i></summary> > The kernel --- which needs complete privilege over all operations of the machine, and must be protected at all costs --- must run without inappropriate interference from user-level processes, which have no privilege and might have bugs. The kernel's instructions and data are stored in the computer's physical memory. We need some way to partition the physical memory so that some parts of it are inaccessible to unprivileged processes. Different operating systems have different memory layouts, but for concreteness, let’s look at a specific layout based on the one from WeensyOS. Here’s how “physical memory” might be laid out: ``` +----------------------------------------------------------------------------+ | | Kernel data | | Unprivileged data | | +----------------------------------------------------------------------------+ ^ ^ 0x40000 0x100000 ``` How can we protect the kernel's data from access by unprivileged processes? One natural solution is **segmentation**. That is, the kernel program can access any kind of address, whereas user programs can only access a specific range of address. If the user program violates this rule, then it needs to give control back to the kernel using fault (this is where *segmentation fault* comes from!). However, the problem with segmentation is its inflexibility. For example, suppose that process *P1* wants to allocate more heap memory from the operating system. If this is the current memory map: ``` +----------------------------------------------------------------------------+ |..........| K |...............| P1 |......................| +----------------------------------------------------------------------------+ ^ ^ 0x40000 0x100000 ``` where the dots are free, then it’s safe to allocate more memory for *P1*: ``` +----------------------------------------------------------------------------+ |..........| K |...............| P1 |.................| +----------------------------------------------------------------------------+ ^ ^ 0x40000 0x100000 ``` But what if another program, *P2*, was allocated close by? ``` +----------------------------------------------------------------------------+ |..........| K |...............| P1 |..| P2 |........| +----------------------------------------------------------------------------+ ^ ^ 0x40000 0x100000 ``` Of course, *P2*’s data should be inaccessible to *P1*, and vice versa. Unfortunately, if we want to give each process a contiguous block of memory, this means that now *P1*’s ability to grow is limited by the nearby placement of *P2*. We therefore need a new layer of indirection, and this layer is called **virtual memory**. The purpose of the layer is to divorce the addresses visible to process instructions from the addresses in physical memory. We want a flexible mapping between the addresses visible to processes, which we call virtual addresses, and the addresses corresponding to physical memory chips. Since different processes have different permissions to access memory, they must have different virtual address spaces. For example, here’s a view of memory with two processes *P1* and *P2* (where *P1* contains some data "AB"): ``` 0x90000 v +--------------------------------------+ |........| P1 "AB" |..................| +--------------------------------------+ \ \ 0x90000 \ v \ +-----------------------+ \ |.......| P2 |..........| \ +-----------------------+ \ / \ / \ / \ / +-------------------------------------------------------------------+ |....| K |..................| P1 "AB" |....| P2 |...............| +-------------------------------------------------------------------+ ^ ^ 0x100000 0x140000 ``` Note that the addresses in these different address spaces can differ from physical addresses, and in fact, the same virtual address, when accessed in different processes, can refer to different physical memory. Now, if *P1* wants to borrow more memory for some new data "CD", it doesn’t matter that *P2* is restricting its growth in contiguous physical memory. The kernel can allocate *P1* new physical memory from anywhere, but make it appear contiguous to *P1* by arranging mappings carefully: ``` 0x90000 0x120000 v v +--------------------------------------+ |........| P1 "AB" "CD" |.............| +--------------------------------------+ \ / \ / 0x90000 \ / v \ / +-----------------------+ \/ |.......| P2 |..........| /\ +-----------------------+ / \ / / \ / / \ / / \ / +-------------------------------------------------------------------+ |....| K |.......|"CD"|......| P1 "AB" |....| P2 |..............| +-------------------------------------------------------------------+ ^ ^ ^ 0x70000 0x100000 0x140000 ``` </details> <br /> Let's enable displaying virtual memory on the console. :::success **Task:** In `kernel.cc`, edit the `memshow` function and change the value of `show_virtual` to 1 to display the virtual address space. Then run your WeensyOS. ::: You'll now see a picture like the following (though your processes might allocate memory faster). ![](https://i.imgur.com/85jgtFb.gif) Our current WeensyOS has no virtual memory at all. The bottom part of the console output cycles through the virtual memory view for your four processes in turn. Evidently, all processes see identical views of memory, and that view is exactly what's in physical memory. Moreover, the characters in the virtual memory display are <span style="background-color:#cccccc">**inverted**</span>, which indicates that the process is allowed to access the corresponding address. This implies that memory is not properly isolated: for example, all four processes can access the kernel memory (<span style="background-color:#ff00ff">**K**</span>), and process 1 can access process 2's data (<span style="background-color:#00ff00">**2**</span>)! In Project 4, you will add memory isolation via virtual memory. To do so, you will give processes their own _page tables_, which translate virtual addresses (which the user-space processes use) into physical addresses (in RAM). ### Page Tables We know how important and useful virtual memory is --- but how do we actually translate a given virtual address to the underlying physical address? This is where page tables come in. A *page table* is a tree-like data structure consisting of pages that store mappings between virtual addresses and physical addresses. When a process tries to access a virtual address, the CPU looks up the real, physical address in the page table and translates the address that actually gets accessed. Many processors use multi-level page tables to perform translations. Specifically, x86-64 and WeensyOS have a *four*-level page table. Recall that each physical page is 2^12^ (4096) bytes in size (= 4KB), and since each address is 64 bits (= 2^3^ bytes), each page can contain 2^9^ addresses. In a four-level page table, the top-level (level 4) page table contains addresses of level 3 page tables. Each level 3 page table contains addresses of level 2 page tables, each level 2 page table contains addresses of level 1 page tables, and lastly, each level 1 page table contains actual physical addresses. We won't go into the specifics of how exactly virtual address translation is performed here, and you do *not* need to know them for this lab and the project. But if you're curious, definitely refer to lecture materials, research on your own, or ask the TAs! ### Memory Iterators WeensyOS abstracts the details of x86-64 page tables away from you, and instead gives you high-level interfaces for dealing with virtual memory and processing page tables: `vmiter` and `ptiter`, which are C++ iterators that allow you to conveniently loop over virtual memory mappings and page tables. The `vmiter` helps to iterate through pages in virtual memory, and lets you check the underlying, destination physical pages for properties like permissions and physical addresses. On the other hand, the `ptiter` helps you iterate through *page table* pages (yes, page tables are themselves stored in pages -- it's pages all the way down!). We picked out some useful methods for `vmiter` to walk through together below (to see the full declaration and implementation of the interfaces, see `k-vmiter.hh` and `k-vmiter.cc`). There's a couple of methods not mentioned here, but feel free to read about them on your own and use them wherever you see fit. #### `vmiter` ```cpp class vmiter { public: inline vmiter(x86_64_pagetable* pt, uintptr_t va = 0); // initialize a `vmiter` for page table `pt`, //pointing to the virtual address `va`. If `va` is not provided, it defaults to 0. inline vmiter(const proc* p, uintptr_t va = 0); inline uintptr_t va() const; // current virtual address inline uint64_t pa() const; // current physical address inline uint64_t perm() const; // current permissions inline bool present() const; // is va present? inline bool writable() const; // is va writable? inline bool user() const; // is va user-accessible (unprivileged)? inline vmiter& operator+=(intptr_t delta); // advance `va` by `delta` // map current va to `pa` with permissions `perm` // Current va must be page-aligned. Calls kalloc() to allocate // page table pages if necessary. Panics on failure. void map(uintptr_t pa, int perm); // like `map`, but returns 0 on success and -1 on failure int try_map(uintptr_t pa, int perm); }; ``` You can initialize a `vmiter` object by passing in either a pointer to a process's kernel metadata (in which case it will figure out the page table for that process), or a pointer to an x86-64 page table. Here is an example that checks if the address `0x040000` is mapped in the kernel page tables (you can try this example somewhere in your `kernel.cc`!): ```cpp // `kernel_pagetable` is in kernel.hh x86_64_pagetable* pt = kernel_pagetable; uintptr_t pa = vmiter(pt, (uintptr_t)0x040000).pa(); // returns uintptr_t(-1) if unmapped assert(pa != (uintptr_t)-1); ``` The `vmiter(pt, (uintptr_t)0x040000)` part creates a `vmiter` object that examines the virtual address `0x040000` in the page table `pt`. The `pa()` method on this object returns the corresponding physical address `pa`. The `vmiter` can also find out the permission associated with the current page it is looking at. In `x86-64.h`, the following permission flags (bits) are defined: * `PTE_P` should be set if a page is present in a page table * `PTE_W` should be set if a page is writable * `PTE_U` should be set if a page is user-accessible The `vmiter` can then use the following functions to determine the permissions of the current page: * `present()` returns if the current virtual address is present (i.e., if the `PTE_P` flag is set) * `writable()` returns if the current virtual address is present *and* writable (i.e., if the `PTE_P | PTE_W` flag is set) * `user()` returns if the current virtual address is present *and* user-accessible (i.e., if the `PTE_P | PTE_U` flag is set) For example, this snippet of code checks if the `va` is present and writable `(PTE_P | PTE_W)` in `pt`: ```cpp if (vmiter(pt, va).writable()) { ... } ``` You can also use `vmiter` with a `for` loop, calling methods that query its state and methods that change its current virtual address. For example, this loop prints all present mappings in the lower 64KiB of memory: ```cpp for (vmiter it(pt, 0); it.va() < 0x10000; it += PAGESIZE) { if (it.present()) { log_printf("%p maps to %p\n", it.va(), it.pa()); } } ``` Note that this loop iterates one page at a time: the `it += PAGESIZE` expression increases `it.va()` by `PAGESIZE`. The `map()` and `try_map()` methods are used to *add* mappings to a page table. They also allocate actual page table pages on demand (e.g., the second and third level page tables mentioned above). The key difference between `try_map` and `map` is that `try_map` returns -1 on failure, while `map` panics on failure. In Project 4, it's up to you to decide which is appropriate to call! The following snippet maps virtual address `0x2000` to the physi[](https://)cal address `0x3000` with all permissions in the page table pointed to by `pt`: ```cpp int r = vmiter(pt, 0x2000).try_map(0x3000, PTE_P | PTE_W | PTE_U); // r == 0 on success, r < 0 on failure ``` ::::success **Task:** Now that you're more familiar with how `vmiter` works, **use `vmiter` to implement a function that copies virtual memory mappings from a source page table to a destination page table.** Write this function inside `kernel.cc` in your WeensyOS codebase, and use the global `kernel_pagetable_copy` as your destination, and global `kernel_pagetable` as your source. Call `copy_mappings` somewhere in the `kernel` function after the `kernel_pagetable` is initialized. **For grading, your `copy_mappings` function must also print out some information about the mapped pages using `log_printf`**, as shown below. This gets you familiar with `log_printf`, which you may use for debugging in WeensyOS (it writes to a file called `log.txt` in your `lab4` directory), and we will use the output to check off your lab. Set the function comment below for an example of what to print, and what the output should look like. ```cpp void copy_mappings(x86_64_pagetable* dst, x86_64_pagetable* src) { // Copy all virtual memory mappings from `src` into `dst` // for addresses in the range [0, MEMSIZE_VIRTUAL). // You may assume that `dst` starts out empty (has no mappings). // For our grading purposes, use the following line to print out // the physical and virtual addresses you're mapping, as well as the // state of the present, writable, and user-accessible permission // bits (in that order). Your log_printf call should start // exactly like this: // log_printf("VA %p maps to PA %p with PERMS %p, %p, %p\n", ...); // // Make sure to use the exact same format string above, but fill // out the rest of the line. For the permissions bits, print 1 // if the corresponding permission bit is set, and 0 of it is not. // See hint below for details. // After this function completes, for any virtual address `va` with // 0 <= va < MEMSIZE_VIRTUAL, `dst` and `src` should map that va // to the same physical address with the same permissions. } ``` :::info <details><summary><b>Hint!</b></summary> In order to access check the specific bits of the permissions, recall the PTE_P, PTE_W, and PTE_U flags. You will need to write an expression to check if a particular bit in a value is set or not. Refer to the section "Hint: More info on bitwise operators" in the [Project 1: Snake handout](https://csci0300.github.io/assign/projects/project1.html) for a reminder on bitwise operators. Once you can figure out if a bit is set, you will need to print either a 1 or 0 for each flag. You can do this any way you want, but a nice shorthand for this (and some cool new C syntax!) is C's [*conditional operator*](https://en.wikipedia.org/wiki/Ternary_conditional_operator#Usage), which lets you write an if/else statement in on line. Here's an example in context for what you need here: ```c void example(int x) { int result = (x > 5) ? 100 : 0; printf("%d\n", result); // Can also write as printf(..., (x > 5) ? 100 : 0) } example(42); // Prints 100 example(1); // Prints 0 ``` ::: </details> <details><summary>When is copying mappings from one page table to another useful?</summary> > When you start a new process with separate page tables for memory isolation (you'll do this in Project 4!), you will want to give it a copy of the kernel's or parent process's page table by copying that page table into a new one. This is also important when you implement the `fork` system call in Project 4 (however, the copying process will be more involved than the above function)! </details> :::: --- # Handin instructions Turn in your code by pushing your git repository to `github.com/csci0300/cs300-s25-labs-YOURUSERNAME.git`. Then, head to the [grading server](https://cs300.cs.brown.edu/grade/2025). On the "Labs" page, use the **"Lab 4 checkoff"** button to check off your lab. :::warning :warning: :alarm_clock: **Please note**: **the checkoff for this lab runs for almost a minute without producing any output**, as it runs your WeensyOS in the background. Don't be alarmed if the checkoff script seems "stuck" -- it will complete after about a minute and show you the results. If you don't see any updates in a few minutes, try refreshing the page. ::: --- <small>_Acknowledgements:_ WeensyOS was originally developed for Harvard's CS 61 course. This lab was developed for CS 300.</small>