Lecture 9: Assembly Language
Lecture video (Brown ID required)
Lecture code
Post-Lecture Quiz (due 11:59pm, Wednesday, February 28).
Assembly Language
Registers
Assembly instructions operate on registers, small pieces of very fast memory inside the processor. To process data stored in memory, the processor first needs to load it into registers; and once it has completed working on the data in a register, it needs to store it back to memory.
Registers are the fastest kind of memory available in the machine. x86-64 has 14 general-purpose registers and several special-purpose registers. The table below lists all basic registers, with the special-purpose registers in their own group at the bottom. You won't understand all columns yet, but you will soon and can then use this table as a reference (we won't ask you to memorize it in detail). You'll notice different naming conventions for subsets of the same register, a side effect of the long history of the x86 architecture (the first x86 processor, the 8086, was released in 1978).
Full register name | 32-bit (bits 0–31) | 16-bit (bits 0–15) | 8-bit low (bits 0–7) | 8-bit high (bits 8–15) | Use in calling convention | Callee-saved? |
---|---|---|---|---|---|---|
General-purpose registers: | ||||||
%rax | %eax | %ax | %al | %ah | Return value (accumulator) | No |
%rbx | %ebx | %bx | %bl | %bh | – | Yes |
%rcx | %ecx | %cx | %cl | %ch | 4th function argument | No |
%rdx | %edx | %dx | %dl | %dh | 3rd function argument | No |
%rsi | %esi | %si | %sil | – | 2nd function argument | No |
%rdi | %edi | %di | %dil | – | 1st function argument | No |
%r8 | %r8d | %r8w | %r8b | – | 5th function argument | No |
%r9 | %r9d | %r9w | %r9b | – | 6th function argument | No |
%r10 | %r10d | %r10w | %r10b | – | – | No |
%r11 | %r11d | %r11w | %r11b | – | – | No |
%r12 | %r12d | %r12w | %r12b | – | – | Yes |
%r13 | %r13d | %r13w | %r13b | – | – | Yes |
%r14 | %r14d | %r14w | %r14b | – | – | Yes |
%r15 | %r15d | %r15w | %r15b | – | – | Yes |
Special-purpose registers: | ||||||
%rsp | %esp | %sp | %spl | – | Stack pointer | Yes |
%rbp | %ebp | %bp | %bpl | – | Base pointer (general-purpose in some compiler modes) | Yes |
%rip | %eip | %ip | – | – | Instruction pointer (Program counter; called $pc in GDB) | * |
%rflags | %eflags | %flags | – | – | Flags and condition codes | No |
Note that unlike primary memory (RAM) – which is what we think of when we discuss memory in a C/C++ program
– registers have no addresses! There is no address value that, if cast to a pointer and dereferenced, would
return the contents of the %rax
register. Registers live in a separate world from the memory, and we need
special instructions to move data to and from registers and memory.
Whenever you see %ZZZ in assembly code, this refers to a register named ZZZ. The x86-64
registers have confusing names because they evolved over time; each register also has multiple names that
refer to different subsets of its bits. For example, %rax, one of the general-purpose registers that is,
by convention, used to pass return values from functions, is split into the following five names:
    63                              31              15      7       0
    +-------------------------------+-------------------------------+
    |                               |               |       |       |
    +---------------------------------------------------------------+
    |---------------------%rax (64 bits/8 bytes)--------------------|
                                    |-----%eax (32 bits/4 bytes)----|
                                                    |-%ax (16b/2B)--|
                                                    |--%ah--|--%al--| <-- 8 bits/1 byte each
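The relationship between these names can be modeled in C with masks and shifts. This is only a sketch: the `view_*` helper names are hypothetical and exist purely for illustration, and real hardware reads the sub-registers directly rather than computing them.

```c
#include <stdint.h>

/* Hypothetical helpers: each returns the bits that the corresponding
   register name would refer to, given the full 64-bit value of %rax. */
uint32_t view_eax(uint64_t rax) { return (uint32_t)(rax & 0xFFFFFFFFu); } /* bits 0-31 */
uint16_t view_ax(uint64_t rax)  { return (uint16_t)(rax & 0xFFFFu); }     /* bits 0-15 */
uint8_t  view_ah(uint64_t rax)  { return (uint8_t)((rax >> 8) & 0xFFu); } /* bits 8-15 */
uint8_t  view_al(uint64_t rax)  { return (uint8_t)(rax & 0xFFu); }        /* bits 0-7  */
```

For instance, if %rax held 0x1122334455667788, these helpers would yield 0x55667788 for %eax, 0x7788 for %ax, 0x77 for %ah, and 0x88 for %al.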
Assembly instructions often have a suffix that indicates what input data size and register width they're
operating on. For instance, a set of "move" instructions helps load signed and unsigned 8-, 16-, and 32-bit
quantities from memory into registers. movzbl, for example, moves an 8-bit quantity (a byte) into a
32-bit register (a longword; e.g., %eax) with zero extension; movslq moves a 32-bit
quantity (longword) into a 64-bit register (quadword; e.g., %rax) with sign extension.
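The difference between zero extension and sign extension can be sketched with C integer conversions, which follow the same rules. The function names below are hypothetical. `sign_extend_byte` models `movsbl`, a sign-extending counterpart of `movzbl` that the text above does not mention and that is included here only for contrast.

```c
#include <stdint.h>

/* Models movzbl: widen a byte to 32 bits, filling the upper 24 bits with zeros. */
uint32_t zero_extend_byte(uint8_t b) { return (uint32_t)b; }

/* Models movsbl (an addition for contrast): widen a byte to 32 bits,
   filling the upper 24 bits with copies of the byte's sign bit. */
int32_t sign_extend_byte(int8_t b) { return (int32_t)b; }

/* Models movslq: widen a 32-bit value to 64 bits,
   filling the upper 32 bits with copies of the sign bit. */
int64_t sign_extend_long(int32_t l) { return (int64_t)l; }
```

The byte 0xFF becomes 0x000000FF (255) under zero extension but 0xFFFFFFFF (-1) under sign extension, which is why the compiler has to pick the instruction that matches the signedness of the C type.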
What's up with long suddenly meaning 32 bits (4 bytes)?
Because of the wonderful history of the x86 architecture, and to confuse you, a "long" in x86-64 hardware terms does not refer to the same thing as the long integer type in C. Specifically, an x86-64 assembly long is 4 bytes, so it corresponds to a C int. The 8-byte long (or indeed any pointer type) in C uses "quad" instructions in x86-64 assembly, denoted by a q suffix.
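As a quick reference, the mapping from operand size to AT&T suffix could be written as the small lookup below. The helper name `att_suffix` is hypothetical. The `w` ("word", 2 bytes) case is not discussed above and is added here for completeness.

```c
#include <stddef.h>

/* Hypothetical helper: returns the AT&T instruction suffix for an operand
   of the given size in bytes, or '?' if no basic suffix applies. */
char att_suffix(size_t bytes) {
    switch (bytes) {
    case 1:  return 'b';  /* byte                              */
    case 2:  return 'w';  /* word                              */
    case 4:  return 'l';  /* long: matches a 4-byte C int      */
    case 8:  return 'q';  /* quad: matches 8-byte C long/pointer */
    default: return '?';
    }
}
```

On a typical x86-64 Linux system, where `int` is 4 bytes and `long` is 8 bytes, `att_suffix(sizeof(int))` gives `'l'` and `att_suffix(sizeof(long))` gives `'q'`.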
Note that what look like types (such as long, short, etc.) here merely refer to the
register width used in the instruction. All actual types are removed from the program during compilation; there
are no types in assembly (for examples, see asm06.s and asm07.s and their corresponding
C source files in the lecture code).
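One way to see that types disappear is to compare signed and unsigned addition. The sketch below is a hypothetical illustration, not taken from the lecture code. A compiler would likely emit the same 32-bit add instruction for both functions, because the resulting bit pattern does not depend on whether the operands are interpreted as signed or unsigned.

```c
#include <stdint.h>

/* Both functions add two 32-bit quantities. The C types differ, but the
   bits that end up in the result register are the same. */
uint32_t add_unsigned(uint32_t a, uint32_t b) { return a + b; }
int32_t  add_signed(int32_t a, int32_t b)     { return a + b; }
```

For example, `add_signed(-1, 2)` and `add_unsigned(0xFFFFFFFF, 2)` both produce the bit pattern 0x00000001: the signed version computes -1 + 2 = 1, and the unsigned version wraps around to 1.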
Instructions
There are three basic kinds of assembly instructions. We'll see two today, and another next time.
- Computation: These instructions compute on values, typically values stored in registers. Most have zero or one source operands and one source/destination operand, with the source operand coming first. For example, the instruction addq %rax, %rbx performs the computation %rbx := %rbx + %rax.
- Data movement: These instructions move data between registers and memory: they can move values from one register to another, from memory into a register, and from a register back to memory. Almost all move instructions have one source operand and one destination operand; the source operand comes first. For example, movq %rax, %rbx copies the contents of %rax into %rbx, so it performs the assignment %rbx = %rax.
- Control flow: Normally the CPU executes instructions in sequence, in the order they appear in the assembly code (and, once translated into bytes, the order in memory). Control flow instructions change which instruction the processor executes next (the address of the next instruction is called the "instruction pointer" and is stored in the special register %rip). There are unconditional branches (the instruction pointer is set to a new value), conditional branches (the instruction pointer is set to a new value if a condition is true), and function call and return instructions.
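To connect the three kinds of instructions to familiar C, here is a hypothetical function with comments indicating which kind of instruction each line would roughly correspond to. The instructions named in the comments are only plausible guesses at what a compiler might emit; actual output depends on the compiler and optimization level.

```c
/* Hypothetical example: sums the integers 1..n. */
long sum_to(long n) {
    long total = 0;                  /* data movement: put the constant 0 in a register */
    for (long i = 1; i <= n; i++) {  /* control flow: a compare plus a conditional branch */
        total += i;                  /* computation: an addq on two registers */
    }
    return total;                    /* control flow: ret, with the result in %rax */
}
```

Calling `sum_to(4)` returns 10. Compiling such a function to assembly and reading the result is a good way to spot each instruction kind in practice.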
Some instructions appear to combine computation and data movement. For example, given the C code int* pi; ... ++(*pi);
the compiler might generate incl (%rax) rather than movl (%rax), %ebx; incl %ebx; movl %ebx, (%rax).
However, the processor actually divides these complex instructions into tiny, simpler,
invisible instructions called microcode, because the simpler instructions can be made to execute faster.
The complex incl instruction actually runs in three phases: data movement, then computation, then data
movement. This matters when we introduce parallelism.
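The three phases can be spelled out in C as separate steps. This is a sketch with a hypothetical function name; the parameter `pi` plays the role of the address held in %rax, and the local `tmp` plays the role of a scratch register such as %ebx.

```c
/* Hypothetical model of what `incl (%rax)` does internally. */
void increment_in_memory(int *pi) {
    int tmp = *pi;  /* phase 1, data movement: load from memory, like movl (%rax), %ebx */
    tmp = tmp + 1;  /* phase 2, computation: increment the register, like incl %ebx */
    *pi = tmp;      /* phase 3, data movement: store to memory, like movl %ebx, (%rax) */
}
```

Because the load and the store are separate steps, another thread could modify `*pi` between them, which hints at why this decomposition matters once parallelism enters the picture.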
Different assembly syntaxes
There are actually multiple ways of writing x86-64 assembly. We use the "AT&T syntax", which is distinguished from the "Intel syntax" by several features, but especially by the use of percent signs for registers. Sadly, and just to make things more confusing, the Intel syntax puts destination registers before source registers.
Summary
Today, we continued our journey into how the computer operates at the level just below C code: assembly language.
We saw that in assembly, there are computation and data movement instructions, and that the compiler often produces somewhat unexpected instruction sequences to make things faster. This is part of why we use compilers: they are very good at distilling our programs down into fast sequences of instructions.
Next time, we will look at how function calls work and how the assembly instructions actually manage the automatic lifetime memory in the stack segment. After that, we will leave the low-level world of assembly and start moving up the systems stack (no pun intended)!