What’s on the stack?
2015-07-10
This is the third post in my series on Grokking xv6. We will use GDB to understand how the stack works in detail. At the end of the post there’s a video where we trace a system call from user space to kernel space and back.
Introduction
I want to begin with a word of warning. In this article there’ll be a lot of things that you’ll see that we won’t explain. Instead, we are going to learn how to squint at the code, suppress detail and focus on what we want to do. There are a lot of guides out there that present a fluffy view of reality, but I want to stay as faithful as possible to what you would actually see using GDB and a real code base.
One of the hardest parts so far going through xv6 was not having a good mental model of the stack and how it operates, I will attempt to communicate at least a part of this to you.
Code
We are going to look at some assembly code. I do not expect you to understand everything. Instead we will look at a few things and see what they can teach us about the stack.
When we type ls
in a terminal, the shell takes
care of parsing the input and then passes the control to the
ls
program, ls.c
, which is written in
C. Here’s the main function of that file:
int
main(int argc, char *argv[])
{
int i;
if(argc < 2){
ls(".");
exit();
}
for(i=1; i<argc; i++)
ls(argv[i]);
exit();
}
This is the main entry point of the ls
program.
argc
stands for argument count, and if
main doesn’t get a second argument (ls
counts as
the first argument to main) it will call the function
ls
with the argument .
. After that it
will call the exit
function.
We can get the assembly code for our main function with the
disassemble
command in GDB. GDB is a
debugger that allows us to see what is going on inside a program
when it executes. We will see a lot of “code boxes” as we move
on. Lines starting with (gdb)
are lines that we
write as input, and the other lines are things GDB shows us.
(gdb) disassemble main
Dump of assembler code for function main:
=> 0x00000304 <+0>: push %ebp
0x00000305 <+1>: mov %esp,%ebp
0x00000307 <+3>: and $0xfffffff0,%esp
0x0000030a <+6>: sub $0x20,%esp
0x0000030d <+9>: cmpl $0x1,0x8(%ebp)
0x00000311 <+13>: jg 0x324 <main+32>
0x00000313 <+15>: movl $0xb5f,(%esp)
0x0000031a <+22>: call 0xb0 <ls>
0x0000031f <+27>: call 0x5cb <exit>
0x00000324 <+32>: movl $0x1,0x1c(%esp)
0x0000032c <+40>: jmp 0x34d <main+73>
0x0000032e <+42>: mov 0x1c(%esp),%eax
0x00000332 <+46>: lea 0x0(,%eax,4),%edx
0x00000339 <+53>: mov 0xc(%ebp),%eax
0x0000033c <+56>: add %edx,%eax
0x0000033e <+58>: mov (%eax),%eax
0x00000340 <+60>: mov %eax,(%esp)
0x00000343 <+63>: call 0xb0 <ls>
0x00000348 <+68>: addl $0x1,0x1c(%esp)
0x0000034d <+73>: mov 0x1c(%esp),%eax
0x00000351 <+77>: cmp 0x8(%ebp),%eax
0x00000354 <+80>: jl 0x32e <main+42>
0x00000356 <+82>: call 0x5cb <exit>
This is a lot of code, and we are not going to understand all of it in this post. Here’s how to read it. The first line is the following:
=> 0x00000304 <+0>: push %ebp
=>
means that this is the instruction we are
about to execute, but haven’t yet. 0x00000304
is
the address of the instruction written in
base 16 or hexadecimal or hex - this
is where the instruction lives in memory.
<+0>
is an offset that we are going to
ignore. push
is a mnemonic or an
instruction, and its argument in this case is
%ebp
. We will talk more about instructions and what
their arguments mean in later sections.
How does hexadecimal work? The decimal alphabet
consists of the numbers 0 through 9, and the binary
alphabet consists of 0 and 1. Similarly, the hexadecimal
alphabet has 16 “numbers” - 0 through 9 and then A through F.
For example, 14 would be written simply as “E”, and 17 as “11”.
To differentiate between hexadecimal numbers and decimal we use
the prefix “0x”, so 11 is 11 but “0x11” is 17.
Addresses in the computer’s memory can be very long, so
instead of writing 0x00000304
we can omit the
leading zeroes and write 0x304
instead.
Without knowing anything about assembly, you might notice
that the call
instruction occurs several times, and
that its arguments is either <ls>
or
<exit>
. These call
instructions
correspond to the four function calls we see in our main
function.
The Stack
If we follow the execution of a program by pointing at the screen we might say “this calls this, which calls this, then it returns here”. The stack is how the computer keeps track of things like this - where it came from, what the arguments of a function are, etc.
Let’s have a look at what’s on the stack before we have
executed a single instruction. We can do this using GDB’s
x
command, which allows us to inspect memory at a
given address. It takes two arguments: a format and an
address.
(gdb) x /x $esp
0x2fe8: 0xffffffff
In this case /x
is the format and
$esp
is the address. We see that $esp is set to
0x2fe8
and the content of it is
0xffffffff
. This is what’s on top of the stack. We
can see a bit more of what’s on the stack with the
following.
(gdb) x /4x $esp
0x2fe8: 0xffffffff 0x00000001 0x00002ff4 0x00002ffc
Here the format /4x
means “show me 4
words in hex”. In GDB, a word is 4 bytes (and
a byte is always 8 bits). Addresses are one word, or 32 bits,
which is why it’s called a 32-bit architecture.
Registers are also 32 bits.
Registers store data for the CPU for easy access.
esp
is a special register called
(extended) stack pointer, and $esp
is the
way you reference it in GDB.
At any given time, the stack pointer points to the top of the
stack. In the above output, $esp
is set to
0x2fe8
, and it points to 0xffffffff
.
We can also write the above in a slightly more verbose way, for
clarity. Each address on the left is offset by four from the one
above it.
0x2fe8: 0xffffffff <--- ESP
0x2fec: 0x00000001
0x2ff0: 0x00002ff4
0x2ff4: 0x00002ffc
In the abstract sense, a stack is a LIFO (last in first out) data structure that supports two primary operations: push and pop. It is normally conceptualized as a stack of plates, growing upwards. What might be confusing in this context is that the top of the stack has the lowest address. The stack grows down, in terms of addresses, but the last touched entry is still called the top of the stack. This might be counterintuitive, but “top” is an abstract term and it doesn’t become less of a stack because it’s growing down.
Recall that the four lines above show what’s on the stack before we have even executed a single instruction in our main function. What do they mean? Before a function is called, its arguments are pushed onto the stack, in reverse order, followed by the return address (also known as as the return program counter or return PC). This way of modifying the stack is part of a calling convention. There are other calling conventions, but this is the most common one for C.
Our main function takes two arguments, argc and argv, whose values are:
0x2fe8: 0xffffffff <- -1, return address for main
0x2fec: 0x00000001 <- 1, argc value
0x2ff0: 0x00002ff4 <- argv, pointer to argv[0]
0x2ff4: 0x00002ffc <- "ls", value of argv[0]
We can cast 0x00002ffc
to a char*
(a string) to see its human readable representation. Likewise,
we can do the same for main’s return address,
0xffffffff
:
(gdb) print (char*)0x00002ffc
$83 = 0x2ffc "ls"
(gdb) print (int)0xffffffff
$88 = -1
main
has a fake return address. Since it has no
place to return to, it is simply set to -1.
One instruction at a time
We are now going to step through the beginning of the program, instruction by instruction. We can see the first four instructions to be executed:
(gdb) x /4i $eip
=> 0x304 <main>: push %ebp
0x305 <main+1>: mov %esp,%ebp
0x307 <main+3>: and $0xfffffff0,%esp
0x30a <main+6>: sub $0x20,%esp
We use the /4i
format to interpret the content
of the four addresses, starting at $eip
, as
instructions. There are no guardrails here - GDB doesn’t care if
you read memory that represents, say, a number as an
instruction.
$eip
is a register that’s called an
(extended) instruction pointer. This points to the next
instruction we are about to execute. These four instructions are
called the function prologue. What does the stack look
like after we perform these instructions?
Let’s do it one by one using GDB’s stepi
function. After push %ebp
this is the stack:
0x2fe4: 0x00003fb8 <--- ESP
0x2fe8: 0xffffffff
0x2fec: 0x00000001
0x2ff0: 0x00002ff4
Two things have happened: we have moved the stack pointer up
an address, and put a new value there. The push instruction does
both of those things one after another. This is slightly tricky
as there’s no mention of $esp
in that instruction,
but yet it changed our stack pointer. There are only a few
instructions like these though: push
,
pop
, call
, ret
and a few
others. Another way to write push $ebp
is as
follows:
sub $0x4 $esp
mov $ebp ($esp)
This subtracts 4 from the stack pointer and puts
$ebp
in whatever the stack pointer is pointing to.
$esp
is the address of the stack pointer, and
($esp)
is what the stack pointer points to (just
like with pointers in C):
What is $ebp
? It stands for (extended) base
pointer (or sometimes it’s called a frame pointer,
depending on the architecture). The idea is that the stack
pointer changes throughout the function as variables and
registers are pushed and popped, but the base pointer stays the
same throughout. That means that we can use the base pointer as
an anchor to find parameters and local variables. For example,
if you look at the first assembly code listing, you can see that
there are things like 0x8(%ebp)
. This takes the
content of %ebp
but offset by 8, which is two
words, and in this case that’s where argc is located.
The next instruction after the push
instruction
is mov %esp, %ebp
nothing changes on the stack.
This moves our stack pointer into the base pointer
address.
The third instruction is and $0xfffffff0,%esp
,
and it’s a so called stack alignment (0xfffffff0
is
-16 in hex, using two’s complement). It ensures that the stack
pointer will be at its current position in memory, or at a lower
one, but more importantly that it will be at a 16-byte boundary.
Why this is done is outside of the scope of this article, but it
has to do with performance and being able to do several
instructions in parallel on certain architectures with something
called SIMD. We can see that the stack pointer is
evenly divided by 16 with print 0x2fe0 % 16
(or
just by looking at the last position in the hex number, since
that’s the “16”-th position). This is what the stack looks like
now:
0x2fe0: 0x00000000 <--- ESP
0x2fe4: 0x00003fb8
0x2fe8: 0xffffffff
0x2fec: 0x00000001
After sub $0x20,%esp
, which subtracts
0x20
(2 times 16 is 32 in decimal) from our stack
pointer, we’ve made room for 32 bytes on the stack. This is used
for local variables in main.
(gdb) x /12x $esp
0x2fc0: 0x00000000 0x00000000 0x00000000 0x00000000
0x2fd0: 0x00000000 0x00000000 0x00000000 0x00000000
0x2fe0: 0x00000000 0x00003fb8 0xffffffff 0x00000001
We will show the stack using the more concise format from now on.
We’ve now performed four instructions in main and this is the end of the function prologue. Something similar to this is done in every function. There’s also an function epilogue - and of course the main meat of the function in between.
Calling ls
If we step two more instructions (stepi 2
) we
get to:
movl $0xb5f,(%esp)
which puts 0xb5f on the stack without changing the stack
pointer (good thing we made room for local variables before so
we didn’t overwrite our stack!) What is that? Just like was done
to main
in the very beginning, before we call
ls
(the function, not the program - henceforth we
will just talk about the function ls
inside the
ls
program) we push the argument on the stack. We
know there’s only one argument and that it should be a string,
and we can see the true nature of it by casting it to a
char*:
print (char*)0xb5f
$2 = 0xb5f "."
(gdb) x /4x $esp
0x2fc0: 0x00000b5f 0x00000000 0x00000000 0x00000000
The next instruction is call 0xb0 <ls>
.
call
does two things, it pushes the address of the
next instruction in main
onto the stack, and then
it jumps to a subprogram (which is just another address
in memory, but since our program was compiled with debug
information on we can see that it says <ls>
).
A jump changes the instruction pointer to its argument. After
executing that instruction our stack and our upcoming four
instructions to execute looks like this:
(gdb) x /4x $esp
0x2fbc: 0x0000031f 0x00000b5f 0x00000000 0x00000000
(gdb) x /4i $eip
=> 0xb0 <ls>: push %ebp
0xb1 <ls+1>: mov %esp,%ebp
0xb3 <ls+3>: push %edi
0xb4 <ls+4>: push %esi
We are now in the ls
function, and
0xb0
, which was an argument to call, is what our
instruction pointer is set to. You might recognize the two first
lines from the prologue in our main function.
What about 0x0000031f
that is now on top of our
stack? It’s the address where the program should keep executing
once the ls function returns. We can confirm this by looking at
it’s memory location as instructions. These are exactly the
instructions that come after the call to ls (see the code
section above).
(gdb) x /4i 0x0000031f
0x31f <main+27>: call 0x5cb <exit>
0x324 <main+32>: movl $0x1,0x1c(%esp)
0x32c <main+40>: jmp 0x34d <main+73>
0x32e <main+42>: mov 0x1c(%esp),%eax
We are still in ls though. Let’s step two more instructions, pushing the base pointer onto the stack and moving (or “saving”) the stack pointer to the base pointer.
(gdb) x /4x $esp
0x2fb8: 0x00002fe4 0x0000031f 0x00000b5f 0x00000000
The new address on the stack is our old base pointer from main. We’ve seen this before, but let’s take a look again.
(gdb) x /4x 0x00002fe4
0x2fe4: 0x00003fb8 0xffffffff 0x00000001 0x00002ff4
This is exactly our stack at the beginning of main, so we now have easy access to that state for when we want to go back to it.
Skipping until the end of ls
There’s a lot that happens in ls
and the
functions it calls. We are going to skip all that by going to
the very last line of the C code in the ls function (code),
line 71. We do this with the GDB command until 71
.
When we are at the end of ls
, what’s about to
happen now and what does the stack look like?
(gdb) x /6i $eip
=> 0x2f9 <ls+585>: add $0x25c,%esp
0x2ff <ls+591>: pop %ebx
0x300 <ls+592>: pop %esi
0x301 <ls+593>: pop %edi
0x302 <ls+594>: pop %ebp
0x303 <ls+595>: ret
(gdb) x /4x $esp
0x2d50: 0x00000003 0x00002d88 0x00000010 0x00000001
We don’t know what those things on the stack are, but
presumably they come from something ls did - in fact, we would
have to run x /170x $esp
to begin to see addresses
that we recognize on our stack. That’s okay though, we are about
the clean that stack up with the function epilogue. Essentially
the five first instructions are the opposite of subtracting and
pushing things onto the stack. If we step through them with
stepi 5
we get:
(gdb) x /4x $esp
0x2fbc: 0x0000031f 0x00000b5f 0x00000000 0x00000000
Before we did that our stack pointer was set to
0x2d50
, and now it’s back to 0x2fbc
.
As a sanity check on what’s going on, we can see that everything
seems OK: we just cleaned up 620 bytes, of which the first
add $0x25c, %esp
took care of 604. pop
was called four times, and 4 times 4 bytes (the size of each
register that we popped) is 16, which takes care of the
rest.
print 0x2fbc - 0x2d50
$3 = 620
print 0x25c
$4 = 604
Our stack pointer is now back at 0x2fbc
, which
is where it was at the very beginning of the ls
function. The next instruction is a simple ret
.
What does ret
do? It’s the opposite of
call
: it pops off an address, and jumps to it. So
without single stepping, we should expect it to jump to
0x31f
and set the stack pointer to
0x2fbc + 4
= 0x2fc0
. What is
0x31f
again? It’s the next line in
main
after calling ls
. After single
stepping it:
x /4x $esp
0x2fc0: 0x00000b5f 0x00000000 0x00000000 0x00000000
x /4i $eip
=> 0x31f <main+27>: call 0x5cb <exit>
0x324 <main+32>: movl $0x1,0x1c(%esp)
0x32c <main+40>: jmp 0x34d <main+73>
0x32e <main+42>: mov 0x1c(%esp),%eax
Indeed, we are now back in main
and are about to
call exit
, and our stack is back to the same state
it was just before it was about to call ls
.
Stack traces
When you program in a high-level language you might come
across stack traces that show who called a function and
with which arguments. GDB also has support for this, and if we
were executing in the middle of the ls
function we
could see the following:
(gdb) info stack
#0 ls (path=path@entry=0xb5f ".") at ls.c:33
#1 0x0000031f in main (argc=1, argv=0x2ff4) at ls.c:79
Here we have two stack frames. A stack frame is created at every function invocation and contains arguments to the function, its return address, and space for local variables.
In this case we are in stack frame #0 (the inner frame),
executing instructions in ls
, and we came from
main. Notice that the base pointer we talked about earlier,
0x31f
, is there as the start of the other stack
frame (the outer frame).
You can see a stack trace just shows you all the stack frames, which we have meticulously (our rather, our compile has) constructed at the lower level. There’s no magic.
Conclusion and demo
We started off looking at a simple piece of C code and its equivalent assembly code. We then inspected the stack before a single instruction had been executed, and we then carefully went through the first few instructions in that function to see how the stack frame was set up. Then we saw how another function, ls, was called and how that changed the stack, preserving the stack state of the calling function. We then skipped a whole bunch of code only to catch our stack being cleaned up, and finally getting returned to the main function again with its stack frame intact.
Here’s the video that was promised. In it we trace a system call from user space to kernel space and back.