Lab 1: GDB
Fall 2024
This lab will introduce you to memory layout and GDB concepts that are critical for performing buffer-overflow attacks in the Application Security Project. You will not be actively attacking code in this lab, but instead dissecting an innocent C file to understand how it appears in memory when compiled and during execution.
Setup
Memory layout and buffer-overflow exploitation depend on details of the target system. For this lab and the corresponding project, you must create your solutions inside the Application Security Project VM, as it has been configured to standardize the stack layout and disable certain security features that would complicate your work.
- Follow the setup instructions on the Application Security Project VM page. You only need to do this once; if you follow the guide for this lab, you can use the same VM and workflow for the project.
- Create a new private repository from the GitHub template. Clone your repository from GitHub inside the VM. You must do this in a folder in the native Linux filesystem; it won’t work correctly if you use a shared folder located in the host OS.
If you’re new to using Git/GitHub, check out this guide to help you get started.
Resources
Primer Videos
Before you begin the lab, watch the following videos, parts 1 and 3 in particular. They contain tips that will help you through the lab. They were designed with the Application Security Project in mind, but they are useful here too.
You can find the videos’ slides here: https://files.gtinfosec.org/appsec_primer.pdf
GDB
You will make use of the GDB debugger for dynamic analysis within the VM. Useful commands that you may not know are disassemble, info reg, x, and stepi. See the GDB help for details, and don’t be afraid to experiment. This quick reference may also be useful: https://users.ece.utexas.edu/~adnan/gdb-refcard.pdf.
x86 Assembly
There are many good references for Intel’s assembly language, but note that our project targets use the 64-bit x86_64 ISA (sometimes abbreviated to x64), not the older 32-bit x86 ISA. The stack is organized differently in x86_64 and x86. If you are reading any online documentation, ensure that it is based on the x86_64 architecture, not x86.
Also note that there are two different syntaxes for this assembly language, known as Intel syntax and AT&T syntax. They’re just two ways of expressing the same code. In this class we’re always using Intel syntax, but keep in mind that online resources might be using AT&T syntax. You can tell which is which because AT&T syntax uses percent signs (%) everywhere and Intel syntax doesn’t.
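As a rough illustration of the percent-sign rule, the sketch below writes one register-to-register move in both syntaxes and applies the heuristic from the paragraph above. The two sample lines and the helper name looks_like_att are made up for this example; note that the operand order is also reversed between the two syntaxes.

```python
# Hypothetical sample lines; both mean "copy rbx into rax".
INTEL_LINE = "mov    rax,rbx"      # Intel: destination first, no sigils
ATT_LINE = "mov    %rbx,%rax"      # AT&T: source first, registers prefixed with %


def looks_like_att(asm_line: str) -> bool:
    """Heuristic from the text: AT&T syntax uses percent signs, Intel does not."""
    return "%" in asm_line


for line in (INTEL_LINE, ATT_LINE):
    syntax = "AT&T" if looks_like_att(line) else "Intel"
    print(f"{line:<20} -> {syntax}")
```

If GDB ever shows AT&T-style output, the command set disassembly-flavor intel switches the disassembler to Intel syntax.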
Big versus Little Endian
The final task in this lab will involve endianness. Refer to this guide if you are unfamiliar with endianness or need a refresher. Also, there are helpful images you can find via Google that visually diagram this concept.
Tasks
You will write all answers in one file (submit.txt). The line number for each task’s responses is indicated in bolded brackets before each question.
Task 1 - Examine assembly code
Open a terminal within VS Code and start GDB on the appsec_lab compiled binary ($ gdb appsec_lab). While attempting each question, we encourage you to compare the given C file against the corresponding assembly generated by GDB’s disas command to better appreciate how the source maps to assembly.
Using GDB’s disas (shorthand for disassemble) command, answer the following questions regarding the assembly code:
[1] Where in memory is the line of assembly code that makes a call to foo()? Record the address on line 1 of submit.txt. 16 pts
[2] Where in memory does the function foo() begin? Record the address on line 2. 16 pts
[3] At the top of the C code, we included various libraries. The functions provided by these libraries appear within our compiled binary too. Where in memory does the standard library function printf() exist? Record the address on line 3. 16 pts
Takeaway: Be mindful of the difference between the address of the line that calls a function versus the address of the function itself.
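To make the distinction concrete, here is a small sketch that splits one disassembly line into the two addresses it contains. The sample line follows the layout GDB’s disas prints, but its addresses and offset are invented and are not the lab’s answers; the helper name parse_call is also hypothetical.

```python
import re

# Hypothetical line in the layout printed by GDB's disas (addresses are made up).
SAMPLE_LINE = "   0x0000000000401d5c <+20>:\tcall   0x401d06 <foo>"

# Leading address = where the call instruction lives.
# Operand address = where the called function begins.
CALL_RE = re.compile(
    r"^\s*(?:=>\s*)?(0x[0-9a-f]+)\s+<\+\d+>:\s+call\s+(0x[0-9a-f]+)\s+<([^>]+)>"
)


def parse_call(line: str):
    """Return (call_site, target, name) for a direct call line, else None."""
    match = CALL_RE.match(line)
    if match is None:
        return None
    call_site, target, name = match.groups()
    return int(call_site, 16), int(target, 16), name


call_site, target, name = parse_call(SAMPLE_LINE)
print(f"address of the call instruction: {call_site:#018x}")
print(f"address where {name} begins:      {target:#018x}")
```

The first number answers a question like [1]; the second answers a question like [2].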
Task 2 - Peer into stack during execution
The Application Security Project will require extensive use of GDB’s x/ command to look at stack contents at various stages of execution. First, you will need to set a breakpoint at your desired location, then run the binary. Assuming you have opened GDB with the compiled appsec_lab binary ($ gdb appsec_lab), use the following commands:
(gdb) break [address/function]
(gdb) run
The break parameter can be either an explicit address (such as one of your submissions in the previous task in the form *0x################) or the name of a function.
Refer to the GDB reference sheet (https://users.ece.utexas.edu/~adnan/gdb-refcard.pdf) as well as the lab slides for various ways to print the stack using x/.
[4] Set a breakpoint at foo. Run the code. At the start of the foo function, what is the hexadecimal value of the byte stored at $rbp + 8? Record the byte value in the form 0x## on line 4 of submit.txt. 16 pts
Continue to the end of execution, then delete all breakpoints:
(gdb) c
(gdb) delete
[5] Disassemble foo. Look for the line that calls printf and put a breakpoint at that instruction. Note that the breakpoint should be in foo, not in printf. Run the code, then examine memory using the x command at $rsp + n, where n is replaced with an integer of your choice (start with 0 if you’re unsure). Locate where the word “THIS” is stored in memory. Change the value of n so that the memory dump begins exactly at the capital “T”. What value of n makes this happen? Hint: The ASCII hex value corresponding to “T” is 0x54. Record the decimal value in the form n on line 5 of submit.txt. 16 pts
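The search in question [5] can be sketched as follows, assuming you have copied the bytes of a dump such as x/16bx $rsp + 0 into a script. The byte values below are invented and are not the lab’s real stack contents; offset_of and DUMP_START are hypothetical names.

```python
# Hypothetical bytes, as if copied from a byte-format dump starting at $rsp + 0.
# These are NOT the lab's actual stack contents.
DUMP_START = 0  # the n used in the x command that produced the dump
dump = bytes([0x00, 0x10, 0x40, 0x00, 0x00, 0x00, 0x54, 0x48,
              0x49, 0x53, 0x20, 0x49, 0x53, 0x00, 0x00, 0x00])


def offset_of(dump: bytes, start: int, needle: bytes) -> int:
    """Return n such that $rsp + n is the first byte of needle, or -1 if absent."""
    index = dump.find(needle)
    return -1 if index < 0 else start + index


# ASCII codes to look for in the dump: T, H, I, S.
print([hex(b) for b in b"THIS"])
# Offset from $rsp at which the word begins in this made-up dump.
print(offset_of(dump, DUMP_START, b"THIS"))
```

Strings are stored one character per byte at increasing addresses, so in a byte-format dump the letters appear in reading order.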
Takeaway: The x/ command is extremely versatile. Make use of its features when constructing and debugging an attack.
Task 3 - Endianness
Keeping track of big versus little endian can get confusing when examining the stack. The final portion of this lab will familiarize you with the output style from different GDB x/ commands regarding endianness. Note that x86_64 uses little-endian byte ordering.
Continue to the end of the previous execution and remove all breakpoints again.
The function foo multiplies the variable num by 0x16b28f9. Disassemble foo and see if you can find the instruction that performs this multiplication. The next instruction moves the result of this calculation back into memory. Set a breakpoint right after it is moved into memory. Now run the code, then examine the stack in the following two ways:
(gdb) x/1gx $rsp
(gdb) x/8bx $rsp
In both cases, we are viewing the same bytes, but you’ll notice they appear to be in a different order. Printing in giant word format groups every 8 bytes together, interprets them as a little-endian integer, and displays the result the way we’re used to reading hexadecimal numbers, with the most significant byte first (big-endian). Printing the individual bytes displays how the memory is actually laid out, lowest address first (little-endian in our case for x86_64).
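The relationship between the two views can be reproduced outside GDB. The sketch below uses Python’s struct module on an arbitrary 64-bit value, which is an assumption for illustration and not the lab’s num, and prints it in the same two styles.

```python
import struct

# Hypothetical 64-bit value chosen for illustration; NOT the lab's num.
value = 0x0123456789abcdef

# Little-endian layout: the bytes as they sit in memory at increasing
# addresses, which is the order a byte-format dump (x/8bx) shows.
raw = struct.pack("<Q", value)
byte_view = ", ".join(f"0x{b:02x}" for b in raw)

# Reading those 8 bytes back as one little-endian integer and printing it
# most-significant digit first mirrors the giant word view (x/1gx).
giant_view = f"0x{int.from_bytes(raw, 'little'):016x}"

print(byte_view)   # 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01
print(giant_view)  # 0x0123456789abcdef
```

The byte view is simply the giant word view read two hex digits at a time from right to left.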
Examine the disassembly of foo to figure out where in memory it keeps the value of num. Try reading the memory there instead of at $rsp.
[6] What is the value of num written in big-endian? Record your answer on line 6 in the form 0x################. 10 pts
[7] What are the values of the raw bytes that represent num in little-endian? Record your answer on line 7 in the form 0x##, 0x##, 0x##, 0x##, 0x##, 0x##, 0x##, 0x##. 10 pts
Takeaway: While giant word is more concise, when in doubt use individual byte format printing as it shows what the stack space actually looks like.
Submission
Submit the following file to the Autograder by the deadline: Monday, August 26 at 11:59 p.m.
submit.txt
The file should contain the answer to each question on a separate line. Note that hexadecimal numbers must be lowercase and must start with 0x.
The following is a sample submit.txt with the required formatting for answers:
0x0000000000401abc
0x0000000000401xyz
0x000000000040mn89
0x62
16
0xabcdefghijklmnop
0xop, 0xmn, 0xkl, 0xij, 0xgh, 0xef, 0xcd, 0xab
If your submit.txt is in the VM and you’re not sure how to upload it to the autograder, follow these steps here.
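Before uploading, you may want to sanity-check the formatting locally. The sketch below is one possible checker, not the Autograder’s actual logic: the patterns are inferred from the answer forms stated in the tasks (16 hex digits for addresses and for num, a non-negative decimal for n), and the helper name check_format and the sample answers are hypothetical.

```python
import re

ADDR = r"0x[0-9a-f]{16}"   # assumed form for lines 1-3 and 6
BYTE = r"0x[0-9a-f]{2}"    # form for line 4 and each entry on line 7

# One pattern per line of submit.txt, in order (assumptions, see lead-in).
LINE_PATTERNS = [
    ADDR, ADDR, ADDR,                # lines 1-3: addresses
    BYTE,                            # line 4: a single byte
    r"\d+",                          # line 5: decimal n
    ADDR,                            # line 6: num as one 64-bit value
    rf"{BYTE}(?:, {BYTE}){{7}}",     # line 7: eight comma-separated bytes
]


def check_format(text: str) -> list:
    """Return the 1-based numbers of lines that do not match the expected form."""
    lines = text.splitlines()
    bad = []
    for number, pattern in enumerate(LINE_PATTERNS, start=1):
        line = lines[number - 1] if number <= len(lines) else ""
        if re.fullmatch(pattern, line) is None:
            bad.append(number)
    return bad


# Made-up answers that only demonstrate the format.
sample = "\n".join([
    "0x0000000000401abc",
    "0x0000000000401def",
    "0x0000000000401234",
    "0x62",
    "16",
    "0x0123456789abcdef",
    "0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01",
])
print(check_format(sample))  # an empty list means every line matched
```

To check a real file, pass the contents of submit.txt to check_format; any line numbers it returns point at lines with uppercase hex, a missing 0x prefix, or the wrong number of digits.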