Lab 1: GDB

Fall 2023

This lab will introduce you to memory layout and GDB concepts that are critical for performing buffer-overflow attacks in the Application Security Project. You will not be actively attacking code in this lab, but instead dissecting an innocent C file to understand how it appears in memory when compiled and during execution.

Setup

Memory layout and buffer-overflow exploitation depend on details of the target system. For this lab and the corresponding project, you must create your solutions inside the Application Security Project VM, as it has been configured to standardize the stack layout and disable certain security features that would complicate your work.

  • Follow the setup instructions on the Application Security Project VM page. You only need to do this once; if you follow the guide for this lab, you can use the same VM and workflow for the project.

  • Create a new private repository from the GitHub template. Clone your repository from GitHub inside the VM. You must do this in a folder in the native Linux filesystem. It won’t work correctly if you use a shared folder located in the host OS.

If you’re new to using Git/GitHub, check out this guide to help you get started.

Resources

Primer Videos

Before you begin the lab, watch the following videos, parts 1 and 3 in particular. They contain a few tips to help you through the lab. They were designed with the Application Security Project in mind, but you will find them useful here too.

You can find the videos’ slides here: https://files.gtinfosec.org/appsec_primer.pdf (Updated 8/24)

GDB

You will make use of the GDB debugger for dynamic analysis within the VM. Useful commands that you may not know are disassemble, info reg, x, and stepi. See the GDB help for details, and don’t be afraid to experiment. This quick reference may also be useful: https://users.ece.utexas.edu/~adnan/gdb-refcard.pdf.

x86 Assembly

There are many good references for Intel’s assembly language, but note that our project targets use the 32-bit x86 ISA. The stack is organized differently in x86 and x64. If you are reading any online documentation, ensure that it is based on the x86 architecture, not x64.

Big versus Little Endian

The final task in this lab will involve endianness. Refer to this guide if you are unfamiliar with endianness or need a refresher. Also, there are helpful images you can find via Google that visually diagram this concept.

Tasks

You will write all answers in one file (submit.txt). The line number for each task’s responses is indicated in bolded brackets before each question.

Task 1 - Examine assembly code

Open a terminal within VS Code and start GDB on the compiled appsec_lab binary ($ gdb appsec_lab). While attempting each question, we encourage you to compare the given C file with the corresponding assembly shown by GDB’s disas command to better appreciate how assembly works.

Using GDB’s disas (shorthand for disassemble) command, answer the following questions regarding the assembly code:

[1] Where in memory is the line of assembly code that makes a call to foo()? Record the address on line 1 of submit.txt. 16 pts

After disassembling certain functions, you may see calls to cryptic functions named __x86.get_pc_thunk.bx or similar. These calls are inserted by the compiler to ensure that the binary is position independent. We leave it as an exercise to the reader to understand the meaning of these calls. However, they are not relevant to this lab or the project and can be safely ignored.

[2] Where in memory does the function foo() begin? Record the address on line 2. 16 pts

[3] At the top of the C code, we included various library headers. The functions provided by these libraries appear within our compiled binary too. Where in memory does the standard library function printf() exist? Record the address on line 3. 16 pts

Takeaway: Be mindful of the difference between the address of the instruction that calls a function and the address of the function itself.
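
To make the distinction concrete, here is a minimal sketch of how a 32-bit x86 near call (opcode 0xE8 followed by a signed 32-bit displacement) relates the two addresses. GDB’s disas output already shows the resolved target, so you do not need this calculation for the lab. The function name call_target and every address below are made up for illustration and are not lab answers.

```python
# Sketch: how a 32-bit x86 near call (opcode 0xE8) encodes its target.
# All addresses below are hypothetical; they are not taken from appsec_lab.

def call_target(call_addr: int, rel32: int) -> int:
    """Return the address that a 5-byte `call rel32` instruction jumps to."""
    call_len = 5  # 1 opcode byte + 4 displacement bytes
    next_insn = call_addr + call_len
    # The displacement is signed and relative to the next instruction.
    return (next_insn + rel32) & 0xFFFFFFFF

call_site = 0x08041000    # hypothetical address of the call instruction
displacement = -0x105     # hypothetical rel32 (callee at a lower address)
target = call_target(call_site, displacement)
print(f"call at {call_site:#010x} -> function at {target:#010x}")
# prints: call at 0x08041000 -> function at 0x08040f00
```

The call site and the function entry point are separate locations: the first is where the call instruction lives, the second is where execution continues after the call.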

Task 2 - Peer into stack during execution

The Application Security Project will require extensive use of GDB’s x/ command to look at stack contents at various stages of execution. First, you will need to set a breakpoint at your desired location, then run the binary. Assuming you have opened GDB with the compiled appsec_lab binary ($ gdb appsec_lab), use the following commands:

(gdb) break [address/function]
(gdb) run

The break parameter can be either an explicit address (such as one of your submissions in the previous task in the form *0x########) or the name of a function.

Refer to the GDB reference sheet (https://users.ece.utexas.edu/~adnan/gdb-refcard.pdf) as well as the lab slides for various ways to print the stack using x/.

[4] Set a breakpoint at foo. Run the code. At the start of the foo function, what is the hexadecimal value of the byte stored at 0xfff6ffad? Record the byte value in the form 0x## on line 4 of submit.txt. 16 pts

Continue to the end of execution:

(gdb) c

Remove your previously set breakpoint:

(gdb) delete

[5] Set a breakpoint at *0x08049bf5 and run the code. Note the string that is initialized in line 7 of the C code. Locate the address on the stack where the word “THIS” begins. That is, where the capital “T” is stored. Record the address on line 5. Hint: The ASCII hex value corresponding to “!” is 0x21 and “T” is 0x54. 16 pts
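
As a rough illustration of the hint, the sketch below lists the ASCII byte values of a short string in the order they sit in memory: a C string’s first character is at the lowest address and each following character is one byte higher. The string "THIS!", the base address, and the helper name ascii_bytes are placeholders, not the values from the lab’s C file or stack.

```python
# Sketch: the byte values a string occupies in memory, lowest address first.
# The string and base address are placeholders, not the ones used in the lab.

def ascii_bytes(text: str, base_addr: int):
    """Pair each character with a hypothetical address and its ASCII hex value."""
    return [(base_addr + i, ch, f"0x{ord(ch):02x}") for i, ch in enumerate(text)]

for addr, ch, value in ascii_bytes("THIS!", 0xffff0000):
    print(f"{addr:#010x}: {value}  {ch!r}")
```

When you scan a byte-by-byte dump, looking for a known value such as 0x54 or 0x21 is a quick way to find where the string sits.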

Takeaway: The x/ command is extremely versatile. Make use of its features when constructing and debugging an attack.

Task 3 - Endianness

Keeping track of big versus little endian can get confusing when examining the stack. The final portion of this lab will familiarize you with the output style of different GDB x/ commands with respect to endianness. Note that x86 uses little-endian byte ordering.

Continue to the end of the previous execution and remove breakpoints again.

Set a breakpoint at foo. Run the code, then examine the stack below the base pointer in the following two ways:

(gdb) x/16bx $ebp
(gdb) x/4wx $ebp

In both cases, we are viewing the same bytes, but you will notice that they appear in a different order. Printing individual bytes shows memory as it is actually laid out, one byte at a time in increasing address order (little-endian in our case for x86). Printing in word form groups every 4 bytes together, interprets them as a little-endian integer, and displays the result as a big-endian word (as numbers are naturally written).
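
The two views can be imitated outside GDB with a small sketch using Python’s struct module. The four bytes below are an arbitrary example value, not data from the lab binary.

```python
import struct

# Four bytes in memory, lowest address first (an arbitrary example value).
memory = bytes([0x78, 0x56, 0x34, 0x12])

# Byte view, comparable to x/4bx: bytes printed in address order.
byte_view = " ".join(f"0x{b:02x}" for b in memory)

# Word view, comparable to x/1wx: the same bytes read as one little-endian
# 32-bit integer and printed as an ordinary number.
(word,) = struct.unpack("<I", memory)
word_view = f"0x{word:08x}"

print(byte_view)  # 0x78 0x56 0x34 0x12
print(word_view)  # 0x12345678
```

The least significant byte (0x78) is stored at the lowest address, so it comes first in the byte view and last in the word view.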

Using your knowledge of the stack convention when a function is called, as well as the output of the previous x/ commands displaying the stack contents beneath the base pointer, answer the following:

[6] What is the value of the input parameter to foo() written in big-endian? Record your answer on line 6 in the form 0x########. 10 pts

[7] What is the value of the input parameter to foo() written in little-endian? Record your answer on line 7 in the form 0x########. 10 pts

Takeaway: While word-style printing is more concise, when in doubt, use individual byte-style printing, as it shows what the stack space actually looks like.

Submission

Submit the following file to the Autograder by the deadline:

  • submit.txt

The file should contain the answer to each question on a separate line. Note that hexadecimal numbers must be lowercase and must start with 0x. The following is a sample submit.txt with the required formatting for answers:

0x080abcde
0x08012345
0x08099999
0x08
0x08012344
0xabcdef12
0x12abcdef

The first line is the answer to the first question, the second line to the second, and so on.
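
If you want to sanity-check the formatting before uploading, a small script along these lines may help. It is only a sketch of the rules stated above, not the Autograder’s actual logic. The function name check_submission is hypothetical, and it assumes that line 4 uses the form 0x## while every other line uses 0x########, as in the sample.

```python
import re

HEX_WORD = re.compile(r"0x[0-9a-f]{8}")  # form 0x######## (assumed for lines 1-3, 5-7)
HEX_BYTE = re.compile(r"0x[0-9a-f]{2}")  # form 0x## (line 4)

def check_submission(text: str) -> list:
    """Return a list of formatting problems; an empty list means none were found."""
    lines = text.splitlines()
    problems = []
    if len(lines) != 7:
        problems.append(f"expected 7 lines, found {len(lines)}")
    for number, line in enumerate(lines[:7], start=1):
        pattern = HEX_BYTE if number == 4 else HEX_WORD
        if not pattern.fullmatch(line):
            problems.append(f"line {number}: {line!r} is not in the expected form")
    return problems

# The sample submit.txt shown above.
sample = "\n".join([
    "0x080abcde",
    "0x08012345",
    "0x08099999",
    "0x08",
    "0x08012344",
    "0xabcdef12",
    "0x12abcdef",
])
print(check_submission(sample))  # []
```

To check your own file, read its contents and pass them in, for example check_submission(open("submit.txt").read()). The check is deliberately strict, so uppercase digits or trailing spaces are reported as problems.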

If your submit.txt is in the VM and you’re not sure how to upload it to the autograder, follow these steps here.