Project 5: Forensics

Fall 2024

This project counts for 9% of your course grade. Late submissions will be penalized by 15% of the maximum attainable score. If you or your partner (if you’re working in a team) have a conflict due to travel, interviews, etc., please plan accordingly and turn in your project early.

This is a group project; it is highly recommended you work in teams of two and submit one project per team.

Strict no-leaks policy. In this project, you play the role of a computer forensic analyst working to solve a case. Since you don’t want to be fired for jeopardizing an ongoing criminal investigation, you need to follow a strict policy on collaboration. You are bound by the Academic Honor Code not to communicate with anyone regarding any aspect of the case or your investigation (other than within your team or with course staff). This includes making public posts on Ed that may help other students in any manner. The number of pieces of evidence you find, the techniques you try, how successful said techniques are, the general process you follow, etc. are all considered part of your solution and must not be discussed with members of other teams.

Start early. It may be impossible to complete this project before the deadline unless you start early enough. Some aspects of this project, such as password cracking, take processing time, and hint requests take time for us to answer. Please plan accordingly.

🔍 Read this document in its entirety. There is a lot of information in this document. Reading it thoroughly now will save you time and effort later.

Solutions must be submitted via the Autograder and Gradescope, following the submission details at the end of this spec.

Setup

As in other projects in this course, you will use Docker to complete part of this project. If you haven’t already, follow our Docker guide to learn how to set up Docker on your computer.

  1. To get the code for this project, create a repo using the GitHub template. Make sure to make this repo private.
  2. Clone the repo onto your system (you’ll need to supply your GT credentials for this), then open it in VS Code.
  3. If you successfully set up Docker in the previous project/labs, you should be greeted with a pop-up in the bottom right asking you to re-open the directory in the development container; do so now!
  4. After some time taken to build the container, you should be greeted with the project files in a directory and a terminal connected to the container (as shown in the Docker guide).
  5. Make sure that no other Docker containers are running before you start the project. This container needs port 3235 for Autopsy and port 3236 for Wireshark, so both must be free.
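
If you want to confirm that both ports are free before opening the development container, a quick check along these lines may help. This is only a sketch: the helper name port_is_free is ours, and it assumes the services would be reachable on 127.0.0.1. Once the container is running, both ports will (correctly) show as in use.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        # connect_ex returns 0 only when a listener accepts the connection.
        return s.connect_ex((host, port)) != 0


if __name__ == "__main__":
    # Port numbers come from this spec; run this BEFORE starting the container.
    for name, port in (("Autopsy", 3235), ("Wireshark", 3236)):
        state = "free" if port_is_free(port) else "already in use"
        print(f"{name} port {port}: {state}")
```

If a port is reported as already in use, stop the other container or application that holds it before reopening the project.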

If you’re unfamiliar with using Git/GitHub, check out this guide to help you get started.


Introduction

In this project, you will play the role of a forensic analyst and investigate the theft of company secrets from Sketchy Computer People Inc. (SCP Inc.). Their staff became aware of the theft after Technique ran a story regarding one of their closely guarded secrets.

The case went cold when the leading suspect, George P. Burdell, fled the country and disappeared. Officers seized their computer, but the hard disk was encrypted and investigators were unable to crack the password. No further evidence could be found. The only other possible lead is George’s Twitter account @gpb_tweets.

Investigators just recently caught a break when they found the hard disk encryption password on a sticky note in George’s home office. They’ve decrypted the device and made it available for your analysis.

Your job is to conduct a forensic examination of the disk image and document any evidence related to the crime. If you find sufficient evidence, a case can be brought against George.

Objectives

  • Understand how computer use can leave persistent traces and why such evidence is often difficult to remove or conceal.
  • Gain experience in using forensic techniques to investigate computer misuse and intrusion.
  • Learn how to retrieve information from a disk image without booting the operating system, and understand why this is necessary to preserve forensic integrity.

Getting Started

The tools and techniques you use for your investigation are up to you, but here are some suggestions.

General Knowledge

A working knowledge of Linux is helpful for this project. If you don’t have this yet, you may need to spend time Googling and/or experimenting to get up to speed. The course staff will also answer general Linux questions as a last resort. For an excellent reference book, try UNIX and Linux System Administration Handbook by Nemeth, Snyder, Hein, and Whaley. Also, see https://en.wikipedia.org/wiki/Disk_partitioning for some additional background.

Useful Tools

Here is a list of tools that HQ believes might be useful in your investigation:

  • Wireshark: just like the Networking Project, available in your development container (http://localhost:3236)
  • Autopsy: just like the Lab for this project, available in your devcontainer (http://localhost:3235)
  • Audacity: your one-stop shop for analyzing audio files
  • Tor Browser: used for all your shady anonymous browsing needs
  • John the Ripper: available in your devcontainer, see the Password Cracking section below
  • Steghide: available in your devcontainer, a popular Steganography program
  • Ghidra: useful for reverse engineering suspicious binaries you may find

Analysis

Note that the files linked below are quite large and may take a while to download. Additionally, if you only have 8 GB of RAM, you may need to close other applications to run these images. You will also require about 20 GB of free space to extract the images.

Dead Analysis

In dead analysis, the forensic investigator examines data artifacts from a target system without the system running. We suggest trying dead analysis with the Autopsy open-source forensics tool, which we ship as a Docker image. We have already performed the intensive disk image ingest process using George’s drive, and have provided an Autopsy case which has the analysis available to you to explore.

Running Autopsy in Docker
  1. Download the Autopsy case: george-drive.tar.xz.

  2. Place this file in the root of your project directory (i.e. on the same level as evidence and tokens.txt).

  3. Decompress the case directory: tar -xJf george-drive.tar.xz.

    Make sure to decompress the case file on your host rather than in the development container. Copying files into the development container appears to happen instantaneously but actually continues in the background, so decompressing there often fails because the file is still being copied.

  4. Open your project directory in VS Code, then reopen the directory in the development container. See the Docker guide for more information.

  5. Once the container has booted, navigate to http://localhost:3235 in your web browser. After clicking “OK” to the first pop-up, you may be greeted with an empty gray window for some time. It is loading behind the scenes; after a minute or so, you should see the Autopsy home screen pop up.

  6. Select “Open Case”, then navigate to /workspaces/forensics/george-drive and open george-drive.aut.

  7. After the case has been opened, the tree on the left gives you various ways of examining the data. Try expanding “Data Sources” to view the partitions and file system. You can also try running a keyword search using the button in the upper right corner of the window. Given the size of the disk image, however, a search may take a while and may not work with any option other than “Exact Match”.

  8. In addition to hints dropped elsewhere, here is an incomplete list of things to try:

    • Examine the system logs.
    • Check for deleted or encrypted files.
    • Search for strings that may indicate relevance to your investigation.
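
As one illustration of the string-search suggestion, the sketch below pulls runs of printable ASCII out of raw bytes, similar in spirit to the Unix strings utility, and filters them by keyword. It is not how Autopsy's keyword search works; the function names and the minimum run length of 6 are our own choices, and it assumes you have already exported a file of interest from the case to your host.

```python
import re


def printable_strings(data: bytes, min_len: int = 6) -> list[str]:
    """Return every run of at least min_len printable ASCII characters."""
    # 0x20-0x7e covers space through tilde; other bytes end the current run.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def find_keyword(data: bytes, keyword: str, min_len: int = 6) -> list[str]:
    """Return the printable runs that contain keyword, ignoring case."""
    needle = keyword.lower()
    return [s for s in printable_strings(data, min_len) if needle in s.lower()]
```

To use it, read an exported file in binary mode (for example with open(path, "rb").read(), where path is whatever file you exported) and pass the bytes to find_keyword. Text stored in other encodings, such as UTF-16, will not be found by this sketch.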

Live Analysis

Live analysis is a forensic technique in which the investigator examines a running copy of the target system.

HQ has used the disk image of George’s physical computer to produce virtual machine images that might be useful in getting a more holistic view of their activities than can be easily provided by dead analysis (e.g. easily interacting with graphical programs, seeing positioning of files and applications around their computer, etc.). They have provided virtual machine images for:

HQ heard that you set up a virtual machine in your previous project and thinks that referring back to these skills might be useful here. We distribute two versions of the UTM image, one for Intel and one for Apple Silicon Macs for performance reasons (virtualization is much faster than emulation!).

From first booting up these images, HQ observed that George may have instituted some measures to prevent their computer from being booted by others, but hopes that you will be able to circumvent these.

Password Cracking

Password crackers may be helpful in trying to brute-force decrypt password-protected files. John the Ripper is the canonical Unix password cracker. Based on a cursory look at George’s drive, HQ is confident that John will be useful in your investigation, so they have provided it (among other tools) in the project’s development container. HQ, however, does not know what lies ahead and what might be useful as you uncover more; you will be responsible for discovering, obtaining, and successfully using tools in other domains throughout the course of your investigation.

When using a password cracker, it is wise to first make sure that the password is not susceptible to a dictionary attack (and you can easily find very large wordlists online!) and does not use a restricted character set (e.g., lowercase letters, letters only, letters and numbers only) before spending time on a full brute-force crack. It is also a good idea to crack a very vulnerable password first to make sure you are using the tool correctly.
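
To see why the character set matters so much, it can help to count the candidates a full brute-force crack would have to try. The sketch below does that arithmetic for a few character sets. The guess rate is an assumption chosen purely for illustration, not a measurement of John the Ripper or of any particular hardware, and the maximum length of 8 is likewise arbitrary.

```python
import string

# Assumed guess rate, for illustration only; measure your own setup instead.
ASSUMED_GUESSES_PER_SECOND = 1_000_000

CHARSET_SIZES = {
    "lowercase letters": len(string.ascii_lowercase),
    "letters only": len(string.ascii_letters),
    "letters and numbers": len(string.ascii_letters + string.digits),
    "letters, numbers, punctuation": len(
        string.ascii_letters + string.digits + string.punctuation
    ),
}


def keyspace(alphabet_size: int, max_len: int) -> int:
    """Count every candidate password of length 1 through max_len."""
    return sum(alphabet_size**n for n in range(1, max_len + 1))


def worst_case_days(alphabet_size: int, max_len: int) -> float:
    """Days needed to try the whole keyspace at the assumed guess rate."""
    seconds = keyspace(alphabet_size, max_len) / ASSUMED_GUESSES_PER_SECOND
    return seconds / 86_400


if __name__ == "__main__":
    for name, size in CHARSET_SIZES.items():
        days = worst_case_days(size, 8)
        print(f"{name:30s} {days:16.1f} days for lengths up to 8")
```

The gap between the smallest and largest character sets spans several orders of magnitude, which is why ruling out a dictionary word or a restricted character set first is usually worth the effort.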

Like any intensive calculation, password cracking takes time. It may be impossible to dedicate the time required to it if you do not start early enough. Please plan ahead.


Tasks and Deliverables

The two main deliverables for this project are a list of all the tokens that you find, and a report where you state your case for either the guilt or the innocence of the defendant. In addition, if you recover files that are relevant to your responses, name them in your report and include them with your submission in a tarball named evidence.tar.gz. (Add evidence to the evidence directory then run make to generate this.)

To get you started, here are three questions to ask yourself as you begin your investigation. You do not necessarily have to answer them in the report, but they can serve as good starting points.

  1. What operating system does the suspect use?

  2. Are there any personal files that the suspect may have left on the machine?

  3. Do there appear to be any suspicious usage patterns that suggest malicious activity?

Be on the lookout for evidence of any other machines or network services or websites that the suspect may have used. These may contain important evidence and raise further questions you’ll need to investigate (hint, hint!).

Before attempting to access any external leads, check with HQ via their self-service portal at scp.gatech.fail/permissions/ or sketchycomputerpeople.com/permissions/. Some tokens and clues may only be found by gaining approval.
Note that websites are not the only external leads you may need to check with HQ about.

If you don’t receive permission, but you still believe a site might be relevant, please email HQ directly at gtinfosec-forensics@cc.gatech.edu. The subject line should begin with “Forensics Project Permission”. Failure to ask permission is guaranteed to be a waste of time and may violate the course ethics policy or result in a grade deduction. You do not need to ask HQ for permission to investigate George’s disk image and social media, since you have been cleared to do so by this writeup.

The Report 16 pts

The report is the focus of your deliverables and will be a substantial portion of your grade. We encourage you to work on the report in parallel to solving the project. The requirements are the following:

  • Verdict 10 pts: your report should specify whether or not George is guilty of a crime. If so, explicitly state the crime. Finally, explain why you picked this verdict, citing evidence that you found during the project.
  • Summary of Tokens 6 pts: list the tokens you have found, along with a short one-sentence explanation per token of how you encountered it.

The length limit for the report is 3 pages. While we don’t expect a legal brief, your report should be clear, professional, and easy to read.

The Token List 84 pts

The purpose of the token list is to allow us to identify exactly what you found in your investigation. As you complete your investigation, you should be on the lookout for tokens in the form #token-<Thisisthetoken># or some slight variation of this syntax. All tokens must be spelled exactly as they appear to get credit. Misspelled tokens will receive no credit, with no possibility of a regrade. Including extraneous content in your list will also result in a deduction.

tokens.txt in your repository should have one token per line, without the surrounding token syntax. Using the same example as above, if you encounter the string #token-<Thisisthetoken># during your investigation, you should only add Thisisthetoken to tokens.txt.
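
As a sketch of the stripping step, the snippet below pulls token bodies out of text that uses exactly the #token-<...># form shown above and formats them one per line. The function names are ours. Because tokens may appear in some slight variation of the syntax, this pattern will miss those variations, so treat it as a convenience and not as a substitute for reading what you find.

```python
import re

# Matches only the exact documented form: #token-<BODY>#
TOKEN_RE = re.compile(r"#token-<([^<>#]+)>#")


def extract_tokens(text: str) -> list[str]:
    """Return unique token bodies in order of first appearance."""
    seen: dict[str, None] = {}
    for body in TOKEN_RE.findall(text):
        seen.setdefault(body, None)
    return list(seen)


def format_tokens_file(tokens: list[str]) -> str:
    """Render tokens one per line, as expected for tokens.txt."""
    return "".join(f"{token}\n" for token in tokens)
```

For example, passing the string "#token-<Thisisthetoken>#" to extract_tokens yields a list containing only Thisisthetoken, which is what belongs in tokens.txt.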

It is your duty to follow all the leads you can find and to conduct a thorough investigation, and your evidence will be crucial for solving this case and putting the right person behind bars. However, once you collect 42 tokens, you will receive full credit on the token portion of the project grade. That is:

  • You will not receive extra credit for retrieving more than 42 tokens.
  • Each token is worth 2 points, up to a maximum of 84.
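
The cap can be written as a one-line calculation. This sketch covers only the two rules listed above; it does not model deductions for extraneous content in your list or any later change to the required token count.

```python
def token_points(tokens_found: int, points_per_token: int = 2, cap: int = 84) -> int:
    """Points for correctly spelled tokens, capped at the stated maximum."""
    return min(max(tokens_found, 0) * points_per_token, cap)
```

With these defaults, 10 tokens are worth 20 points, and both 42 and 50 tokens are worth 84.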

We expect you to be able to find 42 tokens (totaling 84 points) but may choose to reduce the required token count depending on class performance. If such a change happens, it will be assessed after the deadline. Put in your best effort and do not rely on this.

The Evidence Folder

The purpose of the evidence folder is for you to collect all evidence you deem important to your investigation’s outcome. We may also reference your evidence folder to understand how your investigation progressed.

There is no strict guideline as to what can belong in the evidence directory; original files or curated screenshots are all acceptable. Filenames and subdirectory structure do not matter.

Run make in the root of your project directory to generate evidence.tar.gz from your evidence directory; you will submit this file to the Autograder.
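
Before submitting, you may want to confirm that the tarball actually contains the files you expect. The sketch below lists the members of a gzip-compressed tarball using Python's standard tarfile module. The function name is ours, and it makes no assumption about how the project's Makefile lays out paths inside the archive.

```python
import tarfile


def list_tarball(path: str) -> list[str]:
    """Return the member names stored in a gzip-compressed tarball."""
    with tarfile.open(path, "r:gz") as tar:
        return tar.getnames()
```

Calling list_tarball("evidence.tar.gz") from the root of your project directory after running make returns the archived paths, which you can compare against the files you name in your report.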


Policies and Hints

Collaboration: Strictly prohibited outside your team.

You are bound by the Academic Honor Code not to communicate with anyone regarding any aspect of the case or your investigation (other than within your team or with course staff). The number of pieces of evidence you find, the techniques you try, how successful said techniques are, the general process you follow, etc. are considered part of your solution and must not be discussed with people outside your team. If someone brings up the project, close your eyes, plug your ears, drop to the floor, and start yelling “LALALALA” and refer them to your supervisor to get officially spoken to.

If you get stuck… Requesting Hints

Given the nature of this assignment and its strict collaboration policy, HQ recognizes the need for some hints. If your team gets stuck, create a post on Ed under the Hints category. Ensure that your post is private and that you include your partner’s name if you have one. Each team is allotted five hints. Please note that HQ has been known to take up to 24 hours to respond to agents.

Free Hints :)

To help you get started, hints regarding the three questions provided above are free. After that, each team may receive a maximum of five hints, and we will enforce a one-hour delay between hints for each team. Check the megathread on Ed for more information on what counts against your hint limit and what doesn’t.

Finishing the Project

HQ is unable to answer questions regarding whether a report contains all possible findings; they are relying on you to complete the investigation. It is your job as a forensic analyst to draw as complete a picture as you can. You should submit when you believe you have conducted a thorough investigation and have enough evidence to acquit or convict George.


Troubleshooting

Autopsy Freezing

Autopsy is a heavy and complex application, and components of it may freeze sometimes. If this happens, restart the container by closing your project directory in VS Code and reopening it.

You can also try to restart the Autopsy container directly, using the following steps:

For Docker Desktop (Windows and macOS)
  1. Navigate to the Containers tab, and find the container for this assignment (usually, it should be the only one running).
  2. For this container, click on the three dots under the Actions column.
  3. Click the Restart button.
For Linux hosts on Docker Engine
  1. In a terminal window, run docker ps to see all running containers. This assignment’s container is usually the only one running.
  2. Note the container ID of this container (first column of output).
  3. Run: docker restart <container ID>

Submission Details

  1. Create a repo using this GitHub template. Make sure that the repo that you create is private.

  2. Establish a team on the autograder. Only teams created on the autograder will be able to utilize HQ’s self-service portal.

    Forensics Project Autograder

  3. After submitting your tokens and evidence to the Autograder, you will need to submit your report to Gradescope. The report should be in PDF format and stay within the 3-page length limit.

Make sure you have completed the following items by the deadline:

  • Upload your completed report to Gradescope, and ensure your partner’s name is added to the submission. Only one submission per team is required.

  • Submit the following files to the Autograder:
    tokens.txt: A plaintext file with all your tokens separated by newlines.
    evidence.tar.gz: A tarball of your directory containing recovered files that support your findings.