Project 2: Malware Analysis
Spring 2023

Project Learning Goals

  • Familiarize you with the types of behaviors that real-world malware samples can display and how to safely analyze these behaviors using Joe Sandbox Cloud.
    • Joe Sandbox detects and analyzes potentially malicious files and URLs on Windows, Android, Mac OS, Linux, and iOS for suspicious activities. It performs deep malware analysis and generates comprehensive, detailed analysis reports.

Setup and Resources

Deliverables & Grading

  • Your deliverable for this project is the Canvas quiz named Project 2 – Malware Analysis, containing your answers to the 25 questions below. You have unlimited attempts to submit your answers, and your latest score will be your final grade for this project.
  • Your score will not be displayed until after the submission deadline.
  • Each question is worth 4 points, for a total of 100 points. You will receive full points for a question only if all the options you select are correct. You will get partial credit for marking a subset of the correct answers and lose points for every incorrect option you select. You will not receive any points if you don’t attempt the question.

Analyze Your Malware Samples

You will investigate and label some of the more sophisticated malware behaviors from the five malware samples we provided. Use the included JoeSandbox reports to identify each malware’s behavior. Note that malware samples can share behaviors, so you should initially assume that each malware sample asked about below exhibits every behavior listed. It’s your job to determine whether that assumption actually holds.

Hint: Look at the API/system call sequence under each process generated by the malware sample and determine what the malware is doing. Note that each JoeSandbox report may contain multiple processes with many different system call sequences. If any of the behaviors is seen (or attempted, though not necessarily successfully) in any process in the report, then that malware has attempted that behavior. This is, of course, not completely practical, as legitimate applications may perform the same actions in a benign fashion. We are not concerned with differentiating the two in this assignment, but it is some food for thought.
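To make the hint concrete, here is a toy sketch of flagging one behavior (autostart persistence) from an API-call trace. The registry function names mirror real Windows APIs, but the trace itself and the matching rule are invented for illustration; real JoeSandbox reports are far richer:

```python
# Toy behavior matcher: does a trace attempt autostart persistence?
# The trace below is fabricated; only the API names follow Windows conventions.
trace = [
    ("RegOpenKeyExW",  r"HKCU\Software\Microsoft\Windows\CurrentVersion\Run"),
    ("RegSetValueExW", r"HKCU\Software\Microsoft\Windows\CurrentVersion\Run\Updater"),
    ("CreateFileW",    r"C:\Users\victim\AppData\Roaming\updater.exe"),
]

AUTOSTART_KEY = r"\CurrentVersion\Run"

def flags_autostart(trace):
    # Any RegSetValue* call touching a Run key counts, successful or not.
    return any(api.startswith("RegSetValue") and AUTOSTART_KEY in arg
               for api, arg in trace)

print(flags_autostart(trace))  # True
```

A real report lists these calls per process, so you would scan every process’s call sequence the same way.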

Clarification for attempted: By “attempted” we mean that a specific action was tried but failed. By “specific” we mean that it is clear which action was attempted. For instance, if a registry key is unambiguous (say, it is used only to set a startup option) and the malware fails to change that key, that counts as an attempt for our purposes. But if a more generic registry key governs multiple settings, we can’t know for sure which setting is being attacked, so the action does not count as an “attempt”.

Further, you will notice that the same API functions can end in either a W or an A. This is standard practice in the Windows API, and this document explains the difference (either form could, in theory, appear in the wild): Unicode in the Windows API
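As a minimal sketch of that difference (the path below is hypothetical), the A variants take 8-bit “ANSI” strings while the W variants take 16-bit UTF-16 strings, so the same text occupies twice as many bytes when passed to a W function:

```python
# The same path as an ...A function (8-bit) vs a ...W function (UTF-16LE)
# would receive it. The path is hypothetical, for illustration only.
path = "C:\\Temp\\sample.exe"          # 18 characters

ansi_bytes = path.encode("ascii")      # stand-in for the Windows ANSI codepage
wide_bytes = path.encode("utf-16-le")  # what CreateFileW and friends receive

print(len(ansi_bytes))  # 18
print(len(wide_bytes))  # 36
```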

Questions

For each of the following questions in Canvas, mark which of the malware samples exhibit the identified behavior:

  1. Malware uses a packed executable

    Hint: What are the different types of packers? - Reverse Engineering Stack Exchange

  2. Drops potentially malicious file(s)

    Hint: Look at the “Created/dropped Files” section of the malware reports

  3. Microsoft Office registry key deletion

    Hint: Look for registry key activity in the “System Behavior” section of the malware reports

  4. Microsoft Excel registry key creation

    Hint: Same as (3)

  5. Creates any kind of registry values

    Hint: Same as (3) & (4)

  6. Drops RegAsm virus

  7. Issues signal to cause immediate program termination

    Hint: Linux Signals Help

  8. Malicious file programmed in C or C++

    Hints:

    (A) Look at the “Behavior Graph” of the malware report and explore each of the files executed.

    (B) For this question, make sure to consider only files that are malicious, not system/application files that are used to launch other malicious files

  9. Drops files related to the Mirai botnet

  10. Keylogger attempt

    Hint: Examine the various signature sections of the malware reports

  11. Attempts to copy clipboard

    Hint: Same as (10)

  12. Hooks registry keys/values to protect autostart

    Hint: Check the “Joe Sandbox Signatures” section of the malware reports

  13. Possible PFW / HIPS evasion

    Hint: Same as (12)

  14. Uses the Windows core system file splwow64.exe

  15. Drops a portable executable file into C:\Windows

    Hint: Examine the “Joe Sandbox Signatures” and the “Created/dropped Files” sections of the malware reports

  16. Looks for the device’s GUID

    Hint: Same as (12)

  17. HTTP GET or POST without a user agent

    Hint: Look at the “Network Behavior” section of the malware reports

  18. Uses techniques to detect sandboxes, dynamic analysis tools, or VMs

    Hint: Same as (12)

  19. Attempts to override the Domain Name System (DNS) for a domain on a specific machine

    Hint: Microsoft TCP/IP Host Name Resolution Order

  20. Possible system shutdown

    Hint: Same as (12)

  21. Checks if Antivirus is installed on the system

    Hint: Same as (12)

  22. Communicates with external hosts using a Dynamic DNS service

    Hint: Look up what Dynamic DNS services are and examine the “Domains/IPs” section of the malware reports

  23. Attempts to send mail

    Hint: Same as (17)

  24. Drops encrypted files on to the host

    Hint: Same as (15)

  25. Makes use of Microsoft Office products to infect hosts
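Relatedly (question 1), one heuristic sandboxes commonly apply for packer detection is byte entropy: packed or encrypted sections look close to uniformly random. Here is a minimal sketch; the ~7.0 threshold is an illustrative assumption on our part, not anything JoeSandbox documents:

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 (constant) to 8.0 (uniform)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())

plain = b"This program cannot be run in DOS mode." * 50  # text-like section
packed = bytes(range(256)) * 8                           # stand-in for compressed data

print(round(shannon_entropy(plain), 2))   # well below 8
print(round(shannon_entropy(packed), 2))  # 8.0

# Illustrative rule of thumb: sections above ~7.0 bits/byte are often packed.
```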

Reflection

Now that you have some experience analyzing malware, take a moment for a brief reflection. For this project you used analysis tools that do the analysis for you. In practice, entire teams of people are devoted to working on a single malware executable at a time: debugging it, disassembling it and studying its binary, and applying static analysis, dynamic analysis, and other techniques to thoroughly understand what the malware is doing. Perfecting the skills of malware analysis takes an enormous amount of time, so we don’t require that for this project. However, to give you a sense of the scale of this work, consider that antivirus companies receive somewhere on the order of 250,000 samples of (possible) malware every day. We had you analyze 5 binaries. Imagine the systems needed to handle that amount of malware and study each sample thoroughly within the day, because the next day 250,000 new samples will arrive. If a malware analysis engine cannot analyze a piece of malware within a day, it has already lost to the malware authors. Also consider that not all of the 250,000 samples will be malicious. According to “Prudent Practices for Designing Malware Experiments: Status Quo and Outlook” (Rossow et al., 2012), as many as 3-30% may be benign!

For another perspective on the scale of malware analysis, consider the paper “Needles in a Haystack: Mining Information from Public Dynamic Analysis Sandboxes for Malware Intelligence” (Graziano et al., 2015), in which the authors discovered that notorious malware samples had actually been submitted months, even years, before the malware was detected and classified as malicious in the wild.

Remember, analyzing malware is a delicate and potentially dangerous activity. Please be cautious and use good practices when analyzing malware in the future. If you let malware run for too long, you may be contributing to the problem, and you may be contacted by the FBI (and/or other authorities) as a result of this unintentional malicious activity. At Georgia Tech, researchers, professors, and graduate students are able to analyze malware in controlled environments and have been given permission by the research community to perform these analyses long-term. We make efforts to inform the general research community and Georgia Tech’s OIT Department that we are running malware, so they won’t raise red flags if they detect malicious activity coming from our analysis servers.

For your Curious Mind

There is disagreement in the malware research community as to what exactly constitutes malicious activity. For example, some say that adware is a form of malware, while others do not. Can you think of arguments for either side? Let’s take this kind of thinking one step further. As a thought experiment, ask yourself this: if a piece of software contains malicious code, but that code is never executed when the software runs, is (or should) that software be considered malicious? What if the malware author intentionally included a buffer overflow vulnerability that allows someone to execute that malicious code, so that the only way of knowing the code can be executed is to exploit the malware itself? This seems like a much more advanced form of trigger-based malware, doesn’t it? Think of other tricks malware authors may employ to prevent researchers from discovering a malware’s true intentions.

Be careful if you ever get your hands on malware source code. We always make sure we read and fully understand malware source code before we compile and run it. Remember, safety is the number one priority in malware analysis.

If you’re interested in reading more information about researching malware, we recommend you read “The Art of Computer Virus Research and Defense” by Peter Szor. It’s known in the research community as a must-read for those interested in studying malware.  

References

Rossow, C., Dietrich, C. J., Grier, C., Kreibich, C., Paxson, V., Pohlmann, N., . . . Steen, M. V. (2012). Prudent practices for designing malware experiments: Status quo and outlook. 2012 IEEE Symposium on Security and Privacy. doi:10.1109/sp.2012.14 http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6234405

Graziano, M., Canali, D., Bilge, L., Lanzi, A., & Balzarotti, D. (2015). Needles in a haystack: Mining information from public dynamic analysis sandboxes for malware intelligence. In J. Jung & T. Holz (Eds.), USENIX Security Symposium (pp. 1057-1072). USENIX Association. https://www.usenix.org/system/files/conference/usenixsecurity15/sec15-paper-graziano.pdf