Projects

Recent years have seen concentrated efforts to use malware behaviors to identify previously unknown malware and, sometimes, to describe known malware. Several challenges remain, however. Researchers disagree on what exactly constitutes a malware behavior. How should this seemingly infinite number of “behaviors” be studied? How do we benefit from such a study? The thesis attempts to answer some of these questions.

AV companies receive thousands of suspicious Android apps every day, and they cannot have a human analyst go through each one. Most of the time, an app hiding behind a curtain of obfuscation turns out to be the same as, or a slight variation of, previously known malware. This wastes a large amount of time and resources. With DroidLegacy, they can weed out apps that share high code similarity with previously analyzed malware (read “belong to the same malware family”), leaving just the new and unseen malware samples for further analysis.

The VILO project provides a fast and accurate classifier for binary executables. The primary use for such a tool is familial classification of malware. Say you have a collection of malware that is already partitioned into families. Now, when you get a new, unidentified malware sample, how do you figure out which family it belongs to? You use VILO!

The SysCallic project aims to further the state-of-the-art in dynamic malware analysis through application of concolic execution. Traditionally, dynamic malware analysis has been hindered by the single path problem, i.e., only one of a program’s possible paths is observed when it is executed. We’re overcoming that limitation by applying new techniques that allow the analyst to automatically cause the target program to execute over many or all of its possible paths!

Have you ever wanted to reuse some code you wrote before, but you now only have the binary? Or maybe someone else’s closed-source software does exactly the thing you’re trying to do and you’d rather not have to reinvent the wheel. If those sound like familiar problems, then the In Situ Reuse of Logically Extracted Functional Components (ISROLEFC) project may be just the thing you need to simplify your life!

Virusbattle is a web service that analyzes malware and other binaries with a variety of advanced static and dynamic analyses. It was designed to be scalable and extensible: it can scale up to analyze millions of samples with a few or many different types of analysis, and a new type of analysis can be added dynamically to extend Virusbattle’s capabilities.

FuncTracker is a system for discovering relationships between malware samples based on shared code. These relationships are determined from shared semantic hashes in order to efficiently handle today's very large scale of malware: millions of samples and terabytes of data.
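
As a rough illustration of why hash-based matching scales, the sketch below links samples that share at least one hash by using an inverted index instead of comparing every pair of samples directly. It is not FuncTracker's implementation: the function name shared_hash_links, the sample identifiers, and the assumption that each sample has already been reduced to a set of hash strings are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def shared_hash_links(sample_hashes):
    """Count shared hashes for every pair of samples that share at least one.

    sample_hashes is a hypothetical input: a dict mapping a sample id to the
    set of hash strings already computed for that sample's code.
    """
    # Inverted index: hash -> set of samples that contain it.
    index = defaultdict(set)
    for sample, hashes in sample_hashes.items():
        for h in hashes:
            index[h].add(sample)

    # Only samples that appear under the same hash are ever paired,
    # so unrelated samples are never compared.
    shared = defaultdict(int)
    for samples in index.values():
        for a, b in combinations(sorted(samples), 2):
            shared[(a, b)] += 1
    return dict(shared)


if __name__ == "__main__":
    demo = {
        "sample_a": {"h1", "h2", "h3"},
        "sample_b": {"h2", "h3"},
        "sample_c": {"h9"},
    }
    print(shared_hash_links(demo))
```

A hash that appears in a very large number of samples (common library code, for example) would generate many pairs in this sketch, so a practical system would likely need to filter or down-weight such hashes.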

Malware (worms, trojans, spyware, etc.) is metamorphic if it changes as it propagates. We are seeking to understand the theoretical basis of metamorphic malware, and the possibilities and limitations for catching it.

We have created a simple Perl-based package for computing the Normalized Compression Distance between two arbitrary files. A program is also available to create a CLUTO-compatible similarity matrix from a list of files.
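
For readers unfamiliar with the measure, the Normalized Compression Distance is usually defined as NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(.) is the compressed size in bytes and xy is the concatenation of the two inputs. The sketch below shows the calculation in Python rather than Perl. It is not the package's code: the choice of zlib as the compressor, the function names, and the conversion to similarity as 1 - NCD are assumptions made for illustration, and the CLUTO file layout is not reproduced.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed input (compressor is an assumption)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance between two byte strings."""
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)
    # zlib output is never empty, so the denominator is always positive.
    return (cxy - min(cx, cy)) / max(cx, cy)


def similarity_matrix(blobs):
    """Plain nested-list similarity matrix, using 1 - NCD clamped to [0, 1].

    Real compressors can push NCD slightly outside [0, 1], hence the clamp.
    """
    n = len(blobs)
    return [
        [max(0.0, min(1.0, 1.0 - ncd(blobs[i], blobs[j]))) for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    text = b"the quick brown fox jumps over the lazy dog " * 50
    other = bytes(range(256)) * 10
    print(ncd(text, text), ncd(text, other))
```

To compare files, read each one in binary mode (for example with open(path, "rb").read()) and pass the resulting bytes to ncd. Values near 0 indicate that the inputs compress well together, and values near 1 indicate little shared structure.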

Developed as part of an undergraduate project by Corey Fournier, the Dynamic Unpacker executes packed programs in a virtual environment and writes the in-memory image of the executable to disk after the program has unpacked itself.

DOC is a static analysis suite that detects obfuscations in executables, particularly procedure call and call-return obfuscations. It uses abstract interpretation (AI) to find instances where explicit call or call-return instructions are not used. A prototype is implemented as an Eclipse plugin for browsing x86 executables.

The C-Right project aims to develop tools to help find and evaluate overlaps and similarities in software, to develop quantitative, repeatable, and testable analyses in this area, and to advance techniques for visualizing and documenting the outcomes so that they are readily understood by legal expert and layman alike.

New malware, for the most part, is constructed from old malware. How has malware evolved to date, and what principles guide future evolution and its characteristics? Can we effectively reconstruct malware evolution histories and family trees, and classify and relate different instances of malware? Our malware evolution project examines these questions.