Detecting Non-Adversarial Backdoors
Backdoors bypass regular authentication and pose a serious security threat: 185 CVEs are related to backdoors. Developers sometimes intentionally create non-adversarial backdoors (NABs) to facilitate development or maintenance tasks, but forget to remove them, so they slip into deployed systems. Attackers then exploit these NABs to gain unauthorised access.
Because NAB authors are not trying to hide them, NABs usually have characteristics that we can leverage to build effective detectors. For example, an NAB often listens for socket connections and uses the fork-exec model. An NAB may also leave footprints in the natural-language channel of code, such as variable names and comments. This project aims to utilise the information in these channels to automatically detect NABs.
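As a minimal sketch of the idea, the scorer below greps a source string for two of the signals mentioned above: fork-exec/socket API calls and suspicious identifiers. Both signal lists are illustrative assumptions, not a validated feature set, and a real detector would of course combine many more features.

```python
import re

# Heuristic signals an intentionally planted, non-hidden backdoor often leaves
# behind. Both patterns are illustrative assumptions.
API_SIGNALS = re.compile(r"\b(socket|bind|listen|accept|fork|exec[lv]p?e?)\s*\(")
NAME_SIGNALS = re.compile(r"backdoor|debug[_ ]?(login|shell|pass)|master[_ ]?pass",
                          re.IGNORECASE)

def nab_score(source: str) -> int:
    """Count NAB-indicative hits in a source string (higher = more suspicious)."""
    return len(API_SIGNALS.findall(source)) + len(NAME_SIGNALS.findall(source))

snippet = '''
// left in for testing -- remove before release!
if (strcmp(pass, "debug_pass") == 0) {
    int s = socket(AF_INET, SOCK_STREAM, 0);
    listen(s, 1);
    if (fork() == 0) execlp("/bin/sh", "sh", NULL);
}
'''
```

On this hypothetical snippet, `nab_score` counts four API hits (socket, listen, fork, execlp) and one natural-language hit (`debug_pass`); an empirical study would be needed to pick a threshold that separates NABs from benign code.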
Bimodal Static Taint Analysis
Taint analysis is a common technique for tracking unverified, external data through a program during execution. Conventional taint analysis, however, may neglect information that originates within a program and is accidentally leaked. Sensitive variables that should not be revealed often have a bimodal footprint: a variable's natural-language name reveals its nature, so a variable called "secret_value" should be tracked internally and automatically. This project proposes using variable names themselves to seed taint tracking into other variables, allowing us to statically detect accidental exfiltration.
Secure Multi-Execution Fuzzing
Secure multi-execution (SME) executes a program multiple times, once for each security level, and carefully compares the outputs across runs. Because this is done at runtime, every execution enforces secure information flow. This project proposes using a fuzzer to simulate the runtime environment in order to estimate the probability that a program is noninterferent. This would help a developer gauge how secure or insecure their code is, and whether the more costly requirements of runtime SME, or some other technique, are needed.
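The estimation step might look like the sketch below, under strong simplifying assumptions: the program is a pure function of one low (public) and one high (secret) input, and we fuzz by holding the low input fixed while varying the high input across paired runs. The `leaky` example function is hypothetical.

```python
import random

def leaky(low: int, high: int) -> int:
    # Hypothetical program: the public output accidentally depends on the
    # secret whenever low is odd.
    return low * 2 + (high % 2 if low % 2 else 0)

def noninterference_probability(prog, trials: int = 1000, seed: int = 0) -> float:
    """Fraction of fuzzed low inputs whose public output is identical
    across two runs that differ only in the secret (high) input."""
    rng = random.Random(seed)
    ok = 0
    for _ in range(trials):
        low = rng.randrange(100)
        h1, h2 = rng.randrange(100), rng.randrange(100)
        if prog(low, h1) == prog(low, h2):
            ok += 1
    return ok / trials
```

A fully noninterferent program scores 1.0, while `leaky` scores roughly 0.75 (it leaks for odd `low`, and even then only when the two secrets differ in parity), which is the kind of graded answer the proposed tool would report.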
Do comments help hackers?
Do comments in code help hackers craft custom attacks? We propose a study that examines code comments, looking for TODOs and admissions of known bugs that could help an attacker do nefarious things.
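A first pass over a corpus could be as simple as the comment scanner below. The marker list is an assumption to be refined during the study, and the comment-matching regex is a rough heuristic (it would, for instance, also match C preprocessor lines starting with `#`).

```python
import re

# Markers of self-admitted technical debt that might guide an attacker;
# this list is an illustrative assumption.
MARKERS = re.compile(r"\b(TODO|FIXME|HACK|XXX)\b|known bug", re.IGNORECASE)

def risky_comments(source: str):
    """Return (line_number, comment) pairs for //, #, or /* comments
    that contain a risk marker."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        m = re.search(r"(//|#|/\*).*", line)
        if m and MARKERS.search(m.group(0)):
            hits.append((lineno, m.group(0).strip()))
    return hits

code = """\
int checkout(cart *c) {
    // TODO: validate the coupon server-side (known bug: client trusts it)
    apply_coupon(c);   /* HACK: skips auth for price override */
    return pay(c);
}
"""
```

Run over this hypothetical snippet, the scanner flags lines 2 and 3; the study would then ask whether such flagged comments correlate with exploitable behaviour.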
Type Reconstruction for Binaries
Reverse engineering translates a binary into human-readable code. It is key to understanding malware, but it is hard: compilation is lossy. A subproblem in reverse engineering is type reconstruction, the problem of determining the type of a location during execution. This project will use machine learning to tackle the type reconstruction problem. It will modify a compiler to tag locations with names and types, and use the name-tagged and type-tagged execution traces as the signal for training a neural network.
RefiNym learns type refinements for C#. RN++ would do the same for C/C++ or for TypeScript.
Test Case Intent: When a developer adds a test case to a test suite, they do so for a reason. Which code is a test case (an input/oracle pair) intended to test? And is that test case positive or negative: that is, is it intended to validate that a correct behaviour exists or that an incorrect behaviour does not? This project seeks to recover that intent. One goal would be to build a tool that automatically classifies tests as positive or negative.
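To make the positive/negative distinction concrete, here is a pair of tests for a hypothetical `parse_date` function, written in plain Python rather than any particular framework. A classifier might start from surface idioms like the expected-exception pattern in the second test (`pytest.raises`, `assertRaises`, and the like).

```python
import datetime

def parse_date(s: str) -> datetime.date:
    """Hypothetical unit under test."""
    return datetime.datetime.strptime(s, "%Y-%m-%d").date()

# Positive test: validates that a correct behaviour exists.
def test_parses_iso_date():
    assert parse_date("2024-02-29") == datetime.date(2024, 2, 29)

# Negative test: validates that an incorrect behaviour does not exist --
# an invalid date must be rejected, not silently parsed.
def test_rejects_invalid_date():
    try:
        parse_date("2023-02-29")   # 2023 is not a leap year
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
```

Recovering *which* code each test targets is the harder half of the project; the classification into positive and negative is the more tractable first goal.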
Finding the Core
This project aims to build a tool that separates a program's source code into its core logic and the ancillary logic that handles corner cases and errors. The tool would be useful in code review, in focusing static analysis, and as a new kind of test coverage measure: core logic covered. The first step would be an empirical study to learn and harvest features that separate the two kinds of code: is there enough signal for humans to tell them apart? The next problem will be establishing ground truth. The final task would be to build a classifier, possibly using deep learning. This is an ambitious project; a distinction thesis would only need to make progress on it.
Extending Ariadne
Ariadne is a tool that automatically finds inputs to a program that trigger floating-point exceptions. To this end, it first replaces floating-point types with arbitrary-precision rationals, another encoding of the reals. It then injects checks for potential floating-point exceptions, symbolically executes the instrumented program, and uses the underlying SMT solver to find inputs that fail the injected checks.
This project aims to extend Ariadne, using a similar idea, to detect general numeric errors in important applications such as physics simulators and game engines.
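The replace-and-check recipe can be illustrated with Python's `Fraction` standing in for Ariadne's arbitrary-precision rationals, and a brute-force search standing in for the SMT query; the function `f` is a hypothetical example, not from Ariadne itself.

```python
from fractions import Fraction

class FPException(Exception):
    """Signals that the original floating-point program would trap."""

def checked_div(a: Fraction, b: Fraction) -> Fraction:
    # Injected check: over the rationals a zero divisor is exact,
    # with no rounding to mask it.
    if b == 0:
        raise FPException("division by zero")
    return a / b

def f(x: Fraction) -> Fraction:
    # Rational re-encoding of: double f(double x) { return 1.0 / (x - 2.0); }
    return checked_div(Fraction(1), x - Fraction(2))

def find_failing_input(candidates):
    """Stand-in for the SMT query: search for an input that fails a check."""
    for x in candidates:
        try:
            f(x)
        except FPException:
            return x
    return None
```

Here the search recovers `x = 2` as an exception-triggering input; the real tool replaces the enumeration with a symbolic query, which scales to inputs no enumeration would find.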
Write a tool that overlays a GUI. For a web app, the tool could be a browser extension. The tool allows an end user to capture the steps, and their locations in the GUI, that trigger a bug. This tackles the problem of reproducing bugs found in the field.
Test Cases as Coin Tosses: Infer an input probability distribution from a test suite. Does a test suite merely define a finite support?
Recursive Git Blame: git blame stops at the last write to a line. This project would implement recursive git blame, which would use a similarity measure to chase writes back through history until no line in the older revision falls within a specified similarity. One challenge will be what to do when more than one line exceeds the similarity threshold.
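The similarity step might be sketched with `difflib` as below; actually walking the commit graph via git is omitted here, and the 0.6 threshold is an arbitrary assumption to be tuned.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a, b).ratio()

def best_predecessor(line: str, older_lines, threshold: float = 0.6):
    """Among the lines of an older revision, pick the most similar line to
    keep chasing, or None if nothing clears the threshold. Several lines
    above the threshold -- a tie -- is the open challenge noted above."""
    scored = [(similarity(line, old), old) for old in older_lines]
    scored.sort(reverse=True)
    return scored[0][1] if scored and scored[0][0] >= threshold else None

old = ["def total(items):", "    return sum(i.price for i in items)"]
new_line = "    return sum(item.price for item in items)"
```

With these example revisions, the renamed loop variable still scores well above the threshold, so the chase continues to the older line; raising the threshold toward 1.0 makes the tool degenerate into ordinary git blame.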
Functionalizer: takes a code snippet and turns it into a function.
Idempofier: Extract functions into a test harness that makes their execution idempotent, like system calls, then loop them to learn a function summary. The use case is concretisation in symbolic execution.
Build a test harness to discover, with high probability, loop-carried dependencies.
Build an automocker, a tool that takes a function signature and produces a mock object for it.
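A starting point could build on Python's introspection and mocking machinery, as sketched below. The dummy-return-value heuristic keyed on the return annotation is an assumption, and `fetch_user` is a hypothetical target.

```python
import inspect
from unittest.mock import create_autospec

def fetch_user(user_id: int, *, verbose: bool = False) -> dict:
    """Hypothetical target; a real automocker would ingest arbitrary signatures."""
    raise NotImplementedError("talks to a database")

def automock(func):
    """Build a signature-checked mock; the default return value is a dummy
    chosen from the return annotation (an illustrative heuristic)."""
    defaults = {int: 0, str: "", bool: False, dict: {}, list: []}
    ret = inspect.signature(func).return_annotation
    mock = create_autospec(func)
    mock.return_value = defaults.get(ret, None)
    return mock

m = automock(fetch_user)
```

Because `create_autospec` preserves the signature, `m(3)` succeeds and returns the dummy `{}`, while a call with missing arguments raises `TypeError`; for statically typed languages, the analogous tool would generate a stub class from the interface instead.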
GitHub Copilot vs Stack Overflow: Who has the better code snippets?
GitHub Copilot, the artificial intelligence tool developed by GitHub and OpenAI to assist users by autocompleting code, has the potential to revolutionise how we program, and, in particular, how we obtain code snippets for our programming tasks. Before Copilot, the question-and-answer website Stack Overflow was often the source for such snippets. But which source provides the "better" code? The vast majority of the code produced by Copilot is uniquely generated and has never been seen before, whereas snippets on Stack Overflow have been curated by a community of millions of users. Starting from a dataset of queries to the Stack Overflow search engine, the goal of this project is to compare the code that we would find on Stack Overflow and the code that Copilot would generate for us. Christoph Treude of the University of Melbourne will help me supervise this project.
Study the LibreOffice takeover and the SSL takeover. https://people.gnome.org/~michael/blog/2015-08-05-under-the-hood-5-0.html
Identify a program's core logic: See above.
Life cycle/expectancy of code snippets: Vary the snippet (aka organism) size from sub-line to whole file.
GitHub project viability/predicting popularity: What is project popularity?
Can we use information theory to separate random and benchmark SAT formulae?