Towards Provable Security of Real-world Servers: Where Online Learning Meets Server Retrofitting


Sponsoring Agency
National Science Foundation


Servers in enterprise environments, such as private data centers, and in public cloud data centers play a critical role in human society. However, real-world servers are plagued by various security vulnerabilities. Among the known vulnerabilities, memory overwrite and memory over-read vulnerabilities are two of the most dangerous categories. They are the root causes of a variety of serious real-world server attacks, including code injection attacks, return-to-libc attacks, return-oriented programming (ROP) attacks, privilege escalation attacks, data structure manipulation attacks, attacks exploiting the Heartbleed vulnerability, worm attacks, and ransomware attacks. Cyber-defenses are widely deployed to protect real-world servers from these attacks. However, it is widely recognized in the system security community that there is no silver bullet. Further, a fundamental limitation is that none of the widely deployed real-world defenses provides a provable guarantee.

To bridge this gap, this project will develop a first-of-its-kind co-design framework with three intertwined components: newly synthesized mathematical models, online learning-based defense algorithms, and server retrofitting. The mathematical models are high-fidelity yet analytically tractable, so that online learning can provide provable guarantees. The discrepancies between the mathematical models and real-world servers are bridged by server retrofitting. In each proposed mathematical model, a utility function can be easily evaluated by deployed preliminary defenses and provides the feedback needed to perform online learning; it also properly reflects the cost-effectiveness of defenses. Online learning algorithms are developed to tackle challenges unique to computer security, e.g., detection delays, detection inaccuracies, strategic attacks, unknown system states, and unknown exploit likelihoods. The most suitable server retrofitting is customized to meet the assumptions of the mathematical models. Finally, the three intertwined components are integrated into real defenses.
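
As a rough illustration of how a utility function can drive online learning over defense choices, the sketch below uses EXP3, a standard adversarial bandit algorithm. It is not necessarily the algorithm this project develops, and it ignores detection delays, detection inaccuracies, and unknown system states. The function name, the notion of numbered "defense configurations", and the example utility are all hypothetical; the only assumption about the utility is that it returns a value in [0, 1].

```python
import math
import random


def exp3_select_defenses(utility, num_configs, rounds, gamma=0.1, seed=0):
    """Run EXP3 over hypothetical defense configurations.

    utility(t, config) is an assumed callback returning a value in [0, 1]
    for the configuration deployed in round t. Returns how many times each
    configuration was deployed.
    """
    rng = random.Random(seed)
    weights = [1.0] * num_configs
    counts = [0] * num_configs
    for t in range(rounds):
        total = sum(weights)
        # Mix the weight-proportional distribution with uniform exploration.
        probs = [(1 - gamma) * w / total + gamma / num_configs for w in weights]
        config = rng.choices(range(num_configs), weights=probs)[0]
        reward = utility(t, config)
        # Importance-weighted estimate: only the deployed configuration is observed.
        estimate = reward / probs[config]
        weights[config] *= math.exp(gamma * estimate / num_configs)
        # Rescale so the weights stay bounded over long runs.
        largest = max(weights)
        weights = [w / largest for w in weights]
        counts[config] += 1
    return counts


def example_utility(t, config):
    # Hypothetical feedback: configuration 2 is the most cost-effective.
    return 0.9 if config == 2 else 0.1


counts = exp3_select_defenses(example_utility, num_configs=3, rounds=2000)
```

In this toy setting, the learner should deploy configuration 2 in most rounds while still occasionally exploring the others; a strategic attacker would correspond to a utility that changes over time.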

The developed defenses will present adversaries with optimized, dynamically changing attack surfaces, thereby significantly increasing the uncertainty and complexity adversaries must overcome to succeed. They will significantly improve the adaptive and autonomous defense capabilities of real-world servers against zero-day attacks during vulnerability windows. The proposed research is interdisciplinary and integrates technical tools from machine learning, game theory, control theory, and cybersecurity. This will lead to educational and training opportunities that cross traditional disciplinary boundaries for high school, undergraduate, and graduate students in STEM. Through problem-based learning, a new graduate special-topics course will be developed, and a new module will be introduced into an undergraduate course on network security. Hackathon events will be held to inspire students' engagement in research on machine learning and cybersecurity. All research results will be made available to industrial stakeholders, federal and defense departments, and the research community.

Research Area