Online services play a major role in our everyday lives for communication, entertainment, socializing, e-commerce, etc. These services run inside datacenters under strict tail-latency service level objectives. In this talk, I will discuss the different types of queueing inside a modern datacenter and focus on how to deal with queueing on a single server by designing domain-specific operating systems for latency-critical datacenter applications. I will explain the operational domains, the trade-offs involved, and how queueing theory drives such efforts.
I will start with ZygOS (SOSP 2017), which was the first operating system to specifically optimise for microsecond-scale tail latency. I will then talk about Persephone (SOSP 2021), which targets applications with extreme service time variability by leveraging application-level knowledge. Finally, I will introduce our work on Concord (SOSP 2023), which relies on a scheduling mechanism that co-designs the runtime system with a compiler pass to eliminate the need for expensive hardware interrupts, while still serving applications with high service time variability.
Marios Kogias is an Assistant Professor at Imperial College London. His research focuses on operating systems and networking for the modern datacenter. His PhD thesis received the Dennis M. Ritchie Doctoral Dissertation Award, an honourable mention for the Roger Needham PhD Award, and the ABB Dissertation Award. He was an IBM PhD Fellow and won the best student paper award at EuroSys 2020. He has also held positions at CERN, Google, and Microsoft Research.