This talk introduces NetCheck, a tool for diagnosing network problems in large and complex applications. NetCheck uses traces from existing black-box tracing mechanisms, such as strace, to diagnose network problems in real-world applications. NetCheck can diagnose faults without any specific information about the underlying network or application. It does this by (1) totally ordering the distributed set of input traces, and (2) using a network model to identify points in the totally ordered execution where the traces deviate from the behavior a programmer is likely to expect. The key insight in this work is to perform network problem diagnosis by understanding how the programmer expects the network to operate and looking for differences in the observed behavior. Our evaluation demonstrates that NetCheck is able to accurately diagnose failures without relying on any application- or network-specific information. For instance, NetCheck correctly identified the existence of NAT devices, simultaneous network disconnection/reconnection, and platform portability issues. In a more targeted evaluation, we found that NetCheck correctly detects over 95% of the network problems reported in popular projects like Python, Apache, and Ruby. When applied to traces of faults observed by a network administrator in a live network, NetCheck identified the primary cause of the fault in 90% of the cases. NetCheck performs diagnosis efficiently and can process a 1 GB trace in about 2 minutes.
I will also give an overview of the Computer Science and Engineering department at NYU and discuss opportunities for PhD students and interns, as well as full-time developer positions, in New York City.
Justin Cappos is an assistant professor at NYU in the Polytechnic School of Engineering. Justin's research interests fall broadly in the area of systems security. He focuses on understanding high-impact, large-scale problems by building and measuring deployed systems. Prof. Cappos's dissertation work described flaws in prior Linux package managers and involved building and deploying a new security model.
His work on software update system security was deployed by the major Linux package managers (e.g., apt, yum, pacman, and YaST) and thus protects most Linux servers. Justin also created the Seattle testbed, a networking testbed with tens of thousands of installs and thousands of users. Due to the practical impact of his research, Prof. Cappos was named in 2013 as one of Popular Science's Brilliant 10 scientists under 40.