Recent technology scaling has made it clear that communication, not computation, dominates energy costs. Coupled with steadily increasing parallelism and the fact that power consumption is typically the primary design constraint, this makes it increasingly difficult to provide enough communication bandwidth to keep processors busy. Power is a critical challenge for HPC, datacenters, and consumer electronics. In HPC, a 1000x improvement in performance is needed with only a 10x increase in power by 2018. Moreover, datacenters require $7B just for cooling in the USA, a figure projected to increase by 4x in the near future. Finally, consumer electronics require a 2x increase in performance with no increase in power every two years to remain competitive. In this talk, I will present my recent work on efficient data movement on and off chip, as well as efficient DRAM access. I will focus on collective memory transfers, which maximize DRAM performance and minimize power by guaranteeing in-order access patterns from a collection of processors to the memory. I will also present the channel reservation protocol, which eliminates congestion caused by adversarial or unbalanced traffic in system-wide networks, thereby increasing throughput and reducing latency for benign traffic and improving the utilization of costly network bandwidth. I will conclude with an overview of related projects and ideas for future work.
George Michelogiannakis is currently a postdoctoral research fellow at the Lawrence Berkeley National Laboratory. He is part of the computer architecture laboratory, which examines key computer architecture research challenges both on and off chip. He completed his PhD at Stanford University in 2012 with Prof. William J. Dally. His past work focuses on on-chip networks, with numerous contributions to flow control, congestion, allocation, and co-design with chip multiprocessors. His other work includes congestion control for system-wide networks, precision loss avoidance for system-wide reduction operations, and maximizing DRAM efficiency by taking advantage of advanced language constructs. He received the Stanford Graduate Fellowship and numerous other awards during his studies.