We are rapidly moving into an era when microprocessors (and SoCs) will have tens of processors on a single die. In this "many-core" era, we are less concerned with the architecture of individual processors and more concerned with how they are tied together. In particular, we are concerned with how on-chip memory is organized to optimize use of the limited off-chip bandwidth and how the long off-chip latency is "hidden" from computation. Parallelism can take advantage of the plentiful and inexpensive arithmetic units made possible by modern VLSI technology. However, without locality, bandwidth quickly becomes a bottleneck. Bandwidth, not arithmetic, is the critical resource in a modern computing system. Stream programming simplifies the exploitation of both parallelism and locality. A stream program naturally exposes parallelism across stream elements and kernels. Locality is also exposed, both within and between kernels. This talk will discuss exploitation of parallelism and locality with examples drawn from the Imagine and Merrimac projects and from three generations of stream programming systems.
Download the slides for this presentation in PDF format.
About the speaker:
Bill Dally is the Willard R. and Inez Kerr Bell Professor of Engineering and the Chairman of the Department of Computer Science at Stanford University. Bill and his group have developed system architecture, network architecture, signaling, routing, and synchronization technology that can be found in most large parallel computers today. While at Bell Telephone Laboratories, Bill contributed to the design of the BELLMAC32 microprocessor and designed the MARS hardware accelerator. At Caltech he designed the MOSSIM Simulation Engine and the Torus Routing Chip, which pioneered wormhole routing and virtual-channel flow control. While a Professor of Electrical Engineering and Computer Science at the Massachusetts Institute of Technology, his group built the J-Machine and the M-Machine, experimental parallel computer systems that pioneered the separation of mechanisms from programming models and demonstrated very low overhead synchronization and communication mechanisms. At Stanford University his group has developed the Imagine processor, which introduced the concepts of stream processing and partitioned register organizations. Bill has worked with Cray Research and Intel to incorporate many of these innovations in commercial parallel computers, and with Avici Systems to incorporate this technology into Internet routers. He co-founded Velio Communications to commercialize high-speed signaling technology and co-founded Stream Processors, Inc. to commercialize stream processor technology. He is a Fellow of the IEEE and a Fellow of the ACM, and has received numerous honors including the IEEE Seymour Cray Award and the ACM Maurice Wilkes Award. He is chairman of Stream Processors and on the board of directors of Portal Player. He currently leads projects on high-speed signaling, computer architecture, network architecture, and programming systems. He has published over 170 papers in these areas and is an author of the textbooks Digital Systems Engineering and Principles and Practices of Interconnection Networks.