The 2013 Par Lab Boot Camp – Short Course on Parallel Programming offers programmers a practical introduction to parallel programming techniques and tools on current parallel computers, with an emphasis on multicore and manycore machines.
Abstract: To help parallel computing become mainstream, one of the main design considerations for multicore architectures should be support for programming productivity. This means designs that can deliver high performance and efficiency while relieving programmers and compiler writers from managing low-level tasks, and designs that help minimize the chance of parallel programming errors.
In this talk, I will present an overview of the ideas behind the Scalable and Flexible Bulk Architecture. The architecture introduces a set of novel techniques that improve programmability while retaining scalability and flexibility. In particular, I will present Volition, the first hardware scheme that detects Sequential Consistency Violations (SCVs) in a relaxed-consistency machine precisely, in a scalable manner, and for an arbitrary number of processors in the dependence cycle. Volition enhances programmability while incurring negligible traffic and execution overhead. I will also present scalable cache coherence protocols for atomic block execution.
Bio: Xuehai Qian is a Ph.D. candidate in the Department of Computer Science at the University of Illinois at Urbana-Champaign. His research focuses on multicore and parallel computer architecture and on programming models for parallelism. He received an MS in Computer Science from the Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS), and a BS in Computer Engineering from Beihang University, Beijing.
TALK: Prof Tor Aamodt, University of British Columbia - Mon, May 13 at 2pm in Woz (430 Soda)
Speaker: Tor Aamodt, University of British Columbia
Title: Efficient and Easily Programmable Accelerator Architectures
Abstract: Current projections suggest semiconductor scaling may end near the 7nm process node within 10 years. Energy efficiency is already a primary design goal due to the end of voltage scaling. Programmable accelerators such as graphics processing units (GPUs) can potentially enable further reductions in the cost of computation along with further increases in computing efficiency. However, GPUs are typically perceived as suitable only for a narrow range of applications such as high performance computing. This talk will describe recent research on hardware changes to broaden the range of applications that benefit from GPU-like accelerators. Approaches discussed will include introducing transactional memory and coherence into GPUs as well as improving cache utilization via hardware thread scheduling.
Bio: Tor Aamodt is an Associate Professor in the Department of Electrical and Computer Engineering at the University of British Columbia. Two of his papers on (GPU-like) accelerators have been selected as "Top Picks" from computer architecture conferences by IEEE Micro magazine. He is an Associate Editor for IEEE Computer Architecture Letters, Program Chair for ISPASS 2013, and has served on the program committees of several computer architecture conferences. He received his BASc (in Engineering Science), MASc, and PhD from the University of Toronto. He worked at NVIDIA on the memory system of the first GPU supporting CUDA (G80).