2009 Par Lab Boot Camp - Short Course on Parallel Programming
Slides from lectures and a homework solution are posted at the bottom of the page.
Wednesday, August 19
Introduction to Parallel Architectures (John Kubiatowicz (UCB) | video part 1 | video part 2)
(9:30am – 12pm, with a break)
Why parallelism is our future, and what programmers need to know about the hardware in order to write efficient programs.
Lunch - 12-1:15pm
Shared Memory Programming with Pthreads, OpenMP and TBB (Katherine Yelick (UCB & LBNL) on Pthreads | video, Tim Mattson (Intel) on OpenMP | video, Michael Wrinn (Intel) on TBB | video, Heidi Pan (UCB) on Lithe | video)
(1:15 – 4:45pm, 1 hour per lecture, with a 30-minute break)
This sequence of three lectures will discuss parallel programming on multicore platforms in detail, based on three of the leading programming tools.
Hands-on activity - 5 - 6pm
Reception - 6:15 - 7:15pm
Thursday, August 20
Sources of parallelism and locality in simulation (James Demmel (UCB) | video)
(8:45 - 9:45am)
We show how to recognize recurring opportunities to exploit parallelism in simulating real or artificial “worlds”, as well as opportunities to minimize data movement.
Architecting Parallel Software Using Design Patterns (Kurt Keutzer (UCB) | video)
(9:45 - 10:45am)
We give an overview of design patterns and how complex parallel software systems can be architected with them.
Data-Parallel Programming on Manycore Graphics Processors (Bryan Catanzaro (UCB) | video)
(11:15am - 12:15pm)
GPUs (Graphics Processing Units) have evolved into programmable manycore parallel processors. We will discuss the CUDA programming model, GPU architecture, and how to write high performance code on GPUs, illustrating with case studies from computer vision applications.
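To make the CUDA model above concrete, here is a sketch (our own illustration, not from the lecture) of the classic SAXPY kernel, y = a*x + y, in CUDA's one-thread-per-element style. The block size of 256 is just a common choice, and error checking and host-side data transfer are elided for brevity.

```cuda
#include <cuda_runtime.h>

/* Each GPU thread handles exactly one array element (SPMD data parallelism). */
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)                          /* guard: the grid may overshoot n */
        y[i] = a * x[i] + y[i];
}

int main(void) {
    const int n = 1 << 20;
    float *x, *y;
    cudaMalloc(&x, n * sizeof(float));
    cudaMalloc(&y, n * sizeof(float));
    /* ... initialize x and y on the host and copy over with cudaMemcpy ... */

    int threads = 256;                        /* threads per block */
    int blocks = (n + threads - 1) / threads; /* enough blocks to cover n */
    saxpy<<<blocks, threads>>>(n, 2.0f, x, y);
    cudaDeviceSynchronize();

    cudaFree(x);
    cudaFree(y);
    return 0;
}
```

The lecture goes well beyond this, covering memory coalescing, shared memory, and the tuning needed for the computer-vision case studies.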
Lunch - 12:15 - 1:30pm
OpenCL (Tim Mattson (Intel) | video)
(1:30 – 2:30pm)
OpenCL is a recently released framework for writing portable programs that run on heterogeneous platforms consisting of CPUs, GPUs, and other processors. It provides mechanisms for both task parallelism and data parallelism. OpenCL was designed in a collaboration including Apple, AMD, Intel, and Nvidia.
Hands-on activity - 3 - 6pm
Friday, August 21
Computational Patterns of Parallel Programming (James Demmel (UCB) | video)
(8:45 - 10:45am)
We present in detail recurring computational patterns (e.g., dense linear algebra, FFTs) whose use also enables efficient and productive programming.
Building Parallel Applications (Ras Bodik (UCB) | video, Tony Keaveny (UCB) | video, Nelson Morgan (UCB) | video, David Wessel (UCB) | video)
(11:15am – 12:15pm)
We illustrate many of the programming tools discussed above in building a variety of exciting applications including computer music, medical image processing, speech recognition, and faster browsers.
Lunch - 12:15 - 1:30pm
Distributed Memory Programming in MPI and UPC (Katherine Yelick (UCB & LBNL) | video)
(1:30 – 2:30pm)
The largest and highest-performance computers have distributed memory rather than shared memory, and are programmed using message passing (MPI) or newer languages such as UPC.
Performance Analysis Tools (Karl Fuerlinger (UCB) | video)
(2:30 – 3:30pm)
When a parallel program runs slower than expected, “performance debugging” may be done most effectively using a variety of tools that automatically instrument and display performance data.
Cloud Computing (Matei Zaharia (UCB) | video)
(4 - 5pm)
Cloud computing allows users to easily exploit large commercial compute clusters available at many companies. We discuss programming tools (e.g., Hadoop and MapReduce) that make them easy to use.
Lab Sessions for Attendees:
You can find the lab assignment by going to http://www.cs.berkeley.edu/~volkov/cs267.sp09/ and clicking on Assignment 2 at the bottom of the page. You will be asked to do the Pthreads and OpenMP versions. (You are also welcome to do the MPI version of assignment 2 and/or assignment 3.)