2010 Par Lab Boot Camp - Short Course on Parallel Programming

Program Details

Homework Assignments
Although accounts on the parallel server for the hands-on activity are
only available to on-site attendees, online attendees are welcome to do
the homework on their own platforms. You can find the lab assignment by going to:


Please Note: Slides from lectures and a homework solution will be posted at the bottom of the page.
This preliminary agenda is subject to change.

Monday, August 16

*9:00 - 9:30- Introduction and Welcome (Dave Patterson, UCB & Jim Demmel, UCB) Slides and Video

Greeting, overview, and logistics.

*9:30 - 12:00- Introduction to Parallel Architectures and Pthreads (John Kubiatowicz, UCB) Slides and Video
Why parallelism is our future, and what programmers need to know about the hardware in order to write efficient programs. We also introduce parallel programming with Pthreads. (Includes a 30-minute break.)

*12:00 - 1:15- Lunch

*1:15 - 2:15- Shared Memory Programming with OpenMP (Barbara Chapman, University of Houston) Slides and Video

We discuss parallel programming on multicore using OpenMP.

*2:15 - 3:00- Shared Memory Programming with TBB (Michael Wrinn, Intel) Slides and Video
We discuss parallel programming on multicore using TBB.

*3:00 - 3:30- Break

*3:30 - 4:30- Parallel Advisor (Mark Davis, Intel) Slides and Video
Adding parallelism to programs using Intel's Parallel Advisor: No Parallelism Experience Required

*4:30 - 5:00- Break/ Transition to HP Auditorium, 3rd Fl. Soda Hall

*5:00 - 6:00- Hands-on Activity

*6:00 Reception in the Wozniak Lounge, Soda Hall

Tuesday, August 17

*8:45 - 9:45- Sources of parallelism in simulation (Jim Demmel, UCB) Slides and Video
We show how to recognize recurring opportunities to exploit parallelism in simulating real or artificial "worlds", as well as opportunities to minimize data movement.

*9:45 - 10:45- Distributed memory programming in MPI and UPC (Kathy Yelick, UCB) Slides and Video
The largest and highest-performance computers have distributed memory instead of shared memory, and are programmed using message passing (MPI) or new languages like UPC.

*10:45 - 11:15- Break

*11:15 - 12:15- Debugging parallel code (Jacob Burnim, UCB) Slides and Video

We survey recent results and useful tools for debugging parallel programs.

*12:15 - 1:30- Lunch

*1:30 - 2:30- Architecting parallel software with design patterns (Kurt Keutzer, UCB) Slides and Video
We give an overview of design patterns and how complex parallel software systems can be architected with them.

*2:30 - 3:00- Break/ Transition to the Wozniak Lounge, 4th Fl. Soda Hall

*3:00 - 6:00- Hands-on Activity

Wednesday, August 18

*8:45 - 10:45- Autotuning of Common Computational Patterns (Jim Demmel, UCB) Slides and Video
We discuss several recurring computational patterns (e.g., linear algebra and stencils) whose fastest implementations are written automatically by other programs called autotuners.

*10:45 - 11:15- Break

*11:15 - 12:15- Applications (Nelson Morgan, David Wessel, Tony Keaveny, Leo Meyerovich, UCB) Slides (Tony Keaveny) and Videos (Tony Keaveny, David Wessel, Nelson Morgan)
We illustrate many of the programming tools discussed above by building a variety of exciting applications, including computer music, medical image processing, speech recognition, and faster browsers.

*12:15 - 1:30- Lunch

*1:30 - 2:30- Cloud computing (Matei Zaharia) Slides and Video
Cloud computing allows users to easily exploit large commercial compute clusters available at many companies. We discuss programming tools (e.g., Hadoop, MapReduce) that make them easy to use.

*2:30 - 3:30- Performance Tools (Karl Fuerlinger, UCB) Slides and Video
When a parallel program runs slower than expected, "performance debugging" may be done most effectively using a variety of tools that automatically instrument and display performance data.

*3:30 - 4:00- Break

*4:00 - 5:00- GPU, CUDA, OpenCL programming (Mark Murphy, UCB) Slides, more slides and a Video
GPUs (Graphics Processing Units) have evolved into programmable manycore parallel processors. We will discuss the CUDA programming model, GPU architecture, and how to write high-performance code on GPUs, illustrating with case studies from computer vision and medical imaging. We also discuss OpenCL, a recently released framework for portable parallel programming on heterogeneous platforms.