Direct QR factorizations for tall-and-skinny matrices in MapReduce architectures

TitleDirect QR factorizations for tall-and-skinny matrices in MapReduce architectures
Publication TypeConference Paper
Year of Publication2013
AuthorsBenson, A., Gleich D., & Demmel J.
Conference NameIPDPS 2013

The QR factorization and the SVD are
two fundamental matrix decompositions with applications
throughout scientific computing and data
analysis. For matrices with many more rows than
columns, so-called “tall-and-skinny matrices,” there
is a numerically stable, efficient, communicationavoiding
algorithm for computing the QR factorization.
It has been used in traditional high performance
computing and grid computing environments. For
MapReduce environments, existing methods to compute
the QR decomposition use a numerically unstable
approach that relies on indirectly computing the
Q factor. In the best case, these methods require only
two passes over the data. In this paper, we describe
how to compute a stable tall-and-skinny QR factorization
on a MapReduce architecture in only slightly
more than 2 passes over the data. We can compute
the SVD with only a small change and no difference in
performance. We present a performance comparison
between our new direct TSQR method, a standard
unstable implementation for MapReduce (Cholesky
QR), and the classic stable algorithm implemented
for MapReduce (Householder QR). We find that our
new stable method has a large performance advantage
over the Householder QR method. This holds both
in a theoretical performance model as well as in an
actual implementation.