Publications

Demmel, J., Eliahu D., Fox A., Kamil S., Lipshitz B., Schwartz O., et al. (2013).  Communication-Optimal Parallel Recursive Rectangular Matrix Multiplication. 27th IEEE International Parallel & Distributed Processing Symposium 2013.
Demmel, J., Gearhart A., Lipshitz B., & Schwartz O. (2013).  Perfect Strong Scaling Using No Additional Energy. 27th IEEE International Parallel & Distributed Processing Symposium 2013.
Ballard, G., Demmel J., Holtz O., Lipshitz B., & Schwartz O. (2012).  Communication-Optimal Parallel Algorithm for Strassen's Matrix Multiplication. 24th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2012).
Ballard, G., Demmel J., Holtz O., Lipshitz B., & Schwartz O. (2012).  Strong Scaling of Matrix Multiplication Algorithms and Memory-Independent Lower Bounds. 24th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2012).
Maas, M., Reames P., Morlan J., Asanović K., Joseph A., & Kubiatowicz J. (2012).  GPUs: An Opportunity for Offloading Garbage Collection. International Symposium on Memory Management - ISMM'12.
Solomonik, E., & Demmel J. (2012).  Matrix Multiplication on Multidimensional Torus Networks. 10th International Meeting on High-Performance Computing for Computational Science (VECPAR 2012).
Bauer, M., Cook H. M., & Khailany B. (2011).  CudaDMA: Optimizing GPU Memory Bandwidth via Warp Specialization. International Conference on Super Computing, SC'11.
Battenberg, E., Freed A., & Wessel D. (2010).  Advances in the Parallelization of Music and Audio Applications. Proceedings of the International Computer Music Conference (2010).
Chong, J., Friedland G., Janin A., Morgan N., & Oei C. (2010).  Opportunities and Challenges of Parallelizing Speech Recognition. Second USENIX Workshop on Hot Topics in Parallelism (HotPar 2010).
Arnold, G., Holzl J., Koksal A. S., Bodik R., & Sagiv M. (2010).  Specifying and Verifying Sparse Matrix Codes. The 15th Annual ACM SIGPLAN International Conference on Functional Programming (ICFP 2010).
Colmenares, J. A., Bird S., Cook H. M., Pearce P., Zhu D., Shalf J., et al. (2010).  Resource Management in the Tessellation Manycore OS. 2nd USENIX Workshop on Hot Topics in Parallelism (HotPar '10).
Tan, Z., Waterman A., Cook H. M., Bird S., Asanović K., & Patterson D. (2010).  A Case for FAME: FPGA Architecture Model Execution. International Symposium on Computer Architecture (ISCA-2010).
Pan, H., Hindman B., & Asanović K. (2010).  Composing Parallel Software Efficiently with Lithe. Programming Language Design and Implementation (PLDI-2010).
Burnim, J., & Sen K. (2010).  DETERMIN: Inferring Likely Deterministic Specifications of Multithreaded Programs. 32nd International Conference on Software Engineering (ICSE'10).
Stojanovic, V., Joshi P., Batten C., Kwon Y. - J., Beamer S., Chen S., et al. (2010).  Design-Space Exploration for CMOS Photonic Processor Networks. Optical Fiber Communication Conference and Exposition and the National Fiber Optic Engineers Conference (OFC/NFOEC).
Tan, Z., Asanović K., & Patterson D. (2010).  An FPGA-based Simulator for Datacenter Networks. The Exascale Evaluation and Research Techniques Workshop (EXERT 2010), at the 15th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2010).
Barman, S., Bodik R., Chandra S., Galenson J., Kimelman D., Rodarmor C., et al. (2010).  Programming with Angelic Nondeterminism. 37th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages (POPL '10).
Schmeder, A., Freed A., & Wessel D. (2010).  Best Practices for Open Sound Control. Proceedings of the Linux Audio Conference (2010).
Burnim, J., Juvekar S., & Sen K. (2009).  WISE: Automated Test Generation for Worst-Case Complexity. Proc. 31st International Conference on Software Engineering (ICSE'09). 463-473.
[Anonymous] (2009).  SNIFF: A Search Engine for Java using Free-Form Queries. Proc. Fundamental Approaches to Software Engineering (FASE'09), 2009. 385-400.
Catanzaro, B., Kamil S., Lee Y., Asanović K., Demmel J., Keutzer K., et al. (2009).  SEJITS: Getting Productivity and Performance With Selective Embedded JIT Specialization. First Workshop on Programmable Models for Emerging Architecture (PMEA) at the 18th International Conference on Parallel Architectures and Compilation Techniques.
Ballard, G., Demmel J., Holtz O., & Schwartz O. (2009).  Communication Optimal Parallel and Sequential Cholesky Factorization. Symposium on Parallelism in Algorithms and Architectures.
Beamer, S., Stojanovic V., Asanović K., Batten C., & Joshi P. (2009).  Advantages of Silicon Photonics for Multi-socket Systems. 23rd International Conference on Supercomputing (ICS-09).
Joshi, P., Batten C., Kwon Y. - J., Beamer S., Shamim I., Asanović K., et al. (2009).  Silicon-Photonic Clos Networks for Global On-Chip Communication. 3rd ACM/IEEE International Symposium on Networks-on-Chip (NoCS) 2009.
Naik, M., Park C. - S., Sen K., & Gay D. (2009).  Effective Static Deadlock Detection. 31st International Conference on Software Engineering, Vancouver (ICSE 09).
Satish, N., Harris M., & Garland M. (2009).  Designing Efficient Sorting Algorithms for Manycore GPUs. International Parallel and Distributed Processing Symposium.
Yelick, K. A., Gebis J., Oliker L., Shalf J., & Williams S. (2009).  Improving Memory Subsystem Performance Using ViVA. Architecture of Computing Systems - ARCS 2009, 22nd International Conference.
Burnim, J., Jalbert N., Stergiou C., & Sen K. (2009).  Looper: Lightweight Detection of Infinite Loops at Runtime. Proc. 24th IEEE/ACM International Conference on Automated Software Engineering (ASE'09).
Catanzaro, B., Su B. - Y., Sundaram N., Lee Y., Murphy M., & Keutzer K. (2009).  Efficient, High-Quality Image Contour Detection. International Conference on Computer Vision (ICCV). 2381-2388.
Burnim, J., & Sen K. (2009).  Asserting and Checking Determinism for Multithreaded Programs. 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering.
Joshi, P., Park C. - S., Naik M., & Sen K. (2009).  A randomized dynamic program analysis technique for detecting real deadlocks. Conference on Programming Language Design and Implementation (PLDI'09). 110 - 120.
Sundaram, N., Raghunathan A., & Chakradhar S. (2009).  A framework for efficient and scalable execution of domain-specific templates on GPUs. IEEE International Parallel and Distributed Processing Symposium (IPDPS). 1-12.
Chatterjee, K., Sen K., & Henzinger T. A. (2008).  Model-Checking omega-Regular Properties of Interval Markov Chains. Proc. 11th International Conference on Foundations of Software Science and Computation Structures (FoSSaCS'08). 302-317.
Batten, C., Aoki H., & Asanović K. (2008).  The Case for Malleable Stream Architectures. Workshop on Streaming Systems at 41st International Symposium on Microarchitecture (MICRO-41).
Park, C. - S., & Sen K. (2008).  Randomized Active Atomicity Violation Detection in Concurrent Programs. 16th International Symposium on Foundations of Software Engineering (FSE'08).
Joshi, P., & Sen K. (2008).  Predictive Typestate Checking of Multithreaded Java Programs. 23rd IEEE/ACM International Conference on Automated Software Engineering (ASE'08).
Burnim, J., & Sen K. (2008).  Heuristics for Scalable Dynamic Test Generation. 23rd IEEE/ACM International Conference on Automated Software Engineering (ASE '08).
Kannan, Y., & Sen K. (2008).  Universal Symbolic Execution and its Application to Likely Data Structure Invariant Generation. International Symposium on Software Testing and Analysis (ISSTA '08).
Tan, Z., Asanović K., & Patterson D. (2008).  An FPGA Host-Multithreaded Functional Model for Sparc v8. 35th International Symposium on Computer Architecture (ISCA-35).
Sen, K. (2008).  Race Directed Randomized Dynamic Analysis of Concurrent Programs. ACM SIGPLAN 2008 Conference on Programming Language Design and Implementation (PLDI'08).
Solar-Lezama, A., Jones C., & Bodik R. (2008).  Sketching Concurrent Data Structures. ACM SIGPLAN 2008 Conference on Programming Language Design and Implementation (PLDI'08).
Lee, J. W., Ng M. C., & Asanović K. (2008).  Globally-Synchronized Frames for Guaranteed Quality-of-Service in On-Chip Networks. 35th International Symposium on Computer Architecture (ISCA-35).
Chong, J., Yi Y., Faria A., Satish N., & Keutzer K. (2008).  Data-Parallel Large Vocabulary Continuous Speech Recognition on Graphics Processors. Proceedings of the 1st Annual Workshop on Emerging Applications and Many Core Architecture (EAMA).
Catanzaro, B., Sundaram N., & Keutzer K. (2008).  A Map Reduce Framework for Programming Graphics Processors. Third Workshop on Software Tools for MultiCore Systems (STMCS).
Hampton, M., & Asanović K. (2008).  Compiling for Vector-thread Architectures. International Symposium on Code Generation and Optimization (CGO).
Nishtala, R., Almasi G., & Cascaval G. (2008).  Performance without Pain = Productivity, Data Layouts and Collectives in UPC. Principles and Practices of Parallel Programming (PPoPP) 2008.
Wessel, D. (2008).  Reinventing Audio and Music Computation for Many-Core Processors. Proceedings of the International Computer Music Conference 2008.
Kamil, S., Shalf J., & Strohmaier E. (2008).  Power Efficiency in High Performance Computing. High-Performance, Power-Aware Computing (HPPAC 2008).
Ramanathan, M. K., Sen K., Grama A., & Jagannathan S. (2008).  Protocol Inference Using Static Path Profiles. Proc. 15th International Static Analysis Symposium (SAS'08). 78-92.
Fuerlinger, K., & Moore S. (2008).  OpenMP-centric Performance Analysis of Hybrid Applications. 2008 IEEE International Conference on Cluster Computing (CLUSTER 2008).
Krashinsky, R., Batten C., & Asanović K. (2008).  Implementing the Scale Vector-Thread Processor. ACM Transactions on Design Automation of Electronic Systems (TODAES). 13(3).
Keutzer, K., Hwu W. - M., & Mattson T. (2008).  The Concurrency Challenge. IEEE Design and Test of Computers. 25.
Williams, S., Datta K., Carter J., Oliker L., Shalf J., Yelick K. A., et al. (2008).  PERI: Autotuning Memory Intensive Kernels for Multicore. Journal of Physics, SciDAC PI Conference: Conference Series: 123012001.