GPUs are often underutilized when not running graphics-intensive
or special-purpose numeric computations, which are rare in consumer
workloads. Garbage Collection – an increasingly ubiquitous
workload on consumer devices – appears to be suitable for execution
on the GPU in order to utilize these otherwise unused cycles
and reduce the impact of the garbage collector on applications running on the main CPU. We investigate the challenges of offloading garbage collection to a GPU by examining the performance tradeoffs for the mark phase of a mark & sweep garbage collector.
We present an algorithm that demonstrates the feasibility of
this approach. We discuss a number of algorithmic design tradeoffs
required to leverage the strengths and capabilities of the GPU
hardware. Our algorithm has been integrated into a Java virtual
machine and we present promising performance results.