Accelerating 'fields' by revamping the Cholesky Decomposition
Ramakrishnaiah, V. B., Kumar, R. R. P., Paige, J., & Hammerling, D. (2015). Accelerating 'fields' by revamping the Cholesky Decomposition (No. NCAR/TN-518+STR). doi:10.5065/D6QF8QXR
The Geophysical Statistics project group within the Institute for Mathematics Applied to Geosciences (IMAGe) has been using Matrix Algebra on GPU and Multicore Architectures (MAGMA) to accelerate the Cholesky decomposition. The acceleration is motivated by three observations: a) the decomposition is used frequently in key computations in the spatial statistics R ‘fields’ package, b) it is a major bottleneck in ‘fields’ package execution, and c) it operates on large matrices, which makes it suitable for parallelization. The Cholesky decomposition was accelerated last summer using the MAGMA library. However, the accelerated version behaved unexpectedly on multiple GPUs: a) execution time on multiple GPUs was higher than on a single GPU, and b) the deep-copy and in-place algorithms had opposite effects on performance when executed on one GPU and on multiple GPUs. Our CPU and GPU profiling, conducted this summer, explains the behavior observed in the multi-GPU executions. The profiling also provided insight into how to further accelerate the Cholesky decomposition at three levels: a) accelerating the underlying C function, b) reducing the function call overhead in R, and c) optimizing the R environment. By optimizing the code and the environment, we obtained speedups greater than 75x (single precision) and 65x (double precision) for large matrices. We also found a potential way to improve the MAGMA functions by replacing their communications with direct device-to-device calls.
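For readers unfamiliar with the distinction between the deep-copy and in-place algorithms mentioned in the abstract, the following is a minimal pure-Python sketch of the idea. It is not the MAGMA or ‘fields’ implementation, and the function names `cholesky_in_place` and `cholesky_copy` are hypothetical. It only illustrates that an in-place factorization overwrites its input with the factor L (where A = L Lᵀ), while a deep-copy variant leaves the input untouched at the cost of an extra allocation and copy.

```python
import copy
import math


def cholesky_in_place(a):
    """Overwrite the symmetric positive-definite matrix `a` (a list of
    lists) with its lower-triangular Cholesky factor L, where A = L L^T.
    Only the lower triangle of `a` is read; the strict upper triangle is
    set to zero. Returns the same object `a`."""
    n = len(a)
    for j in range(n):
        # Diagonal entry: subtract the squares of the entries of L
        # already stored to the left of the diagonal in row j.
        d = a[j][j] - sum(a[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        a[j][j] = math.sqrt(d)
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            s = sum(a[i][k] * a[j][k] for k in range(j))
            a[i][j] = (a[i][j] - s) / a[j][j]
        # Clear the strict upper triangle of row j.
        for k in range(j + 1, n):
            a[j][k] = 0.0
    return a


def cholesky_copy(a):
    """Deep-copy variant (hypothetical name): factor a copy of `a` and
    leave the caller's matrix unchanged."""
    return cholesky_in_place(copy.deepcopy(a))


if __name__ == "__main__":
    A = [[4.0, 12.0, -16.0],
         [12.0, 37.0, -43.0],
         [-16.0, -43.0, 98.0]]
    L = cholesky_copy(A)   # A is preserved
    print(L)
    cholesky_in_place(A)   # A now holds L
    print(A == L)
```

The trade-off the sketch exposes is memory traffic: the deep-copy variant pays for duplicating the matrix before factoring, which is the kind of cost that can behave differently on one device than on several.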