Performance Analysis of CULA on different NVIDIA GPU Architectures

CULA is a next-generation linear algebra package that uses the GPU as a co-processor to achieve speedups over existing linear algebra packages. CULA supports matrix inversion, along with routines for solving and factorizing dense linear systems. The performance and actual speedups of CULA depend heavily on the algorithm and the size of the data set. Performance also varies with the GPU memory available for the computation, which differs across NVIDIA GPU cards; this can be explored using the device interface of CULA, which operates directly on data resident in GPU memory. The performance, measured in GFLOPS, can thus be analyzed on the Fermi, Tesla, and Kepler architectures by running different applications on CULA Dense R17. This study is important because it will reveal which architecture yields the best performance for the Spike GPU solver.
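Since CULA is a proprietary library, a minimal sketch of the GFLOPS measurement can be illustrated with SciPy's LU factorization standing in for CULA's factorization routine. The `measure_lu_gflops` helper and the standard 2/3 n^3 flop count for LU are assumptions for illustration, not part of the CULA API:

```python
import time
import numpy as np
from scipy.linalg import lu_factor

def measure_lu_gflops(n, trials=3):
    """Time an n x n LU factorization and report GFLOPS.

    Uses the conventional 2/3 * n^3 flop count for LU
    factorization; scipy.linalg.lu_factor stands in here
    for a CULA factorization routine run on the GPU.
    """
    rng = np.random.default_rng(0)
    a = rng.standard_normal((n, n))
    best = float("inf")
    for _ in range(trials):
        start = time.perf_counter()
        lu_factor(a)  # factor a fresh copy each trial
        best = min(best, time.perf_counter() - start)
    flops = (2.0 / 3.0) * n ** 3
    return flops / best / 1e9

print(f"LU GFLOPS (n=1024): {measure_lu_gflops(1024):.2f}")
```

The same timing pattern, applied to CULA routines on each GPU architecture and across a range of matrix sizes, yields the GFLOPS curves the study would compare.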

Contributors: Prateek Gupta and Dan Negrut