Papers
arxiv:2601.17979

An Efficient Batch Solver for the Singular Value Decomposition on GPUs

Published on Apr 9
Authors:
,

Abstract

The singular value decomposition (SVD) is a powerful tool in modern numerical linear algebra, which underpins computational methods such as principal component analysis (PCA), low-rank approximations, and randomized algorithms. Many practical scenarios require solving numerous small SVD problems, a regime generally referred to as "batch SVD". Existing programming models can handle this efficiently on parallel CPU architectures, but high-performance solutions for GPUs remain immature. A GPU-oriented batch SVD solver is introduced. This solver exploits the one-sided Jacobi algorithm to exploit fine-grained parallelism, and a number of algorithmic and design optimizations achieve unmatched performance. Starting from a baseline solver, a sequence of optimizations is applied to obtain incremental performance gains. Numerical experiments show that the new solver is robust across problems with different numerical properties, matrix shapes, and arithmetic precisions. Performance benchmarks on both NVIDIA and AMD systems show significant performance speedups over vendor solutions as well as existing open-source solvers.

Community

Sign up or log in to comment

Get this paper in your agent:

hf papers read 2601.17979
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2601.17979 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2601.17979 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2601.17979 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.