A Two-level GPU-Accelerated Incomplete LU Preconditioner for General Sparse Linear Systems

This paper presents a parallel preconditioning approach based on incomplete
LU (ILU) factorizations in the framework of Domain Decomposition (DD) for
general sparse linear systems. We focus on distributed-memory parallel
architectures, specifically those equipped with graphics processing
units (GPUs). In addition to block Jacobi, we present general-purpose two-level
ILU Schur complement-based approaches, where different strategies are presented
to solve the coarse-level reduced system. These strategies are combined with
modified ILU methods in the construction of the coarse-level operator, in order
to effectively remove smooth errors. We leverage available GPU-based sparse
matrix kernels to accelerate the setup and the solve phases of the proposed ILU
preconditioner. We evaluate the efficiency of the proposed methods as
smoothers for algebraic multigrid (AMG) and as preconditioners for Krylov
subspace methods, on challenging anisotropic diffusion problems and a
collection of general sparse matrices.
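
To make the block Jacobi baseline mentioned above concrete, the following is a minimal serial CPU sketch in Python with SciPy. It is not the paper's distributed GPU implementation. The helper names `anisotropic_laplacian` and `block_jacobi_ilu`, the contiguous row partition standing in for the domain decomposition, the grid size, the anisotropy ratio `eps`, and the ILU drop tolerance are all illustrative assumptions. Each diagonal block gets its own threshold-based ILU factorization, and the resulting operator is passed to GMRES as a preconditioner.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla


def anisotropic_laplacian(n, eps=1e-2):
    """5-point finite-difference matrix for -u_xx - eps*u_yy on an n x n grid.

    Hypothetical test problem; eps is an assumed anisotropy ratio.
    """
    T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))
    I = sp.identity(n)
    return (sp.kron(I, T) + eps * sp.kron(T, I)).tocsr()


def block_jacobi_ilu(A, num_blocks, drop_tol=1e-4, fill_factor=10):
    """Block Jacobi preconditioner with one ILU factorization per diagonal block.

    Contiguous row ranges play the role of subdomains (an assumption; a real
    domain decomposition would come from a graph partitioner).
    """
    n = A.shape[0]
    bounds = np.linspace(0, n, num_blocks + 1, dtype=int)
    factors = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        block = A[lo:hi, lo:hi].tocsc()  # spilu expects CSC storage
        ilu = spla.spilu(block, drop_tol=drop_tol, fill_factor=fill_factor)
        factors.append((lo, hi, ilu))

    def apply(r):
        # Solve with each local ILU factorization independently;
        # couplings between blocks are ignored, as in block Jacobi.
        r = np.asarray(r).ravel()
        z = np.empty_like(r)
        for lo, hi, ilu in factors:
            z[lo:hi] = ilu.solve(r[lo:hi])
        return z

    return spla.LinearOperator(A.shape, matvec=apply, dtype=A.dtype)


# Usage: precondition GMRES on a small anisotropic diffusion problem.
A = anisotropic_laplacian(20)
b = np.ones(A.shape[0])
M = block_jacobi_ilu(A, num_blocks=4)
x, info = spla.gmres(A, b, M=M, restart=30, maxiter=200)
rel_res = np.linalg.norm(b - A @ x) / np.linalg.norm(b)
```

Because the block solves are independent, this structure parallelizes naturally across processes, but it discards the coupling between subdomains. The two-level Schur complement approaches described in the abstract target exactly that discarded coupling through a coarse-level reduced system.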
arxiv.org