more precision is necessary, `QRFactorization()` and `SVDFactorization()` are
the best choices, with SVD being the slowest but most precise.

For efficiency, `RFLUFactorization` is the fastest for dense LU-factorizations.
`FastLUFactorization` will be faster than `LUFactorization`, which is the Base.LinearAlgebra
(`\` default) implementation of LU factorization. `SimpleLUFactorization` will be fast
on very small matrices.

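As a quick sketch of how these choices are passed to `solve` (the random system is a placeholder, and `RFLUFactorization` may require RecursiveFactorization.jl to be loaded depending on the LinearSolve.jl version):

```julia
using LinearSolve

A = rand(100, 100)
b = rand(100)
prob = LinearProblem(A, b)

sol_default = solve(prob)                # default polyalgorithm
sol = solve(prob, RFLUFactorization())   # typically the fastest dense LU
sol.u
```
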
For sparse LU-factorizations, use `KLUFactorization` if there is less structure
to the sparsity pattern and `UMFPACKFactorization` if there is more structure.
Pardiso.jl's methods are also known to be very efficient sparse linear solvers.

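A sketch of the sparse case (the random matrix is only illustrative):

```julia
using LinearSolve, SparseArrays, LinearAlgebra

n = 1000
A = sprandn(n, n, 0.01) + 10I   # random sparse matrix with a strong diagonal
b = randn(n)
prob = LinearProblem(A, b)

sol = solve(prob, KLUFactorization())       # less-structured sparsity patterns
# sol = solve(prob, UMFPACKFactorization()) # more-structured sparsity patterns
```
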
As sparse matrices get larger, iterative solvers tend to become more efficient than
factorization methods if only a lower solution tolerance is required.

Krylov.jl generally outperforms IterativeSolvers.jl and KrylovKit.jl, and is compatible
with both CPUs and GPUs, making it generally the preferred choice for Krylov methods.

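For example, a Krylov.jl-backed GMRES solve with a loose tolerance (a sketch; no preconditioner is used here for brevity):

```julia
using LinearSolve, SparseArrays, LinearAlgebra

n = 10_000
A = sprandn(n, n, 0.001) + 10I
b = randn(n)
prob = LinearProblem(A, b)

# Only a loose tolerance is requested, which is the regime where
# iterative methods overtake factorizations on large sparse systems.
sol = solve(prob, KrylovJL_GMRES(); reltol = 1e-4)
```
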
Finally, a user can pass a custom function for handling the linear solve using
`LinearSolveFunction()` if existing solvers are not optimally suited for their application.
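
A sketch of this route; the argument list below follows the `(A, b, u, p, newA, Pl, Pr, solverdata; kwargs...)` convention, which may vary across LinearSolve.jl versions:

```julia
using LinearSolve, LinearAlgebra

# Any callable that produces the solution from `A` and `b` can be wrapped.
function my_linsolve(A, b, u, p, newA, Pl, Pr, solverdata; kwargs...)
    A \ b
end

prob = LinearProblem(Diagonal(rand(4)), rand(4))
sol = solve(prob, LinearSolveFunction(my_linsolve))
sol.u
```
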
LinearSolve.jl contains some linear solvers built in.

- `SimpleLUFactorization`: a simple LU-factorization implementation without BLAS. Fast for small matrices, as in the sketch below.

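For instance (a minimal sketch on a toy system):

```julia
using LinearSolve

A = rand(4, 4)
b = rand(4)
prob = LinearProblem(A, b)

# No BLAS calls, so the low fixed overhead wins at this size.
sol = solve(prob, SimpleLUFactorization())
sol.u
```
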
### FastLapackInterface.jl

FastLapackInterface.jl is a package that provides a lower-level interface to the LAPACK
calls, allowing workspaces to be preallocated in order to decrease the overhead of the wrappers.
LinearSolve.jl wraps these routines so that an initialized solver
has a non-allocating LU factorization. In theory, this post-initialization solve should
always be faster than the Base.LinearAlgebra version, as sketched after the list below.

- `FastLUFactorization`: the `FastLapackInterface` version of the LU factorization. Notably,
  this version does not allow for choice of pivoting method.
- `FastQRFactorization(pivot=NoPivot(),blocksize=32)`: the `FastLapackInterface` version of
  the QR factorization.

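The sketch below assumes the standard `init`/`solve!` caching interface (the exact cache-update syntax may vary by version); the key point is that workspace allocation happens once at initialization:

```julia
using LinearSolve

A = rand(100, 100)
b = rand(100)
prob = LinearProblem(A, b)

# `init` sets up the solver and preallocates the LAPACK workspaces...
cache = init(prob, FastLUFactorization())
sol = solve!(cache)   # ...so this solve can avoid allocations.

# The cache can be reused, e.g. with a new right-hand side:
cache.b = rand(100)
sol2 = solve!(cache)
```
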
### SuiteSparse.jl

By default, the SuiteSparse.jl solvers are implemented for efficiency by caching the
symbolic factorization so that it can be reused when the sparsity pattern is unchanged.

### Pardiso.jl

```julia
MKLPardisoIterate(;kwargs...) = PardisoJL(;solve_phase=Pardiso.NUM_FACT_SOLVE_REFINE,
                                           kwargs...)
```

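Assuming Pardiso.jl and an MKL Pardiso installation are set up (the matrix here is a toy example), usage is a one-line swap of the algorithm:

```julia
using LinearSolve, Pardiso, SparseArrays

A = sparse(Float64[2 0 -1; 0 5 1; -1 1 4])
b = [1.0, 2.0, 3.0]
prob = LinearProblem(A, b)

sol = solve(prob, MKLPardisoIterate())      # factorize + iterative refinement
# sol = solve(prob, MKLPardisoFactorize())  # pure factorize-and-solve variant
```
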
The full set of keyword arguments for `PardisoJL` is:

```julia
Base.@kwdef struct PardisoJL <: SciMLLinearSolveAlgorithm
    # ...
end
```

The following are non-standard GPU factorization routines.

!!! note

    Using this solver requires adding the package LinearSolveCUDA.jl

- `CudaOffloadFactorization()`: An offloading technique used to GPU-accelerate CPU-based
  computations. Requires a sufficiently large `A` to overcome the data transfer
  costs, as sketched below.
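
A sketch of offloading in practice, assuming a CUDA-capable GPU and the LinearSolveCUDA.jl package are available (sizes are illustrative):

```julia
using LinearSolve, LinearSolveCUDA

# `A` and `b` live on the CPU; the factorization is offloaded to the GPU.
# This only pays off when `A` is large enough to amortize the transfer cost.
A = rand(Float32, 4096, 4096)
b = rand(Float32, 4096)
prob = LinearProblem(A, b)

sol = solve(prob, CudaOffloadFactorization())
```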