
[NDTensors] [ITensors] Implement optional rank reduction for QR/RQ/QL/LQ decompositions #1099

Closed
wants to merge 118 commits

Conversation

@JanReimers (Contributor) commented Mar 20, 2023

Description

Additional functionality at the NDTensors level to strip out zero rows of R, and the corresponding columns of Q, after QR decomposition; the same is done for LQ decomposition. The feature is controlled by a new kwarg cutoff, which is intended to act in a manner similar to cutoff in svd. The default is cutoff=-1.0, which is interpreted as no rank reduction. Typically, for Float64, users will set 1e-15 < cutoff < 1e-13 (depending on matrix size) to eliminate rows that are most likely zero to within roundoff error. This leaves Q*R invariant to within that precision level. The code uses column pivoting and removes the bottom n rows of R for which norm(R_nn) < cutoff. When enabled (cutoff >= 0.0), rank reduction for block sparse matrices occurs automatically on a block-by-block basis.
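To illustrate the idea, here is a minimal sketch in terms of plain Julia matrices (names and details are illustrative, not the NDTensors implementation): rows of R whose entries are all below the cutoff are dropped together with the matching columns of Q, leaving Q*R unchanged to within roundoff.

using LinearAlgebra

# Sketch only: drop rows of R whose largest entry is below cutoff, and the
# matching columns of Q, so that Q*R is unchanged to within roundoff.
function truncated_qr(A::AbstractMatrix; cutoff::Float64=-1.0)
  F = qr(A)
  Q, R = Matrix(F.Q), F.R
  cutoff < 0.0 && return Q, R  # cutoff = -1.0 means no rank reduction
  keep = [maximum(abs, R[r, :]) > cutoff for r in 1:size(R, 1)]
  return Q[:, keep], R[keep, :]
end

Q, R = truncated_qr(ones(5, 5); cutoff=1e-14)
@show size(Q) size(R) norm(ones(5, 5) - Q * R)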

Rank-reduced RQ/QL decompositions are not supported because LAPACK does not provide column pivoting for those factorizations.

If practical and applicable, please include a minimal demonstration of the previous behavior and new behavior below.

Minimal demonstration of previous behavior

using ITensors
i=Index(5,"i")
A=ITensor(ones(5,5),i,i')
Q,R=qr(A,i)
@show dims(Q) dims(R) norm(A-Q*R)
dims(Q) = (5, 5)
dims(R) = (5, 5)
norm(A - Q * R) = 1.0706603116298291e-15

Minimal demonstration of new behavior

using ITensors
i=Index(5,"i")
A=ITensor(ones(5,5),i,i')
Q,R=qr(A,i;rr_cutoff=1e-15)
@show dims(Q) dims(R) norm(A-Q*R)
dims(Q) = (5, 1)
dims(R) = (1, 5)
norm(A - Q * R) = 5.551115123125783e-16

Rank reduction seems to improve the error in A-Q*R ... go figure!

How Has This Been Tested?

See additional unit tests in:
  • NDTensors/test/linearalgebra.jl
  • test/base/test_decomp.jl, testset "Rank revealing QR/LQ decomp on MPS dense", near line 389

Checklist:

  • [✓] My code follows the style guidelines of this project. Please run using JuliaFormatter; format(".") in the base directory of the repository (~/.julia/dev/ITensors) to format your code according to our style guidelines.
  • [✓] I have performed a self-review of my own code.
  • [✓] I have commented my code, particularly in hard-to-understand areas.
  • [✓] I have added tests that verify the behavior of the changes I made.
  • [ ] I have made corresponding changes to the documentation (no; a separate documentation PR is planned, see below).
  • [✓] My changes generate no new warnings.
  • [✓] Any dependent changes have been merged and published in downstream modules.

I plan to make a separate documentation PR for all of the QR/QL/LQ/RQ work, and corresponding updates to factorize().

Try and reduce complexity prior to implementing qr with combiners.
Also demonstrate index collection that fails for block sparse QR.
...With help from Niklas Tausendpfundt
@codecov-commenter commented Mar 21, 2023

Codecov Report

Merging #1099 (a4b201b) into main (4b6b7f3) will decrease coverage by 5.53%.
The diff coverage is 28.84%.

❗ Current head a4b201b differs from pull request most recent head 6091157. Consider uploading reports for the commit 6091157 to get more accurate results


@@            Coverage Diff             @@
##             main    #1099      +/-   ##
==========================================
- Coverage   59.45%   53.92%   -5.53%     
==========================================
  Files          86       85       -1     
  Lines        8314     8278      -36     
==========================================
- Hits         4943     4464     -479     
- Misses       3371     3814     +443     
Impacted Files Coverage Δ
src/Ops/Ops.jl 52.41% <ø> (-21.38%) ⬇️
src/mps/abstractmps.jl 0.92% <ø> (-85.88%) ⬇️
src/mps/dmrg.jl 0.00% <0.00%> (-84.10%) ⬇️
src/mps/mpo.jl 0.00% <0.00%> (-97.30%) ⬇️
src/physics/autompo/opsum_to_mpo_generic.jl 0.00% <ø> (-95.75%) ⬇️
src/tensor_operations/matrix_decomposition.jl 89.93% <88.23%> (+9.57%) ⬆️

... and 66 files with indirect coverage changes

# of Q. X = R or L
#
function trim_rows(
Q::AbstractMatrix, X::AbstractMatrix, rr_cutoff::Float64; rr_verbose=false, kwargs...
A Member commented:

I would just name these keyword arguments cutoff and verbose (i.e. drop the rr_).

Also, cutoff has a particular meaning in the context of SVD, where it is the sum of squares of the singular values.

I see here it is used as:

zeros = map((r) -> (maximum(abs.(X[r, 1:Xnc])) <= rr_cutoff), 1:Xnr)

so it is checking that the maximum absolute value in the row is below the cutoff value/tolerance. Is that a standard approach for reducing the rank?

Maybe we can look at https://github.com/JuliaLinearAlgebra/LowRankApprox.jl and see what conventions they use for performing the rank reduction and what they call the keyword arguments (I see rtol and atol being used, which are used elsewhere in Julia and may be good choices).
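For reference, the usual Julia convention (e.g. in Base.isapprox) folds atol and rtol into a single effective threshold, as in the sketch below; whether LowRankApprox.jl uses exactly this rule for its rank cutoff is an assumption.

# Sketch of the common atol/rtol combination (cf. Base.isapprox); the helper
# names are illustrative, not from this PR or from LowRankApprox.jl.
effective_tol(ref; atol=0.0, rtol=1e-14) = max(atol, rtol * abs(ref))
negligible_row(row, R) = maximum(abs, row) <= effective_tol(maximum(abs, R))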

@JanReimers (Contributor, Author) commented Mar 25, 2023:

Issue 1) Do we need a distinct name for the rank-revealing cutoff? Right now I have the code set up so that the user can just call truncate!(InfiniteMPO) without first calling orthogonalize!(InfiniteMPO). The truncate function checks whether the iMPO is already orthogonalized, and if not it calls orthogonalize before truncating with svd. So with this setup we need distinct control of the rank-revealing QR cutoff and the svd cutoff. rtol would work for this.
Issue 2) How do we decide if a row is effectively zero? Here is what Daniel Parker says:
[image: quoted passage from Daniel Parker omitted]
I looked in his thesis and there is no more information on this topic.
I agree that testing rr_cutoff > sqrt(sum(abs2, X[r, 1:Xnc])) is closer to what svd truncation does.
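Written out as code, the two candidate tests for "this row of X is effectively zero" are (illustrative helper names, not code from this PR):

row_maxabs_negligible(x, cutoff) = maximum(abs, x) <= cutoff       # what trim_rows currently checks
row_norm_negligible(x, cutoff) = sqrt(sum(abs2, x)) <= cutoff      # closer to the svd cutoff convention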

Issue 3) Hornets' nest: conventional rank-reducing QR does not do what Daniel did. Conventional methods do column-pivoted QR, which is akin to saying "I am not going to solve A=QR for you; instead I am going to solve AP=QR, where P is a column permutation matrix." This puts all near-zero rows of R into the lower right corner, and it actually orders the diagonals of R similarly to singular-value ordering. Maybe Parker purposely avoided this because he knew it would jumble up the MPOs and destroy his regular form. Maybe doing column-pivoted QR and then A = QP^T PRP^T (P^T = transpose(P)) would get us back to where we are now, but I haven't tried it. We could insist that the library only needs to support conventional rank-reducing QR with column pivoting, not Daniel's version, but I think users want to solve A=QR, not AP=QR; in other words they don't want to be told "Sorry, I am not solving your problem, I will only solve my choice of problem." Maybe there is an easy solution and I am just not seeing it.

Perhaps I should email Daniel and get his opinion on max(abs(row)) vs. rms(row), and ask whether column-pivoted QR was tested and rejected for some reason.

@mtfishman (Member) commented Mar 28, 2023:

Within the qx functions, there is no fundamental reason why it needs a distinct name from cutoff. If we choose cutoff, we can choose different names in the higher level functions like truncate!(::MPO) to distinguish between the qr truncation tolerance/cutoff and the svd truncation tolerance/cutoff, for example with a code pattern like:

function truncate!(::MPO; qr_cutoff, svd_cutoff)
  svd(T1; cutoff=svd_cutoff)
  qr(T2; cutoff=qr_cutoff)
end

However, I am just considering different options since the qx tolerance/cutoff seems to have a different meaning from the svd tolerance/cutoff, and it may be nice to keep consistency with packages like LowRankApprox.jl (or use that package directly for rank reducing QR).

Pivoted QR is a pretty standard technique and I think it is reasonable to support that. For example, Julia has a pivoted QR:

julia> using LinearAlgebra

julia> A = randn(4, 4);

julia> Q, R, p = qr(A, ColumnNorm())
QRPivoted{Float64, Matrix{Float64}, Vector{Float64}, Vector{Int64}}
Q factor:
4×4 LinearAlgebra.QRPackedQ{Float64, Matrix{Float64}, Vector{Float64}}:
 -0.312837  -0.842554   -0.43312   -0.0681411
  0.856948  -0.445627    0.228859   0.121162
  0.341897   0.29066    -0.853861   0.263714
 -0.225565  -0.0838845   0.175932   0.954532
R factor:
4×4 Matrix{Float64}:
 3.20644  -0.309247  -1.45869    1.8667
 0.0       2.13545    0.101833  -0.14863
 0.0       0.0        1.81113   -0.0989319
 0.0       0.0        0.0       -0.0716154
permutation:
4-element Vector{Int64}:
 1
 3
 4
 2

julia> norm(Q * R[:, invperm(p)] - A)
5.902547983198874e-16

julia> R[:, invperm(p)]
4×4 Matrix{Float64}:
 3.20644   1.8667     -0.309247  -1.45869
 0.0      -0.14863     2.13545    0.101833
 0.0      -0.0989319   0.0        1.81113
 0.0      -0.0716154   0.0        0.0

which would give better rank reduction than non-pivoted QR:

julia> Q2, R2 = qr(A)
LinearAlgebra.QRCompactWY{Float64, Matrix{Float64}, Matrix{Float64}}
Q factor:
4×4 LinearAlgebra.QRCompactWYQ{Float64, Matrix{Float64}, Matrix{Float64}}:
 -0.312837  -0.899079   0.232979   0.198774
  0.856948  -0.181496   0.48104   -0.0360516
  0.341897  -0.116376  -0.599446   0.714302
 -0.225565   0.381016   0.595807   0.670046
R factor:
4×4 Matrix{Float64}:
 3.20644   1.8667    -0.309247  -1.45869
 0.0      -0.192372   1.64988    1.01009
 0.0       0.0       -1.35575    1.06884
 0.0       0.0        0.0       -1.062

I think it would be better to use pivoting, if possible (i.e. it would be better in general, so we should have it anyway), since it is the standard way to do rank reducing QR. We can just document that if you use pivoting and/or rank reduction then R is no longer guaranteed to be upper triangular (at the ITensor level, it isn't anyway because of the combiner transformation).
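As a concrete sketch of what rank reduction on top of pivoted QR could look like (plain Julia matrices; illustrative only, not a proposal for the exact NDTensors code):

using LinearAlgebra

# Sketch: estimate the numerical rank from the (non-increasing) diagonal of the
# pivoted R, truncate, and undo the permutation so that A ≈ Q * R.
function rank_reduced_qr(A::AbstractMatrix; rtol=1e-14)
  F = qr(A, ColumnNorm())
  r = count(>(rtol * abs(F.R[1, 1])), abs.(diag(F.R)))
  Q = Matrix(F.Q)[:, 1:r]
  R = F.R[1:r, invperm(F.p)]
  return Q, R
end

A = ones(4, 4)  # rank-1 test matrix
Q, R = rank_reduced_qr(A)
@show size(Q) size(R) norm(A - Q * R)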

I think fundamentally the question is, does the Parker et al. algorithm actually rely on the upper/lower triangular form, or can it be framed more generally that you can do any orthogonal decomposition that does some rank reduction and preserves the proper block structure of the MPO? I have a hard time believing that such a detail as the ordering of the states within the block should matter, since it just corresponds to a simple gauge transformation by a permutation matrix on the links of the MPO. Once you gauge to a center bond/site, the left and right orthogonality centers are combined in order to perform the SVD, which would destroy the upper/lower triangular structure you were trying to preserve anyway.

I believe @LHerviou said he found he could reformulate the algorithm in terms of any orthogonal decomposition of the MPO block, which makes intuitive sense to me based on other tensor network gauging algorithms I know of.


My statement was just:
The QR/QL does not seem important to me. What seems to matter is:

  • Q -> the left part is a unitary, similar to what happens in left normalized MPS (assuming we are going for left canonical for now)
  • We preserve the block structure of the canonical form of the InfiniteMPO,
        ( 1  b  c )
        ( 0  A  d )
        ( 0  0  1 )
    using the convention of upper triangular iMPOs in their paper.
    Depending on left or right sweep, this means orthogonalizing b or d with respect to 1, and then doing a QR on (b, A) or (A, d).

@JanReimers (Contributor, Author) replied:

  1. rr_cutoff -> cutoff
  2. rr_verbose -> verbose
  3. Switch from row removal to column pivoting

Done.

@JanReimers (Contributor, Author) commented Mar 29, 2023 via email

@JanReimers (Contributor, Author) commented Apr 1, 2023 via email

@mtfishman (Member) commented Apr 2, 2023

It seems LAPACK only supports column pivoting for QR (and therefore indirectly for LQ), but not for QL/RQ. A table of supported orthogonal factorizations is here: https://netlib.org/lapack/lug/node44.html. The MatrixFactorizations.jl library also does not support a pivoted ql, which is not a surprise since LAPACK does not.

If we determine that we don't care about the upper or lower triangular forms of the blocks (which presupposes that we can use pivoting for rank reduction in the first place for MPO compression), do we care about the QL/RQ decompositions? We could then just use pivoted QR and LQ to do the rank reduction. This may be why libraries don't offer pivoted QL/RQ: once you accept pivoting, the upper and lower triangular forms are ruined anyway. EDIT: Another way of seeing it is that a QL decomposition is just a special case of a pivoted QR decomposition.
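Relatedly, a QL factorization can be obtained from an ordinary QR of the column-reversed matrix, i.e. a QR with a fixed column permutation. This is the standard exchange-matrix trick written as a plain-Julia sketch, not code from this PR:

using LinearAlgebra

# Sketch: QL via QR on the column-reversed matrix. J is the exchange
# (anti-diagonal) permutation; A*J = Q*R implies A = (Q*J) * (J*R*J),
# and J*R*J is lower triangular.
function ql_via_qr(A::AbstractMatrix)
  n = size(A, 2)
  J = Matrix{eltype(A)}(I, n, n)[:, n:-1:1]  # exchange matrix
  F = qr(A * J)
  Q = Matrix(F.Q) * J
  L = J * F.R * J
  return Q, L
end

A = randn(4, 4)
Q, L = ql_via_qr(A)
@show norm(A - Q * L) istril(L) norm(Q' * Q - I)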

@JanReimers (Contributor, Author) commented:

Update from my end: I emailed Daniel Parker and he was glad to hear the ITensors team is implementing his methods, and simultaneously a little skeptical that we can use pivoted QR. On Saturday I did some experiments with pivoted QR in place of QL; it worked for about 90% of my unit test cases, and failed on the more complex Hubbard model. So I will try to get the Hubbard model working. I suspect you guys are right and I just need to understand the details well enough to reduce them to practice in the code.

@JanReimers (Contributor, Author) commented:

The interface update using atol/rtol instead of cutoff is done.
It looks like ITensorGPU/src/tensor/culinearalgebra.jl also needs to be updated to handle many of the recent changes, but I don't have a way to test that.

@mtfishman added the NDTensors label (Requires changes to the NDTensors.jl library) on May 6, 2024.
@mtfishman (Member) commented:

I'll close this since it will be easier to implement once NDTensors is rewritten in terms of NamedDimsArrays, BlockSparseArrays, etc. (#1250).

@mtfishman closed this on Nov 19, 2024.