Use a block sparse format for the observation operator #2

Open

DWesl wants to merge 39 commits into master
Conversation

DWesl commented Mar 6, 2019

Description

I store the influence functions in nearly this format.
Since only two of the six weeks have an important influence on any given observation, this should save a significant amount of memory.
Actually using this requires support for calculating the quadratic form H B Hᵀ, where H is a BSR matrix and B is a DaskKroneckerProductOperator.
I wrote code to do this operation and to use it in save_sum, as well as tests to make sure it works.

Motivation and Context

The observation operator is the next biggest memory user.
I would like to reduce this so I can do inversions for longer time periods.

How Has This Been Tested?

I wrote new tests for this functionality, comparing the results against expected values.
I also made sure the existing tests still pass.

I ran the tests on Linux with Python 2.7 and 3.6.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing tests passed.
  • I give permission for my code to be distributed under the existing license

DWesl and others added 13 commits March 4, 2019 17:22
The influence functions are really big; with the savings in the
background error covariance elsewhere in this package, they are now the
largest memory user.  For regional inversions, we only need a few weeks
of transport before everything leaves the domain.  This tries to keep
the space savings in the inversion itself, not just on disk.
This feature needs XArray support for passing bsr_matrix and for
stacking previously-stacked dimensions.
…apping.

This avoids problems in trying to patch XArray in tox.
I also got the tests passing, working around the problems with
stacking a stacked dimension.
I still have to test that everything's in the right place.
… a BSR obs op.

Only the most basic of tests at the moment.
I don't feel like creating a full bsr_matrix from scratch just yet.
Test matrices: all ones, a reshaped arange, and random values.
Also get the function to actually do the right thing.
I trust it a bunch more now.
Having the function to do this is good.  Having it be used where
applicable is even better.
`ceil` doesn't return an int, and dicts have no defined order.
Use bitwise xor instead of many nested conditionals.
DWesl self-assigned this Mar 6, 2019

DWesl commented Mar 6, 2019

I'm still working through actually using this in a realistic inversion, rather than a test.
The existing code should be good for review, but changing the resolution of an observation operator stored as a bsr_matrix rather than a DataArray is giving me some trouble.
The new remapping code can handle the spatial part; now I just need to handle the temporal part in a manner consistent with pd.resample and friends.
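
Roughly the kind of temporal binning I have in mind, as a sketch (the times and frequency below are made up, and this is not the code in the branch): label each fine flux time with a coarse bin the way pd.resample/pd.Grouper would, then sum the observation-operator columns that share a bin.

```python
import numpy as np
import pandas as pd

# Made-up fine-resolution flux times and a coarser target resolution.
fine_times = pd.date_range("2010-07-01", periods=42, freq="D")
coarse_freq = "7D"

# Assign each fine time step to a coarse bin, consistent with pd.resample.
positions = pd.Series(np.arange(len(fine_times)), index=fine_times)
bin_of_time = positions.groupby(pd.Grouper(freq=coarse_freq)).ngroup().values
n_coarse_bins = bin_of_time.max() + 1
# Columns of the observation operator that share a bin then get summed.
```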

…his.

I need the returned shape to match that of the fluxes, so make sure
that happens.
Also start using these new features in the inversion.
I have tests for all the failure modes, and they should catch any
major problems.
The previous version was incomplete: in particular, it highlighted the
need for temporal remapping and then stopped.  This implements the
temporal remapping and fixes a few typos in the code.
Also avoid overwriting old results so I can keep using those if this
doesn't work.
Also try to fix issues found therein.  Namely, accessing a
row-major-ish array in column-major order.  Since that was the larger
array, the savings from fixing that should dominate over the loss from
writing to a column-major array in row-major order.
It turns out the extra tests are unnecessary, but it's better to
double-check than not check at all.

Prompted by accidentally creating a BSR matrix with shape smaller than
blocksize.  I should probably report that to scipy.
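
For anyone else who runs into this: the failure is easy to catch before handing things to scipy.  A small guard along these lines (the helper name is made up):

```python
import scipy.sparse


def make_bsr(data, indices, indptr, shape):
    """Build a BSR matrix, checking the shape against the block size first.

    scipy's own error when the shape is smaller than (or not a multiple
    of) the block size is not very informative, so check up front.
    """
    blocksize = data.shape[1:]  # data has shape (n_blocks, R, C)
    if shape[0] % blocksize[0] or shape[1] % blocksize[1]:
        raise ValueError(
            "shape {0} is not a multiple of block size {1}".format(shape, blocksize)
        )
    return scipy.sparse.bsr_matrix((data, indices, indptr), shape=shape)
```
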
DWesl added the enhancement label Mar 10, 2019
Just including that title doesn't actually help anyone who doesn't
already know what's going on.  This isn't a complete description of
what they're for and why someone might want to use them, but it's a
start.

The project really needs a better introduction to how to use the code
than "just look at the three thousand lines of tests, each of which
does a piece of what you want to do, but without explanation, and the
thousand lines of extremely messy code I've been tinkering with for a
year and a half through a couple of big changes in how the code works,
which occasionally has some explanation that might still be valid."
API documentation only really works once you know what's going on.

That or make a main script that reads a config file and performs the
inversion described.  The wrapper code is a first step in that
direction, but I don't like the interface terribly much.

DWesl commented Mar 11, 2019

My implementation of the temporal remapping code was prohibitively slow when used for real cases.
The change in loop order seems to have fixed that and made it merely slow.
I'm looking into a sparse matrix multiply instead of a dense one to replace .groupby_bins().sum() for the influence functions, which should help.
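
Sketching what I mean by that (illustrative names and sizes, not the branch's actual code): build a sparse indicator matrix from the bin assignments and multiply by it, which does the same column sums as .groupby_bins().sum() without looping in Python.

```python
import numpy as np
import scipy.sparse

# bin_of_time[j] is the coarse bin of fine flux time j (six weekly bins
# covering 42 daily steps in this toy example).
bin_of_time = np.repeat(np.arange(6), 7)
n_fine = len(bin_of_time)
n_coarse = bin_of_time.max() + 1

# Sparse indicator matrix: a single 1 per fine time, in its bin's column.
aggregate = scipy.sparse.csr_matrix(
    (np.ones(n_fine), (np.arange(n_fine), bin_of_time)),
    shape=(n_fine, n_coarse),
)

# Stand-in dense influence functions with flux time as the last axis.
influence_fine = np.random.rand(5, n_fine)

# Sum the columns within each bin; keeping the sparse matrix on the left
# lets scipy handle the product.  Same result as .groupby_bins(...).sum().
influence_coarse = aggregate.T.dot(influence_fine.T).T
```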

DWesl marked this pull request as ready for review March 15, 2019 21:43

DWesl commented Mar 15, 2019

This is working for my OSSE, but I should still check what order the BSR matrix is using for flux times.

This is much faster than a dense matrix multiply.
This should help reduce clutter in the logs.
I apparently forgot to run the style checks on the new tests.  Fix
problems there and in a few other places.
Conflicts:
	src/inversion/remapper.py
	src/inversion/tests.py

DWesl commented Mar 22, 2019

I wrote more extensive tests of the interaction between BSR matrices and the operators. The results are not pretty. Do not use or merge this branch until I figure out what's going on and fix it.

Cholesky doesn't understand linear operators.
Clean up the code a bit, so future changes only have to touch the one
area.
This currently fails.  No idea why, especially for the iterative methods.
The tests take over ten minutes each on Travis.  They should take maybe
five, even given extraneous load on the machines.
I should be able to do things with the returned covariance.  This
should also work if the thing inside the MatrixLinearOperator is a
sparse matrix.
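
A sketch of the direction, not necessarily what this commit does exactly: pull the concrete matrix out of the MatrixLinearOperator, densify it if it is sparse, and hand that to the Cholesky routine (MatrixLinearOperator lives in a non-public scipy module, so the import path may move between versions).

```python
import scipy.linalg
import scipy.sparse
from scipy.sparse.linalg.interface import MatrixLinearOperator


def cholesky_of(operator_or_matrix):
    """Cholesky-factor the argument, unwrapping MatrixLinearOperator if needed."""
    matrix = operator_or_matrix
    if isinstance(matrix, MatrixLinearOperator):
        matrix = matrix.A            # the wrapped matrix, possibly sparse
    if scipy.sparse.issparse(matrix):
        matrix = matrix.toarray()    # dense Cholesky needs a dense array
    return scipy.linalg.cho_factor(matrix)
```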

Trying to narrow down the unrelated failures from the BSR obs op
tests.