Skip to content

Commit

Permalink
New memory management policies, new SketchingOperator concept, redesi…
Browse files Browse the repository at this point in the history
…gn of DenseDist and SparseDist, lots of docs (#108)

This branch was originally created to expand source and web
documentation. However, as I tried to write those docs I found myself
questioning many previous design decisions. (It turns out this is a
trend.)

Anyway, the PR is now gigantic in scope. I've tried to break down
changes into categories.

### Changes to source documentation
* Introduced macros for opaque matrices and vectors in rst-compatable
source code comments (``\mtxA,\mtxB,\mtxX,\mtxS,\mtxx,\mtxy``). Right
now those macros render their arguments in bold, but we could easily
switch to a different style.
* Removed most bracket-based indexing notation like ``\mat(A)[i, j]`` in
favor of mathematically typeset subscripts like ``\mat(A)_{ij}``.
* Removed (the vast majority of) rst macro directive definitions from
source code comments.
* Complete rewrites of DenseDist, DenseSkOp, SparseDist, and SparseSkOp
docs.
* Complete rewrites of COOMatrix, CSRMatrix, and CSCMatrix source docs
to match the style of DenseSkOp and SparseSkOp.
* Added source docs for the Axis enum (renamed from MajorAxis).
* Added source docs for the SketchingOperator concept.

### New features and identifiers

* Change the definition of LASOs so they actually implement LESS-Uniform
(with Rademachers as the scalar sub-gaussian distribution).
* Made the SketchingDistribution and SketchingOperator concepts apply
meaningful constraints.
* Extend sample_indices_iid_uniform to support generating Rademachers in
addition to the indices themselves, much like how repeated_fisher_yates
works. Exposed an overload that matches the original signature.
* Add a fill_sparse overload that's analogous to the many-argument
fill_dense overload.
* De-overloaded the verbose fill_dense and fill_sparse into new
functions: fill_dense_unpacked and fill_sparse_unpacked_nosub. The
latter has the nosub suffix because it doesn't have all the parameters
needed to select a submatrix, and I'll eventually want a
fill_sparse_unpacked that qualitatively matches fill_dense_unpacked.

### API changes : types

Adopted a consistent policy on semantics of memory ownership. This
included making sure all sketching operator and sparse matrix classes
have non-const own_memory members.

Added isometry_scale member to SketchingDistribution, DenseDist, and
SparseDist.

Added dim_major and dim_minor to DenseDist and SparseDist. Deliberately
keeping this out of the SketchingDistribution concept.

RNGState
* Defined a DefaultRNG type, equal to r123::Philox4x32. RNGState's
default RNG template parameter is now RNG = DefaultRNG.
* Add static_assert that counter length for RNGs is >= 2. Needed for
Box-Muller transforms and many other places.

SparseDist
* Changed initialization order.
* Add full_nnz member
* Added a constructor for SparseDist, and replaced many instances of
defining SparseDists with initializer lists. The construct has vec_nnz=4
and major_axis=Short by default.

SparseSkOp
* Add nnz member
* Remove known_filled member from SparseSkOp; the old check
"!S.known_filled" is equivalent to "S.nnz <= 0".

SparseMatrix classes
 * Remove reserve instance-methods.
* Add reserve_xyz free-functions to replace the reserve
instance-methods.
 * Removed some (mostly) redundant constructors.

Important changes that don't affect the public API:
* Renamed MajorAxis enum to just "Axis".
* Removed Axis::Undefined and ScalarDist::BlackBox.
* Make a (private!) helper BLASFriendlyOperator datatype, so that I
don't have to coerce DenseSkOp into supporting submatrices directly.


### API changes : free functions

* LSKGES and RSKGES no longer call ``fill_sparse`` on a user-provided
unfilled SparseSkOp. Instead, we make a temporary shallow-copy, fill
that copy, and delete it before returning. This is in preparation for
when LSKGES/RSKGES only generate submatrices that are needed for the
operation at hand, and it follows the same memory-allocation/ownership
semantics of dense sketching ([L/R]SK[GE/SP]3).
* Template sample_indicies_iid and sample_indices_iid_uniform to allow
general signed integers, rather than only int64_t.
* Removed the ability of spmm and sketch_sparse to operate on
submatrices of sparse data matrices. This basic functionality is still
preserved by ``left_spmm`` and ``right_spmm`` in the former case and
``lsksp3`` and ``rsksp3`` in the latter case.
* Removed the "val" argument from the ``overwrite_triangle`` utility
function. I was only using this function with val=0.0.
* remove has_fixed_nnz_per_col(S), since it wasn't used.
* Moved several utility functions into the main ``RandBLAS`` namespace.
* Significantly modernized ``print_colmaj``.

### Templating conventions
* Changed templating in ``sketch_sparse`` to avoid nested templating. By
nested templating I mean something like``template <typename T, typename
RNG>`` and having an argument of type ``DenseSkOp<T,RNG>``, while more
"flat" templating would use ``template <typename DenseSkOp, typename T =
DenseSkOp::scalar_t>`` and then have arguments of type ``T`` and
``DenseSkOp``.
* Changed templating functions that only accepted sparse sketching
operators but templated as ``SKOP``. They now template as ``SparseSkOp``
(which in a technical sense makes no difference, but does make clear
that it's supposed to be a ``SparseSkOp<T,RNG>`` for some parameters
``T`` and ``RNG``.
* Started templating RNGState in function calls as ``typename state_t =
RNGState<DefaultRNG>``.


### Web docs
* Actually wrote the tutorial on selecting parameters for DenseDist and
SparseDist!
* Added a "FAQ and Limitations" page to the web docs. Includes
discussion of C++ idioms we do/don't use.

### Other

Added a utility script for converting my awkward reStructuredText source
code docs of function arguments into doxygen format. I thought the
doxygen format might be viable since I tweaked the CSS, but it wasn't.
I'm keeping the script in case I want to revisit this later.

Increased the size of allowed error messages in
``randblas_error_if_msg``.

Added tests for LSKSP3 and RSKSP3.
  • Loading branch information
rileyjmurray authored Sep 10, 2024
1 parent 86c49ca commit 7280c0f
Show file tree
Hide file tree
Showing 76 changed files with 4,573 additions and 2,712 deletions.
11 changes: 11 additions & 0 deletions .cloc_exlude
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
bin/
CMake/
RandBLAS/CMakeFiles/
examples/build/
examples/sparse-data-matrices/
rtd/build/
rtd/sphinxext/
test/lib/googletest/
.github
.gitignore
.gitmodules
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
**/__pycache__/*
*_ignore*

# vim
*.sw*
Expand Down
Loading

0 comments on commit 7280c0f

Please sign in to comment.