diff --git a/.gitignore b/.gitignore
index 2138c923..2ced0696 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,4 +1,5 @@
 **/__pycache__/*
+*_ignore*
 
 # vim
 *.sw*
diff --git a/RandBLAS/base.hh b/RandBLAS/base.hh
index 0cf9138b..8e78805c 100644
--- a/RandBLAS/base.hh
+++ b/RandBLAS/base.hh
@@ -279,6 +279,7 @@ enum class MajorAxis : char {
     Undefined = 'U'
 };
 
+#ifdef __cpp_concepts
 // =============================================================================
 /// @verbatim embed:rst:leading-slashes
@@ -295,16 +296,16 @@ enum class MajorAxis : char {
 ///    :nowrap:
 ///
 ///    \begin{gather}
-///    \alpha^2 \cdot \mathbb{E}\left[ \mtxS^T\mtxS \right]=\mathbf{I}_{c \times c}& \nonumber \\
-///    \,\beta^2 \cdot \mathbb{E}\left[ \mtxS{\mtxS}^T\, \right]=\mathbf{I}_{r \times r}& \nonumber
+///    \theta^2 \cdot \mathbb{E}\left[ \mtxS^T\mtxS \right]=\mathbf{I}_{c \times c}& \nonumber \\
+///    \,\phi^2 \cdot \mathbb{E}\left[ \mtxS{\mtxS}^T\, \right]=\mathbf{I}_{r \times r}& \nonumber
 ///    \end{gather}
 ///
-/// hold for some :math:`\alpha > 0` and :math:`\beta > 0`.
+/// hold for some :math:`\theta > 0` and :math:`\phi > 0`.
 ///
 /// The *isometry scale* of the distribution
-/// is :math:`\gamma := \alpha` if :math:`c \geq r` and :math:`\gamma := \beta` otherwise. If you want to
+/// is :math:`\alpha := \theta` if :math:`c \geq r` and :math:`\alpha := \phi` otherwise. If you want to
 /// sketch in a way that preserves squared norms in expectation, then you should sketch with
-/// a scaled sample :math:`\gamma \mtxS` rather than the sample itself.
+/// a scaled sample :math:`\alpha \mtxS` rather than the sample itself.
 ///
 /// **Programmatic description**
 ///
@@ -343,44 +344,9 @@ concept SketchingDistribution = requires(SkDist D) {
     { D.isometry_scale } -> std::same_as;
 };
 #else
-/// .. math::
-///
-///     {\D}\texttt{.isometry\_scale} = \begin{cases} \alpha &\text{ if } r \leq c \\ \beta &\text{ if } r > c \end{cases}~.
 #define SketchingDistribution typename
 #endif
 
-// =============================================================================
-/// \fn isometry_scale_factor(SkDist D)
-/// @verbatim embed:rst:leading-slashes
-///
-/// All sketching distributions in RandBLAS satisfy the following properties:
-/// if :math:`\D` is a distribution over :math:`r \times c` matrices and
-/// :math:`\mtxS` is a sample from :math:`\D`, then
-/// (1) :math:`\mathbb{E}\mtxS = \mathbf{0}_{r \times c}` and (2)
-///
-/// .. math::
-///    :nowrap:
-///
-///    \begin{gather*}
-///    \alpha^2 \cdot \mathbb{E}\left[ \mtxS^T\mtxS \right]=\mathbf{I}_{c \times c}& \\
-///    \,\beta^2 \cdot \mathbb{E}\left[ \mtxS{\mtxS}^T\, \right]=\mathbf{I}_{r \times r}&
-///    \end{gather*}
-///
-/// hold for some :math:`\alpha > 0` and :math:`\beta > 0`.
-///
-/// This function returns
-///
-/// .. math::
-///
-///     \gamma = \begin{cases} \alpha &\text{ if } r \leq c \\ \beta &\text{ if } r > c \end{cases}~.
-///
-/// If you want to sketch in a way that preserves squared norms in expectation, then you
-/// should sketch with a scaled sample :math:`\gamma \mtxS` rather than the sample itself.
-///
-/// @endverbatim
-template <typename T, SketchingDistribution SkDist>
-inline T isometry_scale_factor(SkDist D);
-
 
 #ifdef __cpp_concepts
 // =============================================================================
diff --git a/rtd/source/FAQ.rst b/rtd/source/FAQ.rst
new file mode 100644
index 00000000..2e4ceb7f
--- /dev/null
+++ b/rtd/source/FAQ.rst
@@ -0,0 +1,148 @@
+FAQ and Limitations
+==============================
+
+
+(In)frequently asked questions about our design decisions
+------------------------------------------------------------
+
+Why the name RandBLAS?
+    RandBLAS derives its name from BLAS: the basic linear algebra subprograms. Its name evokes the *purpose* of BLAS rather
+    than the acronym. Specifically, the purpose of RandBLAS is to provide high-performance and reliable functionality that
+    can be used to build sophisticated randomized algorithms for matrix computations.
+
+    The RandBLAS API is also as close as possible to the BLAS API while remaining polymorphic. Some may find this
+    decision dubious, since the BLAS API is notoriously cumbersome for new users. We believe that these downsides
+    are made up for by the flexibility and portability of a BLAS-like API. It is also our hope that popular high-level
+    libraries that already wrap BLAS might find it straightforward to define similar wrappers for RandBLAS.
+
+DenseDist and SparseDist are very simple immutable structs. Why bother having constructors for these classes, when you could just use initialization lists?
+    DenseDist and SparseDist have common members and type-specific members.
+    We wanted the common members to appear earlier in the initialization order, since that makes it easier to
+    preserve RandBLAS' polymorphic API when called from other languages.
+
+    Two important common members are major_axis and isometry_scale. Users rarely need to think about the former (since there are
+    sensible defaults for DenseDist and SparseDist), and they never need to make decisions for the latter.
+    What's more, there are type-specific members that the user should be mindful of, and these would be ill-suited to the trailing
+    positions in an initialization list.
+
+    Using constructors therefore has three benefits.
+
+    1. We can put the type-specific members earlier in the argument list,
+    2. We can include the default value for major_axis as a trailing constructor argument, and
+    3. We can ensure that isometry_scale is always set correctly.
+
+Why don't you automatically scale your sketching operators to give partial isometries in expectation?
+    There are a few factors that led us to this decision. None of these factors is a huge deal on its own, but they become significant when considered together.
+
+    1. Many randomized algorithms are actually invariant under rescaling of the sketching operators that they use internally.
+    2. Sketching operators are easier to describe if we think in terms of their "natural" scales before considering their use as tools for dimension reduction.
+       For example, the natural scale for DenseSkOps is to have entries that are sampled iid from a mean-zero and variance-one distribution.
+       The natural scale for SparseSkOps (with major_axis==Short) is for nonzero entries to have absolute value equal to one.
+       Describing these operators in their isometric scales would require that we specify dimensions, which dimension is larger than the other,
+       and possibly additional tuning parameters (like vec_nnz for SparseSkOps).
+    3. It's easier for us to describe how implicit concatenation of sketching operators works (see :ref:`this part ` of our tutorial)
+       if using the isometry scale is optional, and off by default.
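+
+    If you do want a partial isometry in expectation, opting in is easy: fold the
+    distribution's isometry_scale into the scaling factor of a sketching routine.
+    Here is a minimal sketch of that pattern (the dimensions are illustrative, and
+    the exact sketch_general signature may differ slightly across versions)::
+
+        int64_t d = 50, m = 1000, n = 200;     // compute B = alpha * S * A
+        RandBLAS::DenseDist D(d, m);
+        RandBLAS::RNGState state(0);
+        RandBLAS::DenseSkOp<double> S(D, state);
+        std::vector<double> A(m * n, 1.0), B(d * n, 0.0);
+        RandBLAS::sketch_general(
+            blas::Layout::ColMajor, blas::Op::NoTrans, blas::Op::NoTrans,
+            d, n, m, D.isometry_scale, S, A.data(), m, 0.0, B.data(), d
+        );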
+
+Why are all dimensions 64-bit integers?
+    RandNLA is interesting for large matrices. It would be too easy to have an index overflow if we allowed 32-bit indexing.
+    We do allow 32-bit indexing for buffers underlying sparse matrix data structures, but we recommend sticking with 64-bit.
+
+I looked at the source code and found weird function names like "lskge3," "rskges," and "lsksp3." What's that about?
+    These names make it easier to call RandBLAS from languages that don't support overloads.
+    Short and specific names also make it possible to communicate efficiently and precisely (useful for test code and developer documentation).
+
+Why does sketch_symmetric not use a "side" argument, like symm in BLAS?
+    There are many BLAS functions that include a "side" argument. This argument always refers to the operand with the most notable structure.
+    In symm, the more structured operand is the symmetric matrix.
+    In RandBLAS, the more structured operand is always the sketching operator. Given this, we saw three options to move forward.
+
+    1. Keep "side," and have it refer to the position of the symmetric matrix. This is superficially similar to the underlying BLAS convention.
+    2. Keep "side," and have it refer to the position of the sketching operator. This is similar to BLAS at a deeper level, but people could
+       easily use it incorrectly.
+    3. Dispense with "side" altogether.
+
+    We chose the third option since that's more in line with modern APIs for BLAS-like functionality (namely, std::linalg).
+
+
+(In)frequently asked questions about RandBLAS' capabilities
+-----------------------------------------------------------
+
+How do I call RandBLAS from other languages?
+    First, this depends on whether your language supports overloads.
+
+    * To get an important thing out of the way: we use both formal C++ overloads and C++ templates. The latter mechanism might as well constitute an overload from the perspective of other languages.
+    * We do have canonical function names to address overloads in (1) the side of an operand in matrix-matrix products and (2) the family of the sketching distribution.
+    * Our templating of numerical precision should be resolved by prepending "s" for single precision or "d" for double precision on any classes and function names.
+
+    This also depends on whether your language supports classes that can manage their own memory.
+
+    * The full API for DenseSkOp and SparseSkOp requires letting them manage their own memory. That said, if you use the appropriate constructors, then they'll let you manage all memory.
+
+Can functions of the form ``sketch_[xxx]`` do something other than sketching?
+    Absolutely. They can do lifting, which is needed in some algorithms. They can also apply a square submatrix of a sketching operator (useful for distributed applications), in which case the output matrix isn't any smaller than the input matrix.
+
+Can I sketch a const symmetric matrix that's only stored in an upper or lower triangle?
+    Yes, but there are caveats. First, you can only use dense sketching operators. Second, you'll have to call fill_dense(layout, dist, … buff …, rngstate) and then use buff in your own SYMM function.
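+
+    To make the workaround concrete, here is a minimal sketch (the fill_dense
+    arguments elided above are left out here too, so the exact call may differ;
+    the dimensions and the BLAS++ symm call are illustrative)::
+
+        int64_t n = 500, d = 50;    // A is n x n symmetric; the sketch B = A * buff is n x d
+        std::vector<double> A(n * n), B(n * d), buff(n * d);
+        RandBLAS::DenseDist dist(n, d);
+        RandBLAS::RNGState state(42);
+        RandBLAS::fill_dense(blas::Layout::ColMajor, dist, buff.data(), state);
+        // A is referenced only through its lower triangle.
+        blas::symm(blas::Layout::ColMajor, blas::Side::Left, blas::Uplo::Lower,
+                   n, d, 1.0, A.data(), n, buff.data(), n, 0.0, B.data(), n);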
+
+
+Unavoidable limitations
+------------------------
+
+No complex domain support:
+    BLAS' support for this is incomplete. You can't mix real and complex, you can't conjugate without transposing, etc…
+    We plan on revisiting the question of complex data in RandBLAS a few years from now.
+
+No support for sparse-times-sparse (aka SpGEMM):
+    This will probably "always" be the case, since we think it's valuable to keep RandBLAS' scope limited.
+
+No support for subsampled randomized trig transforms (SRFT, SRHT, SRCT, etc...):
+    We'd happily accept a contribution of a randomized Hadamard transform (without subsampling)
+    that implicitly zero-pads inputs when needed. Given such a function, we could figure out
+    how we'd like to build sketching operators on top of it.
+
+No support for DenseSkOps with Rademachers:
+    We'd probably need support for mixed-precision arithmetic to fully realize the advantage of
+    Rademacher over uniform [-1, 1]. It's not clear to us how we'd go about doing that. There
+    *is* the possibility of generating Rademachers far faster than uniform [-1, 1]. The implementation
+    of this method might be a little complicated.
+
+
+Limitations of calling RandBLAS from other languages
+----------------------------------------------------
+
+We routinely use function overloading, and that reduces portability across languages.
+See below for details on where we stand and where we plan to go to resolve this shortcoming.
+
+We have a consistent naming convention for functions that involve sketching operators
+    * [L/R] are prefixes used when we need to consider left- and right-multiplication.
+    * The characters "sk" appearing at the start of a name or after [L/R] indicate that a function involves taking a product with a sketching operator.
+    * Two characters are used to indicate the structure of the data in the sketching operation.
+      The options for the characters are {ge, sy, ve, sp}, which stand for general, *explicitly* symmetric, vector, and sparse (respectively).
+    * A single-character [X] suffix is used to indicate the structure of the sketching operator. The characters are "3" (for dense sketching
+      operators, which would traditionally be applied with a BLAS 3 function) and "s" (for sparse sketching operators).
+
+Functions that implement the overload-free conventions
+    * [L/R]skge[X] for sketching a general matrix from the left (L) or right (R) with a matrix whose structure is indicated by [X].
+      C++ code should prefer overloaded sketch_general.
+    * [L/R]sksp3 for sketching a sparse matrix from the left (L) or right (R) with a DenseSkOp.
+      C++ code should prefer overloaded sketch_sparse, unless operating on a submatrix of a COO-format sparse data matrix is needed.
+
+Functions that are missing implementations of this convention
+    * [L/R]skve[X] for sketching vectors. This functionality is available in C++ with sketch_vector.
+    * [L/R]sksy[X] for sketching a matrix with *explicit symmetry*. This functionality is available in C++ with sketch_symmetric.
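+
+To illustrate the convention, here is what a double-precision, overload-free
+declaration for lskge3 could look like. This is only a sketch of a C-friendly
+binding; "DenseSkOp_d" is a hypothetical handle type, and the exact parameter
+list in RandBLAS' headers may differ::
+
+    // "d" + "l" + "sk" + "ge" + "3": double precision, left-multiply, sketching,
+    // general data matrix, dense ("3" as in BLAS 3) sketching operator.
+    // ro_s and co_s are the row and column offset parameters discussed below.
+    void dlskge3(char layout, char opS, char opA,
+                 int64_t d, int64_t n, int64_t m, double alpha,
+                 const DenseSkOp_d *S, int64_t ro_s, int64_t co_s,
+                 const double *A, int64_t lda,
+                 double beta, double *B, int64_t ldb);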
+
+Some discussion
+    Our templating for numerical precision should be resolved by prepending "d" for double precision or "s" for single precision.
+
+    RandBLAS requires a consistent naming convention across an API that supports multiple structured operands (e.g., sketching sparse data),
+    while conventions in the BLAS API only need to work when one operand is structured.
+    This is why our consistent naming convention might appear "less BLAS-like" than it could be.
+
+    All of these overload-free function names have explicit row and column offset parameters to handle submatrices of linear operators.
+    However, the overloaded versions of these functions have *additional* overloads based on setting the offset parameters to zero.
+
+
+We have no plans for consistent naming of overload-free sparse BLAS functions. The most we do in this regard is offer functions
+called [left/right]_spmm for SpMM where the sparse matrix operand appears on the left or on the right.
diff --git a/rtd/source/api_reference/skops_and_dists.rst b/rtd/source/api_reference/skops_and_dists.rst
index 2393f17f..aa87daf7 100644
--- a/rtd/source/api_reference/skops_and_dists.rst
+++ b/rtd/source/api_reference/skops_and_dists.rst
@@ -21,7 +21,7 @@ Fundamentals
 ********************************************************************
 
- .. very similar effects can be acheived in C, Fortran, or Julia. While there are
+ .. very similar effects can be achieved in C, Fortran, or Julia. While there are
 .. certainly some popular programming languages that don't support this kind of API
 .. (e.g., MATLAB, Python, and R), accessing RandBLAS from these languages should
 .. be mediated operator-overloaded objects in a way that's analogous to how one
@@ -29,13 +29,17 @@ Fundamentals
 
 RandBLAS has a polymorphic free-function API. We have spent a significant amount
 of effort on minimizing the number of RandBLAS-specific datastructures needed in order
-to acheive the polymorphic API.
+to achieve that polymorphism.
 
-RandBLAS is very light on C++ idioms. The main C++ idioms we use are
+RandBLAS is very light on C++ idioms. The main idioms we use are
 templating and function overloading, plus some mild memory management
-with destructor methods for structs.
+with destructor methods for structs. The only place we even use inheritance is
+in our test code!
 
-:ref:`Test of sketch updates `.
+RandBLAS has a bunch of functions that aren't documented on this website.
+If such a function looks useful, you should feel free to use it. If you
+end up doing that and you care about your code's compatibility with future
+versions of RandBLAS, then please let us know by filing a quick GitHub issue.
 
 
 Abstractions
diff --git a/rtd/source/conf.py b/rtd/source/conf.py
index bee49742..e2a810ad 100644
--- a/rtd/source/conf.py
+++ b/rtd/source/conf.py
@@ -97,4 +97,4 @@
 # numfig = True
 math_numfig = True
 math_eqref_format = "Eq. {number}"  # use a non-breaking-space unicode character.
-numfig_secnum_depth = 1
\ No newline at end of file
+numfig_secnum_depth = 1
diff --git a/rtd/source/index.rst b/rtd/source/index.rst
index 0efd91d9..a28820b5 100644
--- a/rtd/source/index.rst
+++ b/rtd/source/index.rst
@@ -6,6 +6,7 @@
    Tutorial
    API Reference
    Changelog
+   FAQ and Limitations
 
 .. default-domain:: cpp
diff --git a/rtd/source/installation/index.rst b/rtd/source/installation/index.rst
index 736d3111..0178917d 100644
--- a/rtd/source/installation/index.rst
+++ b/rtd/source/installation/index.rst
@@ -15,8 +15,6 @@ RandBLAS is most useful when called from programs that can access LAPACK,
 or an equivalent library for dense matrix computations. However, we don't
 require that such a library is available.
 
-RandBLAS uses `C++20 concepts `_.
-Make sure your compiler supports these!
 
 CMake users
 -----------
@@ -35,6 +33,11 @@ Check out our `examples `_
 to implement high-level randomized algorithms.
 
+.. warning::
+
+   Make sure to use the flag ``-Dblas_int=int64`` in the CMake configuration line for BLAS++.
+   If you don't do that then you might get int32, which can lead to issues for large matrices.
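+
+   For example, when building BLAS++ from source, the configuration line might look
+   like this (the paths here are purely illustrative)::
+
+       cmake -S blaspp -B blaspp/build -Dblas_int=int64 -DCMAKE_INSTALL_PREFIX=/opt/blaspp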
+
 
 Everyone else
 -------------
 Strictly speaking, we only need three things to use RandBLAS in other projects.
diff --git a/rtd/source/tutorial/updates.rst b/rtd/source/tutorial/updates.rst
deleted file mode 100644
index 765df49b..00000000
--- a/rtd/source/tutorial/updates.rst
+++ /dev/null
@@ -1,114 +0,0 @@
-*********************************************************************************************
-Updating sketches with dense sketching operators
-*********************************************************************************************
-
-    .. |denseskop| mathmacro:: \texttt{DenseSkOp}
-    .. |seedstate| mathmacro:: \texttt{seed_state}
-    .. |nextstate| mathmacro:: \texttt{next_state}
-    .. |mtx| mathmacro:: {}
-
-
-RandBLAS makes it easy to *implicitly* extend an initial sketching
-operator :math:`\mtx{S}_1` into an augmented operator :math:`\mtx{S} = [\mtx{S}_1; \mtx{S}_2]` or :math:`\mtx{S} = [\mtx{S}_1, \mtx{S}_2]`.
-There are four scenarios that you can find yourself in where
-this can be done without generating :math:`S` from scratch.
-In all four scenarios, the idea is to
-use the :math:`\nextstate` of :math:`\mtx{S}_1` as the
-:math:`\seedstate` of :math:`\mtx{S}_2`.
-
-There are two reasons why you'd want to
-extend a sketching operator; you might be trying to improve statistical
-quality by increasing sketch size, or you might be
-incorporating new data into a sketch of fixed size.
-The unifying perspective on the former scenarios is that they both add
-*long-axis vectors* to the sketching operator.
-The unifying perspective on
-Scenarios 3 and 4 is that they both add *short-axis vectors* to the
-sketching operator.
-
-:math:`\texttt{DenseDist}` objects have a :math:`\texttt{major_axis}` member, which states
-whether operators sampled from that distribution are short-axis or
-long-axis major. So when you specify the major axis for a sketching
-operator, you're basically saying whether you want to keep open the possibility of
-improving the statistical quality of a sketch or updating a sketch to
-incorporate more data.
-
-
-Increase sketch size by adding long-axis vectors.
-=================================================
-
-    Suppose :math:`\mtx{S}_1` is a *wide* :math:`d_1 \times m` row-wise
-    :math:`\denseskop` with seed :math:`c`. It's easy to generate a
-    :math:`d_2\times m` row-wise :math:`\denseskop` :math:`\mtx{S}_2` in such a way that
-    :math:`\mtx{S} = [\mtx{S}_1; \mtx{S}_2]` is the same as the :math:`(d_1 + d_2) \times m` row-wise
-    :math:`\denseskop` with seed :math:`c`.
-
-    This scenario arises if we have a fixed data matrix :math:`\mtx{A}`, an initial
-    sketch :math:`\mtx{B}_1 = \mtx{S}_1 \mtx{A}`, and we decide we want a larger sketch for
-    statistical reasons. The updated sketch :math:`\mtx{B} = \mtx{S} \mtx{A}` can be expressed as
-
-    .. math::
-
-        \mtx{B} = \begin{bmatrix} \mtx{S}_1 \mtx{A} \\ \mtx{S}_2 \mtx{A} \end{bmatrix}.
-
-    Now suppose :math:`\mtx{S}_1` a *tall* :math:`n \times d_1` column-wise :math:`\denseskop`
-    with seed :math:`c`. We can easily generate an :math:`n\times d_2` column-wise
-    :math:`\denseskop` :math:`\mtx{S}_2` so that :math:`\mtx{S} = [\mtx{S}_1, \mtx{S}_2]` is the same
-    as the :math:`d \times (n_1 + n_2)` column-wise :math:`\denseskop` with seed :math:`c`.
-
-    This arises we have a fixed data matrix :math:`\mtx{A}`, an initial sketch :math:`\mtx{B}_1 = \mtx{A} \mtx{S}_1`,
-    and we decide we want a larger sketch for statistical reasons. The
-    updated sketch :math:`\mtx{B} = \mtx{A}\mtx{S}` can be expressed as
-
-    .. math::
-
-        \mtx{B} = \begin{bmatrix} \mtx{A} \mtx{S}_1 & \mtx{A} \mtx{S}_2 \end{bmatrix}.
-
-If :math:`(\mtx{S}_1, \mtx{S}_2, \mtx{S})` satisfy the assumptions above, then the final sketch
-:math:`B` will be the same regardless of whether we computed the sketch in one
-step or two steps. This is desirable for benchmarking and debugging
-RandNLA algorithms where the sketch size is a tuning parameter.
-
-
-Accomodate more data by adding short-axis vectors.
-==================================================
-
-    Suppose :math:`\mtx{S}_1` is a *tall* :math:`n_1 \times d` row-wise
-    :math:`\denseskop` with seed :math:`c`. It's easy to generate an :math:`n_2\times d`
-    row-wise :math:`\denseskop` :math:`\mtx{S}_2` in such a way that
-    :math:`\mtx{S} = [\mtx{S}_1; \mtx{S}_2]` is the same as the :math:`(n_1 + n_2) \times d` row-wise
-    :math:`\denseskop` with seed :math:`c`.
-
-    This situation arises if we have an initial data matrix :math:`\mtx{A}_1`, an initial sketch
-    :math:`\mtx{B}_1 = \mtx{A}_1 \mtx{S}_1`, and then a new matrix :math:`\mtx{A}_2` arrives in such a way that we
-    want a sketch of :math:`\mtx{A} = [\mtx{A}_1, \mtx{A}_2]`. To compute :math:`\mtx{B} = \mtx{A}\mtx{S}`, we update :math:`\mtx{B}_1`
-    according to the formula
-
-    .. math::
-
-        \mtx{B} = \begin{bmatrix} \mtx{A}_1 & \mtx{A}_2 \end{bmatrix} \begin{bmatrix} \mtx{S}_1 \\ \mtx{S}_2 \end{bmatrix} = \mtx{B}_1 + \mtx{A}_2 \mtx{S}_2.
-
-    Now, suppose instead :math:`\mtx{S}_1` is a *wide* :math:`d \times m_1` column-wise
-    :math:`\denseskop` with seed :math:`c`. It's easy to generate a
-    :math:`d \times m_2` column-wise :math:`\denseskop` :math:`\mtx{S}_2` so that
-    :math:`\mtx{S} = [\mtx{S}_1, \mtx{S}_2]` is the same as the :math:`d \times (m_1 + m_2)` column-wise
-    :math:`\denseskop` with seed :math:`c`.
-
-    This situation arises if we have an initial data matrix :math:`\mtx{A}_1`, an
-    initial sketch :math:`\mtx{B}_1 = \mtx{S}_1 \mtx{A}_1`, and then a new matrix :math:`\mtx{A}_2` arrives in
-    such a way that we want a sketch of :math:`A = [\mtx{A}_1; \mtx{A}_2]`. To compute :math:`\mtx{B} = \mtx{S}\mtx{A}`,
-    we update :math:`\mtx{B}_1` according to the formula
-
-    .. math::
-
-        \mtx{B} = \begin{bmatrix} \mtx{S}_1 & \mtx{S}_2 \end{bmatrix} \begin{bmatrix} \mtx{A}_1 \\ \mtx{A}_2 \end{bmatrix} = \mtx{B}_1 + \mtx{S}_2 \mtx{A}_2.
-
-If :math:`(\mtx{S}_1, \mtx{S}_2, \mtx{S})` satisfy the assumptions above, then :math:`\mtx{B}` will be the
-same as though we started with all of :math:`\mtx{A}` from the very beginning. This
-is useful for benchmarking and debugging RandNLA algorithms that involve
-operating on data matrices that increase in size over some number of iterations.
-
-
-Porque no los dos? Work with giant, implicit operators.
-==========================================================
-
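For reference, the update scheme described in the removed tutorial reduces to a
few lines of code: seed the second operator with the first operator's next_state.
The following is a minimal sketch of the first scenario (a wide, row-wise
DenseSkOp is assumed, the dimensions are illustrative, and the exact constructor
and sketch_general signatures may differ across RandBLAS versions)::

    int64_t m = 1000, n = 200, d1 = 50, d2 = 25;
    std::vector<double> A(m * n, 1.0), B1(d1 * n, 0.0), B2(d2 * n, 0.0);

    RandBLAS::DenseDist D1(d1, m);
    RandBLAS::RNGState seed_state(0);
    RandBLAS::DenseSkOp<double> S1(D1, seed_state);
    RandBLAS::sketch_general(blas::Layout::ColMajor, blas::Op::NoTrans, blas::Op::NoTrans,
                             d1, n, m, 1.0, S1, A.data(), m, 0.0, B1.data(), d1);

    // The key step: S2 picks up exactly where S1's random stream left off,
    // so [S1; S2] matches the (d1 + d2) x m operator with the same seed.
    RandBLAS::DenseDist D2(d2, m);
    RandBLAS::DenseSkOp<double> S2(D2, S1.next_state);
    RandBLAS::sketch_general(blas::Layout::ColMajor, blas::Op::NoTrans, blas::Op::NoTrans,
                             d2, n, m, 1.0, S2, A.data(), m, 0.0, B2.data(), d2);
    // Stacking B1 on top of B2 gives the larger sketch [S1; S2] * A.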