Skip to content

Commit

Permalink
set next_state on construction, web documentation, more stat tests an…
Browse files Browse the repository at this point in the history
…d testing infrastructre, API changes for DenseSkOp and SparseSkOp, more C++20 concepts (#104)

This PR is huge. I'm going to merge it even though there are outstanding
documentation TODOs, as indicated in Issue #107.

## Changes in RandBLAS proper

Major new identifiers
* SketchingDistribution and SketchingOperator C++20 concepts. 
* isometry_scale_factor (overloaded for DenseSkOp and SparseSkOp),
returns the scale that should canonically be applied to the full
sketching operator assuming its entries are sampled in RandBLAS'
standard way (uniform $\sqrt{3}[-1,1]$ or Gaussian(0, 1) for dense, or
length-vec_nnz vectors of iid Rademachers for sparse).
* dense::compute_next_state and sparse::compute_next_state, for
computing the appropriate value of ``S.next_state`` in O(1) time. This
enables defining sequences of sketching operators with the same key
without having to explicitly generate the operators.

Minor new identifiers
* For dense sketching operators, submatrix_as_blackbox accepts a
sketching operator, row and column counts, and row and column offsets,
and returns a DenseSkOp with a BlackBox distribution whose entries match
those of the appropriate submatrix of the given sketching operator.
* safe_int_product (in RandBLAS/base.hh) checks for overflows. This is
useful when computing potentially-large index offsets for implicitly
defined matrices.
* util::omatcopy for general stride changes
* util::overwrite_triangle, for preparing a triangular matrix to be
treated as a general matrix in gemm or trmm.
* A new value MajorAxis::Unknown in the MajorAxis enum, for use with
BlackBox distribution families.
* An overload of ``repeated_fisher_yates`` that skips writes to ``vals``
and ``idxs_minor`` and that has more intuitive argument names.

Other changes
* The default scalar distribution for DenseSkOp objects is now Gaussian
instead of uniform.
* The default scaling of uniform sketching operators is
$[-\sqrt{3},\sqrt{3}]$, since this choice provides for unit variance.
* rewrite of dense::fill_dense_submat_impl, needed in order to
efficiently implement dense::compute_next_state.
* In SparseSkOp, the typename ``RNG_t`` has been removed, a typename
``state_t`` has been added (equal to ``RNGState<RNG>``) and the typename
``T_t`` typename has been renamed to ``scalar_t``. Similar changes have
been made to ``DenseSkOp`` for consistency across the two APIs.
* Changed the verbose overload of``fill_dense`` for writing an implicit
submatrix to a given buffer. This function now requires a layout
parameter as its first argument. It performs an out-of-place change of
layout if the requested layout is different from the natural layout
implied by the distribution object.
* Moved several function definitions for DenseSkOp and SparseSkOp into
the struct definitions, rather than only declaring signatures in the
struct definition and implementing the functions outside the struct.

## Changes in documentation

* Added ``sample_indices_iid_uniform``, ``sample_indicies_iid``, and
``weights_to_cdf`` to a new page in the API reference.
* Added a section to the tutorial on updating sketches.

## Changes in tests

New / significantly changed tests
* test_continuous.cc: Kolmogorov-Smirnov tests for distributional
correctness of uniform [-sqrt(3),sqrt(3)] and Gaussian sampling. This
includes a helper function for exactly computing the KS test statistic
for distributions whose CDF is continuous, which has some surprising
subtleties.
* test_r123.cc: a significant clean up of (and slight expansion of)
test_r123_kat.cc.
* test_distortion.cc: statistical tests for subspace embedding
distortion of dense sketching operators. This required a bunch of new
infrastructure, described below.

``handrolled_lapack.hh`` implements algorithms that would generally be
considered "LAPACK-level," although the algorithms won't be found in
LAPACK itself.
* cholesky (sequential and blocked)
* cholesky qr (with a flag to run a second pass)
* QR based on block classic Gram-Schmidt, where blocks are handled by
cholesky qr2.
* QR with a two-pass version of blocked classic Gram-Schmidt. (Not
equivalent to modified Gram-Schmidt.)
* A function to check if the outer-most Gershgorin discs for a positive
definite matrix have converged to within some specified relative radius.
* cholesky iteration to compute all eigenvalues of a positive definite
matrix (with termination criteria based only on extremal eigenvalues).
This is actually prohibitively slow except when the matrix is very
small. But it works, and it's tested, so might as well keep it around.
* power method to compute the extremal eigenvalues of a positive
definite matrix. The number of iterations is chosen based on a
well-known convergence result for the power method in the absence of a
spectral gap. The smallest eigenvalue is obtained by Cholesky
decomposing the matrix, explicitly forming its inverse, and then running
the standard power method. This leads to faster performance for matrix
dimensions and tolerances that we need in statistical tests.
  • Loading branch information
rileyjmurray authored Aug 22, 2024
1 parent 826c9bb commit 614c1c2
Show file tree
Hide file tree
Showing 36 changed files with 2,296 additions and 1,002 deletions.
108 changes: 90 additions & 18 deletions RandBLAS/base.hh
Original file line number Diff line number Diff line change
Expand Up @@ -118,14 +118,35 @@ template<typename T>
concept SignedInteger = (std::numeric_limits<T>::is_signed && std::numeric_limits<T>::is_integer);


template <SignedInteger TI, SignedInteger TO = int64_t>
inline TO safe_int_product(TI a, TI b) {
if (a == 0 || b == 0) {
return 0;
}
TO c = a * b;
TO b_check = c / a;
TO a_check = c / b;
if ((a_check != a) || (b_check != b)) {
std::stringstream s;
s << "Overflow when multiplying a (=" << a << ") and b(=" << b << "), which resulted in " << c << ".\n";
throw std::overflow_error(s.str());
}
return c;
}


enum class MajorAxis : char {
// ---------------------------------------------------------------------------
/// short-axis vectors (cols of a wide matrix, rows of a tall matrix)
Short = 'S',

// ---------------------------------------------------------------------------
/// long-axis vectors (rows of a wide matrix, cols of a tall matrix)
Long = 'L'
Long = 'L',

// ---------------------------------------------------------------------------
/// Undefined (used when row-major vs column-major must be explicit)
Undefined = 'U'
};


Expand All @@ -134,28 +155,38 @@ enum class MajorAxis : char {
* the counter and the key. The arrays' types are statically sized, small
* (typically of length 2 or 4), and can be distinct from one another.
*
* @tparam RNG A CBRNG type in defined in Random123. We've found that Philox-based
* CBRNGs work best for our purposes. Strictly speaking, we allow all Random123 CBRNGs
* besides those based on AES.
* The template parameter RNG is a CBRNG type in defined in Random123. We've found
* that Philox-based CBRNGs work best for our purposes, but we also support Threefry-based CBRNGS.
*/
template <typename RNG = r123::Philox4x32>
struct RNGState
{
using generator = RNG;
// The unsigned integer type used in this RNGState's counter array.
using ctr_uint_type = typename RNG::ctr_type::value_type;
/// @brief The unsigned integer type used in this RNGState's key array.
/// This is typically std::uint32_t, but it can be std::uint64_t.
using key_uint_type = typename RNG::key_type::value_type;
// An array type defined in Random123.

using ctr_type = typename RNG::ctr_type;
// An array type defined in Random123.
// ^ An array type defined in Random123.
using key_type = typename RNG::key_type;
// ^ An array type defined in Random123.
using ctr_uint = typename RNG::ctr_type::value_type;
// ^ The unsigned integer type used in this RNGState's counter array.

/// -------------------------------------------------------------------
/// @brief The unsigned integer type used in this RNGState's key array.
/// This is typically std::uint32_t, but it can be std::uint64_t.
using key_uint = typename RNG::key_type::value_type;


const static int len_c = RNG::ctr_type::static_size;
const static int len_k = RNG::key_type::static_size;
typename RNG::ctr_type counter; ///< This RNGState's counter array.
typename RNG::key_type key; ///< This RNGState's key array.
typename RNG::ctr_type counter;
// ^ This RNGState's counter array.

/// ------------------------------------------------------------------
/// This RNGState's key array. If you want to manually advance the key
/// by an integer increment of size "step," then you do so by calling
/// this->key.incr(step).
typename RNG::key_type key;


/// Initialize the counter and key arrays to all zeros.
RNGState() : counter{{0}}, key(key_type{{}}) {}
Expand All @@ -171,7 +202,7 @@ struct RNGState

/// Initialize the counter array to all zeros. Initialize the key array to have first
/// element equal to k and all other elements equal to zero.
RNGState(key_uint_type k) : counter{{0}}, key{{k}} {}
RNGState(key_uint k) : counter{{0}}, key{{k}} {}

~RNGState() {};

Expand All @@ -187,16 +218,16 @@ template <typename RNG>
RNGState<RNG>::RNGState(
const RNGState<RNG> &s
) {
std::memcpy(this->counter.v, s.counter.v, this->len_c * sizeof(ctr_uint_type));
std::memcpy(this->key.v, s.key.v, this->len_k * sizeof(key_uint_type));
std::memcpy(this->counter.v, s.counter.v, this->len_c * sizeof(ctr_uint));
std::memcpy(this->key.v, s.key.v, this->len_k * sizeof(key_uint));
}

template <typename RNG>
RNGState<RNG> &RNGState<RNG>::operator=(
const RNGState &s
) {
std::memcpy(this->counter.v, s.counter.v, this->len_c * sizeof(ctr_uint_type));
std::memcpy(this->key.v, s.key.v, this->len_k * sizeof(key_uint_type));
std::memcpy(this->counter.v, s.counter.v, this->len_c * sizeof(ctr_uint));
std::memcpy(this->key.v, s.key.v, this->len_k * sizeof(key_uint));
return *this;
}

Expand All @@ -219,4 +250,45 @@ std::ostream &operator<<(
return out;
}

// =============================================================================
/// @verbatim embed:rst:leading-slashes
///
/// .. NOTE: \ttt expands to \texttt (its definition is given in an rst file)
///
/// Words. Hello!
///
/// @endverbatim
template<typename SkDist>
concept SketchingDistribution = requires(SkDist D) {
{ D.n_rows } -> std::convertible_to<const int64_t>;
{ D.n_cols } -> std::convertible_to<const int64_t>;
{ D.major_axis } -> std::convertible_to<const MajorAxis>;
};

// =============================================================================
/// \fn isometry_scale_factor(SkDist D)
/// @verbatim embed:rst:leading-slashes
/// Words here ...
/// @endverbatim
template <typename T, SketchingDistribution SkDist>
inline T isometry_scale_factor(SkDist D);


// =============================================================================
/// @verbatim embed:rst:leading-slashes
///
/// .. NOTE: \ttt expands to \texttt (its definition is given in an rst file)
///
/// Words. Hello!
///
/// @endverbatim
template<typename SkOp>
concept SketchingOperator = requires(SkOp S) {
{ S.n_rows } -> std::convertible_to<const int64_t>;
{ S.n_cols } -> std::convertible_to<const int64_t>;
{ S.seed_state } -> std::convertible_to<const typename SkOp::state_t>;
{ S.seed_state } -> std::convertible_to<const typename SkOp::state_t>;
};


} // end namespace RandBLAS::base
Loading

0 comments on commit 614c1c2

Please sign in to comment.