Sbromberger/fix templates #263

Closed
wants to merge 53 commits into from
Commits
41019f0
Updating Github Actions to run on pull requests against our new dev b…
steiltre Apr 16, 2024
8966113
Actions Update to GCC and Ubuntu Versions (#202)
steiltre Apr 19, 2024
25050b2
Change supported arrow version (v15 <= Arrow <= v16). (#204)
KIwabuchi Apr 29, 2024
444bcdc
Rewrites ygm::container::disjoint_set to use union by rank with path-…
steiltre May 10, 2024
a359661
Changes MPI_Waitsome to MPI_Testsome loop and reworks ygm::comm::hand…
steiltre May 18, 2024
6a6f4ae
Create cereal_boost_container.hpp (#209)
KIwabuchi May 28, 2024
2ef311a
Add Test for Cerealizing Boost Vector (#210)
KIwabuchi Jun 9, 2024
7abf888
Parquet-Tools (rowcount, schema, and dump) (#203)
KIwabuchi Jun 13, 2024
3c16820
Checks parent has not changed before updating in ygm::container::disj…
steiltre Jun 17, 2024
a9371e2
Feature/parquet tools (#213)
KIwabuchi Jun 21, 2024
55e5df2
Add Prototype for Doxygen & RTD (#217)
KIwabuchi Jul 27, 2024
1c16ab2
Feature/containers (#216)
rogerpearce Jul 27, 2024
caa0d22
(RTD) Bugfix: specify os and python versions (#219)
KIwabuchi Jul 29, 2024
c0e3b88
Feature/array overhaul (#220)
steiltre Jul 29, 2024
a10a0a5
Adds ygm::container::array::sort() (#221)
steiltre Jul 30, 2024
0c08e40
Fixes bug in block_partitioner's large block size affecting small siz…
steiltre Jul 30, 2024
4df0153
Fixes bug in comm::barrier() that can cause asyncs to be spawned from…
steiltre Aug 1, 2024
d82d41c
Fixes bug in block_partitioner (#225)
steiltre Aug 1, 2024
71bdcad
Getting started documentation and reworks README (#226)
steiltre Aug 1, 2024
73df3fc
Makes dereferencing a const ygm_ptr<T> return a const pointer to non-…
steiltre Aug 1, 2024
2867752
CSV Parser Headers (#214)
steiltre Aug 1, 2024
9e5da3d
Updated container constructors. (#227)
rogerpearce Aug 2, 2024
6f0c787
Added prefix YGM_ to ASSERT macros. (#228)
rogerpearce Aug 2, 2024
6da544a
Hotfix (#230)
rogerpearce Aug 2, 2024
75e311b
Feature/container docs (#229)
steiltre Aug 2, 2024
f84714d
Renamed key_gather to gather_keys. (#231)
rogerpearce Aug 2, 2024
32fcffa
Adds batch erase functionality to ygm::container::multiset (#232)
steiltre Aug 3, 2024
2b6ae17
Added gather_topk, separated base_iteration. (#233)
rogerpearce Aug 3, 2024
f4067e5
Feature/batch erase specializations (#234)
steiltre Aug 3, 2024
88e26f5
hotfix counting_set constructors (#235)
rogerpearce Aug 3, 2024
6e09f32
Adds back in test_collective (#236)
steiltre Aug 3, 2024
a4cba5a
Adds batch erase for associative containers from value containers of …
steiltre Aug 3, 2024
c15e50f
Fixed reduce() and added test. (#237)
rogerpearce Aug 3, 2024
27a7acd
renamed transform & added test. (#239)
rogerpearce Aug 3, 2024
7b74bae
added comm() (#240)
rogerpearce Aug 5, 2024
0216aba
Adds check for async lambdas attempting to capture (#243)
steiltre Aug 11, 2024
10c6778
Captures lambdas for execution inside container wrapper lambdas (#244)
steiltre Aug 12, 2024
46feb90
Removing all clever tricks and introducing duplicate code so doxygen …
steiltre Aug 29, 2024
d5be689
Updating readthedocs to use Ubuntu 24.04 when building docs to get ne…
steiltre Aug 30, 2024
9a72722
Adds missing is_standard_layout check to YGM_CHECK_ASYNC_LAMBDA_COMPL…
steiltre Aug 30, 2024
24ce1d8
Feature/reducing adapter update (#253)
steiltre Sep 1, 2024
53146e5
Feature/container functor visit (#255)
steiltre Sep 1, 2024
0a32df9
Calls derived_this->for_all() in ygm::container::base_iteration::gath…
steiltre Sep 9, 2024
6521bb0
Adds ability to read node-local files using prefix 'local://' inside …
steiltre Sep 9, 2024
c003a8f
Adds async_reduce functionality to ygm::container::array (#259)
steiltre Sep 13, 2024
4b86062
Adds additional tests for transform() in combination with filter() (#…
steiltre Sep 13, 2024
21a90ef
fixing likely typo in communicator (#261)
ryan-dozier Sep 16, 2024
0f83660
Lambda Compliance Macro Cleanup (#246)
steiltre Sep 16, 2024
8d51dd3
Find latest arrow always (#248)
KIwabuchi Sep 16, 2024
41e4681
Find or Install Parquet (#249)
KIwabuchi Sep 16, 2024
065a31f
Adds missing comm() function to disjoint_set (#262)
steiltre Sep 17, 2024
5941129
simplified using auto
sbromberger Sep 17, 2024
b418108
fixed template redefinition errors in build
sbromberger Sep 24, 2024
25 changes: 14 additions & 11 deletions .github/workflows/ci-test.yaml
Expand Up @@ -2,7 +2,7 @@ name: CI Test

on:
pull_request:
branches: [ master, develop ]
branches: [ master, '**-dev' ]
push:
branches: [ 'feature/**', 'hotfix/**']

Expand All @@ -14,17 +14,17 @@ jobs:
strategy:
fail-fast: false
matrix:
os: [ubuntu-latest, ubuntu-20.04]
gcc-version: [8, 9, 10]
os: [ubuntu-latest, ubuntu-22.04]
gcc-version: [11, 12]
mpi-type: [mpich, openmpi]
exclude:
- os: ubuntu-latest
gcc-version: 8
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Cache boost
uses: actions/cache@v3
uses: actions/cache@v4
id: cache-boost
with:
path: "~/boost_1_77_0"
Expand All @@ -36,12 +36,12 @@ jobs:
wget --no-verbose https://boostorg.jfrog.io/artifactory/main/release/1.77.0/source/boost_1_77_0.tar.bz2
tar -xjf boost_1_77_0.tar.bz2
- name: Install Apache Arrow
if: ${{ matrix.os == 'ubuntu-latest'}} # Skip on ubuntu-20.04 due to pyarrow's ABI incompatibility issue
run: |
cd ~
wget https://apache.jfrog.io/artifactory/arrow/$(lsb_release --id --short | tr 'A-Z' 'a-z')/apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb
sudo apt-get install ./apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb
sudo apt-get update
sudo apt-get install libarrow-dev libparquet-dev
python -m pip install pyarrow==16.1.*
PIP_PYARROW_ROOT=$(python -c "import pyarrow as pa; print(pa.get_library_dirs()[0])")
echo "PIP_PYARROW_ROOT=${PIP_PYARROW_ROOT}" >> $GITHUB_ENV
echo "ENABLE_PARQUET_TEST=true" >> $GITHUB_ENV
- name: Install mpich
if: matrix.mpi-type == 'mpich'
run: sudo apt-get install mpich
Expand All @@ -58,7 +58,10 @@ jobs:
g++-${{ matrix.gcc-version }} --version
mkdir build
cd build
cmake ../ -DCMAKE_BUILD_TYPE=${{ env.BUILD_TYPE }} -DCMAKE_CXX_COMPILER=g++-${{ matrix.gcc-version }} -DBOOST_ROOT=~/boost_1_77_0 -DYGM_REQUIRE_ARROW=ON
if [ "$ENABLE_PARQUET_TEST" ]; then
ARROW_CMAKE_OPTION="-DPIP_PYARROW_ROOT=$PIP_PYARROW_ROOT -DYGM_REQUIRE_ARROW_PARQUET=ON"
fi
cmake ../ -DCMAKE_BUILD_TYPE=${{ env.BUILD_TYPE }} -DCMAKE_CXX_COMPILER=g++-${{ matrix.gcc-version }} -DBOOST_ROOT=~/boost_1_77_0 ${ARROW_CMAKE_OPTION}
make -j
- name: Make test (mpich)
if: matrix.mpi-type == 'mpich'
Expand Down
4 changes: 3 additions & 1 deletion .gitignore
@@ -1,4 +1,6 @@
*~
*#*
build*
.vscode*
.vscode*
.idea*
.cache/
16 changes: 16 additions & 0 deletions .readthedocs.yaml
@@ -0,0 +1,16 @@
version: 2

# Set OS and Python versions
build:
os: ubuntu-24.04
tools:
python: "3.12"

# Change the location of the requirements file
python:
install:
- requirements: docs/rtd/requirements.txt

# Change the location of the configuration file
sphinx:
configuration: docs/rtd/conf.py
71 changes: 21 additions & 50 deletions CMakeLists.txt
Expand Up @@ -12,6 +12,8 @@ project(
LANGUAGES CXX
)

list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/cmake")

if (CMAKE_PROJECT_NAME STREQUAL PROJECT_NAME)
set(YGM_MAIN_PROJECT ON)
endif ()
Expand All @@ -32,13 +34,19 @@ if (YGM_MAIN_PROJECT)
# main CMakeLists.
include(CTest)

# Docs only available if this is the main app
find_package(Doxygen)
if (Doxygen_FOUND)
# add_subdirectory(docs)
else ()
message(STATUS "Doxygen not found, not building docs")
endif ()
#
# Doxygen and Read the Docs (RTD) Documentation
#
option(YGM_DOXYGEN "Add targets for generating documentations" OFF)
# Generate documentation for Read the Docs
option(YGM_RTD_ONLY
"Run CMake for only generating documentations for Read the Docs" OFF)
if (YGM_DOXYGEN OR YGM_RTD_ONLY)
add_subdirectory(docs)
if (YGM_RTD_ONLY)
return()
endif ()
endif()
endif ()

# Require out-of-source builds
Expand Down Expand Up @@ -73,7 +81,7 @@ if (NOT cereal_FOUND)
FetchContent_Declare(
cereal
GIT_REPOSITORY https://github.com/USCiLab/cereal.git
GIT_TAG af0700efb25e7dc7af637b9e6f970dbb94813bff
GIT_TAG v1.3.2
)
FetchContent_GetProperties(cereal)
if (cereal_POPULATED)
Expand Down Expand Up @@ -134,54 +142,16 @@ endif ()
# Arrow
#
#
find_package(Arrow 8.0 QUIET)
if (NOT Arrow_FOUND)
find_package(Arrow 9.0 QUIET)
endif()
if (NOT Arrow_FOUND)
find_package(Arrow 10.0 QUIET)
endif()
if (NOT Arrow_FOUND)
find_package(Arrow 11.0 QUIET)
endif()
if (NOT Arrow_FOUND)
find_package(Arrow 12.0 QUIET)
endif()
if (NOT Arrow_FOUND)
find_package(Arrow 13.0 QUIET)
endif()
if (NOT Arrow_FOUND)
find_package(Arrow 14.0 QUIET)
endif()
if (NOT Arrow_FOUND)
find_package(Arrow 15.0 QUIET)
endif()
if (Arrow_FOUND)
message(STATUS ${PROJECT_NAME} " found Arrow ")
message(STATUS "Arrow version: ${ARROW_VERSION}")
message(STATUS "Arrow SO version: ${ARROW_FULL_SO_VERSION}")
set(ARROW_CMAKE_DIR ${Arrow_DIR})
find_package(Parquet PATHS ${ARROW_CMAKE_DIR})
if (Parquet_FOUND)
message(STATUS ${PROJECT_NAME} " found Parquet ")
message(STATUS "Parquet version: ${PARQUET_VERSION}")
message(STATUS "Parquet SO version: ${PARQUET_FULL_SO_VERSION}")
else ()
message(WARNING ${PROJECT_NAME} " did not find Parquet. Building without Parquet.")
endif ()
else ()
message(WARNING ${PROJECT_NAME} " did not find Arrow >= 8.0. Building without Arrow.")
if (YGM_REQUIRE_ARROW)
message(FATAL_ERROR "YGM configured to require Arrow, but Arrow could not be found")
endif ()
endif ()
include(FindArrowParquet)
option(YGM_REQUIRE_ARROW_PARQUET "YGM requires Apache Arrow Parquet." OFF)
find_or_install_arrow_parquet()

#
# Create the YGM target library
#
add_library(ygm INTERFACE)
add_library(ygm::ygm ALIAS ygm)
target_compile_features(ygm INTERFACE cxx_std_17)
target_compile_features(ygm INTERFACE cxx_std_20)
target_include_directories(
ygm INTERFACE $<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}/include>
$<INSTALL_INTERFACE:include>
Expand Down Expand Up @@ -255,4 +225,5 @@ if (YGM_MAIN_PROJECT)
add_subdirectory(test)
# Example codes are here.
add_subdirectory(examples)
add_subdirectory(tools)
endif ()
133 changes: 57 additions & 76 deletions Readme.md
@@ -1,33 +1,70 @@
# What is YGM?

YGM is an asynchronous communication library designed for irregular communication patterns. It is built on a
communicator abstraction, much like MPI, but communication is handled asynchronously and is initiated by senders without
any interaction with receivers. YGM features
* **Message buffering** - Increases application throughput.
* **Fire-and-Forget RPC Semantics** - A sender provides the function and function arguments for execution on a specified
destination rank through an `async` call. This function will complete on the destination rank at an unspecified time
in the future, but YGM does not explicitly make the sender aware of this completion.
* **Storage Containers** - YGM provides a collection of distributed storage containers with asynchronous
interfaces, used for many common distributed memory operations. Containers are designed to partition data, allowing
insertions to occur from any rank. Data is accessed through collective `for_all` operations that execute a user-provided
function on every stored object, or, when a particular piece of data's location is known, `visit`-type operations that
perform a user-provided function only on the desired data. These containers are found
[here](/include/ygm/container/).

# Getting Started
## What is YGM?

YGM is an asynchronous communication library written in C++ and designed for high-performance computing (HPC) use cases featuring
irregular communication patterns. YGM includes a collection of
distributed-memory storage containers designed to express common algorithmic and data-munging tasks. These containers
automatically partition data, allowing insertions and, with most containers, processing of individual elements to be
initiated from any running YGM process.

Underlying YGM's containers is a communicator abstraction. This communicator asynchronously sends messages spawned by
senders with receivers needing no knowledge of incoming messages prior to their arrival. YGM communications take the
form of *active messages*; each message contains a function object to execute (often in the form of C++ lambdas), data
and/or pointers to data for this function to execute on, and a destination process for the message to be executed at.
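
The active-message idea described above can be illustrated without MPI. The following is a toy, single-process sketch and not YGM's implementation: `toy_world`, `drain`, and `value_at` are hypothetical names, each "rank" is just a mailbox in one process, and nothing is serialized. It only shows that a message bundles a function, its arguments, and a destination, and that the function runs later at the destination.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Toy model (hypothetical, not YGM): every "rank" owns one integer of local
// state and a mailbox of pending active messages.
class toy_world {
 public:
  explicit toy_world(std::size_t num_ranks)
      : mailboxes_(num_ranks), values_(num_ranks, 0) {}

  // Queue fn(local_state_of_dest, args...) for later execution at `dest`.
  // The sender is never told when the message completes.
  template <typename Fn, typename... Args>
  void async(std::size_t dest, Fn fn, Args... args) {
    mailboxes_.at(dest).push(
        [this, dest, fn, args...]() { fn(values_[dest], args...); });
  }

  // Execute every queued message, one mailbox at a time.
  void drain() {
    for (auto& mailbox : mailboxes_) {
      while (!mailbox.empty()) {
        auto message = std::move(mailbox.front());
        mailbox.pop();
        message();
      }
    }
  }

  int value_at(std::size_t rank) const { return values_.at(rank); }

 private:
  std::vector<std::queue<std::function<void()>>> mailboxes_;
  std::vector<int> values_;
};
```

In this sketch, calling `world.async(1, [](int& local_value, int amount) { local_value += amount; }, 5)` leaves rank 1's value untouched until `world.drain()` runs the queued message.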

YGM also includes a set of I/O primitives for parsing collections of input documents in parallel as independent lines of
text and streaming output lines to
large numbers of destination files. Current parsing functionality supports reading input as CSV, ndjson, and
unstructured lines of data.

## General YGM Operations

YGM is built on its ability to communicate active messages asynchronously between running processes. This does not
capture every operation that can be useful; for instance, collective operations are still widely needed. YGM uses
prefixes on function names to distinguish their behaviors in terms of the processes involved. These prefixes are:
* `async_`: Asynchronous operation initiated on a single process. The execution of the underlying function may
occur on a remote process.
* `local_`: Function performs only local operations on data of the current process. In uses within YGM containers
with partitioning schemes that determine item ownership, care must be taken to ensure the process a `local_`
operation is called from aligns with the item's owner. For instance, calling `ygm::container::map::local_insert`
will store an item on the process where the call is made, but the `ygm::container::map` may not be able to look
up this location if it is on the wrong process.
* No Prefix: Collective operation that must be called from all processes.

The primary workhorse functions in YGM fall into the two categories of `async_` and `for_all` operations. In an
`async_` operation, a lambda is asynchronously sent to a (potentially) remote process for execution. In many cases
with YGM containers, the lambda being executed is not provided by the user and is instead part of the function itself,
e.g. `async_insert` calls on most containers. A `for_all` operation is a collective operation in which a lambda is
executed locally on every process while iterating over all locally held items of some YGM object. The items iterated
over can be items in a YGM container, items coming from a map, filter, or flatten applied to a container, or all lines
in a collection of files in a YGM I/O parser.

### Lambda Capture Rules
Certain `async_` and `for_all` operations require users to provide lambdas as part of their executions. The lambdas
that can be accepted by these two classes of functions follow different rules pertaining to the capturing of variables:
* `async_` calls cannot capture (most) variables in lambdas. Variables necessary for lambda execution must be
provided as arguments to the `async_` call. In the event that the data for the lambda resides on the remote
process the lambda will execute on, a `ygm::ygm_ptr` should be passed as an argument to the `async_`.
* `for_all` calls assume lambdas take only the arguments inherently provided by the YGM object being iterated over.
All other necessary variables *must* be captured. The types of arguments provided to the lambda can be identified
by the `for_all_args` type within the YGM object.
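
The no-capture rule for `async_` lambdas can be checked at compile time with standard type traits. The sketch below is an assumption about how such a check might look, not YGM's actual macro or API: `is_stateless_callable_v` and `run_like_async` are hypothetical names, and the trait results for closure types are what mainstream compilers report rather than something the C++ standard guarantees.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical helper (not part of YGM): true when a callable carries no
// captured state, so only its arguments would need to be serialized.
template <typename F>
constexpr bool is_stateless_callable_v =
    std::is_empty_v<std::remove_reference_t<F>> &&
    std::is_trivially_copyable_v<std::remove_reference_t<F>> &&
    std::is_standard_layout_v<std::remove_reference_t<F>>;

// Hypothetical stand-in for an `async_`-style entry point. It runs the
// callable locally, but rejects capturing lambdas at compile time.
template <typename F, typename... Args>
auto run_like_async(F fn, Args&&... args) {
  static_assert(is_stateless_callable_v<F>,
                "lambda must not capture; pass data as arguments instead");
  return fn(std::forward<Args>(args)...);
}
```

With this helper, a captureless lambda such as `[](const std::string& name) { return "Hello " + name; }` is accepted, while a lambda written as `[offset](int x) { return x + offset; }` fails the `static_assert`.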

These differences in behavior arise from the distinction that `async_` lambdas may execute on a remote process, while
`for_all` lambdas are guaranteed to execute locally to a process. In the case of `async_` operations, the lambda and
all arguments must be serialized for communication, but C++ does not provide a method for inspection of variables
captured in the closure of a lambda. In the case of `for_all` operations, the execution is equivalent to calling
[`std::for_each`](https://en.cppreference.com/w/cpp/algorithm/for_each) on the entire collection of items held locally.
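
The `for_all` side of the rule can be pictured with plain standard-library code. This is a minimal sketch under stated assumptions: `for_all_local` and `sum_above` are hypothetical names, and a `std::vector<int>` stands in for the items one process holds locally. The lambda receives only the item, so everything else it needs arrives through captures.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for iterating over locally held items: the callable
// is handed each item and nothing else.
template <typename Fn>
void for_all_local(const std::vector<int>& local_items, Fn fn) {
  std::for_each(local_items.begin(), local_items.end(), fn);
}

// Sums the local items above a threshold. Both `threshold` and `sum` reach
// the lambda through its capture list, not through its parameters.
inline long sum_above(const std::vector<int>& local_items, int threshold) {
  long sum = 0;
  for_all_local(local_items, [&sum, threshold](int item) {
    if (item > threshold) sum += item;
  });
  return sum;
}
```

Capturing by reference is safe in this sketch because the lambda never leaves the calling function, which mirrors the guarantee that `for_all` lambdas execute locally.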

## Requirements
* C++17 - GCC versions 8, 9 and 10 are tested. Your mileage may vary with other compilers.
* C++20 - GCC versions 11 and 12 are tested. Your mileage may vary with other compilers.
* [Cereal](https://github.com/USCiLab/cereal) - C++ serialization library
* MPI
* Optionally, Boost 1.77 to enable Boost.JSON support.


## Using YGM with CMake
YGM is a header-only library that is easy to incorporate into a project through CMake. Adding the following to
CMakeLists.txt will install YGM and its dependencies as part of your project:
```
set(DESIRED_YGM_VERSION 0.4)
set(DESIRED_YGM_VERSION 0.6)
find_package(ygm ${DESIRED_YGM_VERSION} CONFIG)
if (NOT ygm_FOUND)
FetchContent_Declare(
Expand All @@ -52,62 +89,6 @@ else ()
endif ()
```

# Anatomy of a YGM Program
Here we will walk through a basic "hello world" YGM program. The [examples directory](/examples/) contains several other
examples, including many using YGM's storage containers.

To begin, headers for a YGM communicator are needed
``` C++
#include <ygm/comm.hpp>
```

At the beginning of the program, a YGM communicator must be constructed. It will be given `argc` and `argv` like
`MPI_Init`, and it has an optional third argument that specifies the aggregate size (in bytes) allowed for all send
buffers before YGM begins flushing sends. Here, we will make a buffer with 32MB of aggregate send buffer space.
``` C++
ygm::comm world(&argc, &argv, 32*1024*1024);
```

Next, we need a lambda to send through YGM. We'll do a simple hello\_world type of lambda.
``` C++
auto hello_world_lambda = [](const std::string &name) {
std::cout << "Hello " << name << std::endl;
};
```

Finally, we use this lambda inside of our `async` calls. In this case, we will have rank 0 send a message to rank 1,
telling it to greet the world
``` C++
if (world.rank0()) {
world.async(1, hello_world_lambda, std::string("world"));
}
```

The full, compilable version of this example is found [here](/examples/hello_world.cpp). Running it prints a single
"Hello world".

# Potential Pitfalls

## Allowed Lambdas
There are two distinct classes of lambdas that can be given to YGM: *remote lambdas* and *local lambdas*, each of which
has different requirements.

### Remote Lambdas
A *remote lambda* is any lambda that may potentially be executed on a different rank. These lambdas are identified as
being those given to a `ygm::comm` or any of the storage containers through a function prefixed by `async_`.

The defining feature of remote lambdas is they **must not** capture any variables; all variables must be provided as
arguments. This limitation is due to the lack of
ability for YGM to inspect and extract these arguments when serializing messages to be sent to other ranks.

### Local Lambdas
A *local lambda* is any lambda that is guaranteed not to be sent to a remote rank. These lambdas are identified as being
those given to a `for_all` operation on a storage container.

The defining feature of local lambdas is that all arguments besides what is stored in the container must be captured.
Internally, these lambdas may be given to a [`std::for_each`](https://en.cppreference.com/w/cpp/algorithm/for_each) that
iterates over the container's elements stored locally on each rank.

# License
YGM is distributed under the MIT license.

Expand Down