Releases · GlobalArrays/ga

03 Nov 21:30

bjpalmer

v5.8.2

f7a3104

v5.8.2 Latest

[5.8.2]

Known Bugs
- The MPI RMA port remains unreliable: many tests in the test suite fail with
  many MPI implementations. Open MPI 4.1.4 currently works well and passes all
  tests.
Added
- Setting ARMCI_VERBOSE=1 at runtime will also dump configuration details for
  the ComEx runtime
Changed
- Updated compiler settings in CMake build if Fujitsu compilers are detected
Fixed
- Fixed gcc toolchain checks in CMake for clang build
- Fixed tiled arrays so that they work with restricted arrays and fixed some
  additional bugs in block cyclic distributions
- Removed several memory leaks
- Modified the check on the number of processors that is performed when a GA
  is created. Previously this check could fail because it was sometimes
  performed before a process group had been assigned to the global array.
- Fixed some issues with the hidden string length argument in the Fortran interface

14 Dec 15:48

bjpalmer

v5.8.1

3142f8f

v5.8.1

Known Bugs
Added
- Added support in MA for CUDA managed memory. Provided by Jeff Hammond.
- Added a GA_Deallocate function that deallocates memory but leaves GA in
  place. GA_Allocate can be called later on the handle. This can be used for
  memory management.
Changed
Fixed
- Slurm conflict for free_buf symbol in DRA library. Fixed by Michael Klemm.
- Deallocate GA_MPI_World_comm_dup in GA_Terminate.
- Dependency of CMake build on C++ is configurable.
- Improved CMake integration of linear algebra libraries

02 Nov 20:08

bjpalmer

v5.8

25a92a9

v5.8

Known Bugs
- The MPI RMA port remains unreliable for many MPI implementations. Open MPI
  still reports many failures in the test suite. Intel MPI is better but still
  reports several failures. It is recommended to use the latest MPI
  implementations available.
Added
- Version function that can be used to report the current version, subversion
  and patch numbers of the current release
- Overlay option for creating new GAs on top of existing GAs
- The number of progress ranks per node in the progress ranks runtime is now
  configurable
- Functions for duplicating process groups and returning a process group that
  only contains the calling process
- 64-bit versions of block-cyclic data distribution functions in the
  C interface
- Non-blocking test function
- Read-only property based on caching
- GA name can be recovered from handle
- Added profiling capabilities to the GA branch that automatically generate
  a log file in the running directory. Set the GAW_FILE_PREFIX environment
  variable to add a prefix to the log file names, and the GAW_FMT environment
  variable to choose between CSV and human readable output. The default
  format is human readable.
  - For autotools, add --enable-profile=1 in the configure line
  - For CMake add -DENABLE_PROFILING=ON
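
As a rough illustration of the two profiling environment variables described above, the following Python sketch builds the environment for launching a profiled GA program. The helper name `profiling_env` is hypothetical, and the accepted values for GAW_FMT are not stated in these notes, so whatever the caller supplies is passed through unchanged.

```python
import os


def profiling_env(file_prefix, fmt=None, base_env=None):
    """Return a copy of the environment with GA profiling variables set.

    file_prefix: prefix for the profiling log files (GAW_FILE_PREFIX).
    fmt: optional value for GAW_FMT. The accepted values are not listed in
         the release notes, so the value is passed through as given.
    base_env: mapping to start from; defaults to the current environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["GAW_FILE_PREFIX"] = str(file_prefix)
    if fmt is not None:
        env["GAW_FMT"] = str(fmt)
    return env
```

The returned mapping can be handed to `subprocess.run(cmd, env=env)`, where `cmd` is whatever MPI launch command is used on the system. The variables only have an effect if GA was built with profiling enabled.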
Changed
- Non-blocking handle management was completely revamped. This simplifies
  implementation and removes some bugs. The number of outstanding non-blocking
  calls was increased to 256
- Modified the internal function that computes the rank of processors on the
  world communicator so that it no longer uses MPI_Comm_translate_ranks.
  MPI_Comm_translate_ranks is implemented with a loop that scales as the
  square of the number of processors and is very slow at large processor counts
- Modified internal iterators so that block cyclic data distributions work on
  processor groups
- Improved CMake build
- Modified ga_print_distribution so that it works on block-cyclic data
  distributions
Fixed
- Fixed a non-blocking error that was showing up in nbtest.x

28 Feb 20:18

bjpalmer

v5.7.2

2585469

v5.7.2

Fixes
- Strided accumulates were accidentally set to use MPI datatypes in v5.7.1. This has been turned off.

28 Feb 19:32

bjpalmer

v5.7.1

d7d8c57

v5.7.1

Added
- Added NOUSE_MMAP for 32-bit Linux
Fixed
- pgcc: need to rename f77 object to cfortran_test.o to avoid pgcc overwriting conftest.o during linking
  http://www.nwchem-sw.org/index.php/Special:AWCforum/st/id2788
- Fixes for the ga_diag_std_seq 32-bit integer interface
- Fix for MKL error "PDSTEDC parameter number 10 had an illegal value" http://www.nwchem-sw.org/index.php/Special:AWCforum/st/id2660
- Fix for MPI_Type_struct and MPI_Errhandler_set, which are deprecated in MPI-2
Closed Issues
- [#157] add -fallow-argument-mismatch for gfortran 10

30 Mar 22:41

jeffdaily

v5.7

1537981

v5.7

Known Bugs
- Some combinations of MPI implementations with the MPI RMA and PR
  ports fail. Using the latest available MPI implementations is recommended.
Added
- Tiled data layout
- Read-only property type using replication across SMP nodes
Changed
- GA is now thread safe
- MPI3 implementation based on MPI RMA now uses data types in MPI
  calls by default. This is higher performing but not as reliable as
  using multiple contiguous data transfers. The build can be
  configured to use contiguous transfers if data types are not working
  for your MPI implementation.
- ComEx MPI-PR now uses MPI data types in strided put and get calls
  by default. To enable the old packed behavior, set the following
  environment variables to 0.
  - COMEX_ENABLE_PUT_DATATYPE
  - COMEX_ENABLE_GET_DATATYPE
  Additionally, the original packing implementation is faster for smaller
  messages. Two new environment variables control at which point the MPI
  data types are used.
  - COMEX_PUT_DATATYPE_THRESHOLD. Default 8192.
  - COMEX_GET_DATATYPE_THRESHOLD. Default 8192.
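
The interaction between the enable switches and the thresholds above can be modelled with a short Python sketch. This is not ComEx code: the function name `strided_transfer_mode` is hypothetical, the threshold is assumed to be a size in bytes, and whether the real comparison is inclusive at the threshold is an assumption.

```python
import os


def strided_transfer_mode(op, nbytes, env=None):
    """Model how a strided put/get might be routed in ComEx MPI-PR.

    op: "put" or "get".
    nbytes: message size, assumed to be in bytes.
    env: mapping of environment variables; defaults to os.environ.
    Returns "datatype" or "packed".
    """
    env = os.environ if env is None else env
    name = op.upper()
    if name not in ("PUT", "GET"):
        raise ValueError("op must be 'put' or 'get'")
    # Datatypes are on by default; setting the variable to 0 restores packing.
    enabled = env.get(f"COMEX_ENABLE_{name}_DATATYPE", "1") != "0"
    # Documented default threshold is 8192.
    threshold = int(env.get(f"COMEX_{name}_DATATYPE_THRESHOLD", "8192"))
    # Assumption: messages at or above the threshold use MPI data types,
    # smaller ones keep the original (faster for small messages) packing path.
    if enabled and nbytes >= threshold:
        return "datatype"
    return "packed"
```

Under these assumptions, a small strided put stays on the packed path with default settings, and setting COMEX_ENABLE_PUT_DATATYPE=0 keeps every put on the packed path regardless of size.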
Fixed
- Message sizes exceeding 2GB now work correctly
- Mirrored arrays now distribute data across SMP nodes for
  ComEx-based runtimes
- Matrix multiply works for non-standard data layouts (may not be
  performant)
Closed Issues
- [#48] Message sizes exceeding 2GB may not work correctly

30 Mar 17:12

jeffdaily

v5.6.5

0f3fe27

v5.6.5

Known Bugs
- [#48] Message sizes exceeding 2GB may not work correctly
Added
- Environment variables to control internal ComEx MPI-PR settings
  - COMEX_MAX_NB_OUTSTANDING. Default 8.
    The maximum number of concurrent non-blocking operations.
  - COMEX_STATIC_BUFFER_SIZE. Default 2097152 bytes.
    Some ComEx operations require a temporary buffer. Any message larger than this size will dynamically allocate and free a new buffer to hold the larger message.
  - COMEX_EAGER_THRESHOLD. Default -1.
    Small messages can be sent as part of other internal ComEx operations. Recommended to set this to less than or equal to the corresponding MPI eager/rendezvous threshold cutoff.
  - COMEX_ENABLE_PUT_SELF. Default 1 (on). Contiguous put will use memcpy when target is same as originator.
  - COMEX_ENABLE_GET_SELF. Default 1 (on). Contiguous get will use memcpy when target is same as originator.
  - COMEX_ENABLE_ACC_SELF. Default 1 (on). Contiguous acc will use memcpy when target is same as originator.
  - COMEX_ENABLE_PUT_SMP. Default 1 (on). Contiguous put will use memcpy when target is on the same host via shared memory.
  - COMEX_ENABLE_GET_SMP. Default 1 (on). Contiguous get will use memcpy when target is on the same host via shared memory.
  - COMEX_ENABLE_ACC_SMP. Default 1 (on). Contiguous acc will use memcpy when target is on the same host via shared memory.
  - COMEX_ENABLE_PUT_PACKED. Default 1 (on). Strided put will pack the data into a contiguous buffer.
  - COMEX_ENABLE_GET_PACKED. Default 1 (on). Strided get will pack the data into a contiguous buffer.
  - COMEX_ENABLE_ACC_PACKED. Default 1 (on). Strided acc will pack the data into a contiguous buffer.
  - COMEX_ENABLE_PUT_IOV. Default 1 (on). Vector put will pack the data into a contiguous buffer.
  - COMEX_ENABLE_GET_IOV. Default 1 (on). Vector get will pack the data into a contiguous buffer.
  - COMEX_ENABLE_ACC_IOV. Default 1 (on). Vector acc will pack the data into a contiguous buffer.
  - COMEX_MAX_MESSAGE_SIZE. Default INT_MAX. All use of MPI will keep buffers less than this size. Sometimes useful in conjunction with eager thresholds to force all use of MPI below the eager threshold.
- armci-config and comex-config added
  - --blas_size
  - --use_blas
  - --network_ldflags
  - --network_libs
- ga-config added
  - --blas_size
  - --scalapack_size
  - --use_blas
  - --use_lapack
  - --use_scalapack
  - --use_peigs
  - --use_elpa
  - --use_elpa_2015
  - --use_elpa_2016
  - --network_ldflags
  - --network_libs
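
The ComEx MPI-PR environment variables listed above are read at runtime, so one way to experiment with them is to build a modified environment before launching a job. The Python sketch below collects the documented defaults and rejects misspelled names. The names `COMEX_DEFAULTS` and `comex_env` are hypothetical helpers, not part of GA or ComEx.

```python
import os

# Defaults as documented in the release notes above. COMEX_MAX_MESSAGE_SIZE
# defaults to INT_MAX, which is platform-dependent, so it has no entry here.
COMEX_DEFAULTS = {
    "COMEX_MAX_NB_OUTSTANDING": 8,
    "COMEX_STATIC_BUFFER_SIZE": 2097152,
    "COMEX_EAGER_THRESHOLD": -1,
}
for _op in ("PUT", "GET", "ACC"):
    for _kind in ("SELF", "SMP", "PACKED", "IOV"):
        COMEX_DEFAULTS[f"COMEX_ENABLE_{_op}_{_kind}"] = 1  # 1 means on


def comex_env(overrides, base_env=None):
    """Return an environment mapping with the given ComEx overrides applied.

    overrides: mapping of variable name to integer value.
    base_env: mapping to start from; defaults to the current environment.
    Raises KeyError for names that are not listed in these release notes.
    """
    known = set(COMEX_DEFAULTS) | {"COMEX_MAX_MESSAGE_SIZE"}
    unknown = set(overrides) - known
    if unknown:
        raise KeyError(f"not documented in these notes: {sorted(unknown)}")
    env = dict(os.environ if base_env is None else base_env)
    env.update({name: str(int(value)) for name, value in overrides.items()})
    return env
```

For example, `comex_env({"COMEX_MAX_NB_OUTSTANDING": 16})` returns a copy of the current environment with that one setting changed, which can then be passed to `subprocess.run(cmd, env=...)` with the site's own launch command.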
Changed
- Removed case statement from install-autotools.sh
Fixed
- install-autotools.sh works on FreeBSD
- patch locally built m4 for OSX High Sierra
Closed Issues
- Scalapack with 8-byte integers? [#93]
- Please clarify what is "peigs" library [#96]
- additional arguments for bin/ga-config describing the presence of Peigs and/or Scalapack interfaces [#99]
- additional arguments for bin/ga-config describing the integer size of the Blas library used [#100]

21 Mar 17:49

jeffdaily

v5.6.4

a018f7e

v5.6.4

Known Bugs
- [#48] Message sizes exceeding 2GB may not work correctly
Added
- armci-config and comex-config scripts to install.
Changed
- install-autotools.sh installs all autotools regardless of existing versions
- configure tests needing mixed C/Fortran code now use C linker
Fixed
- Test suite was broken when GA was cross-compiled
- eliop FreeBSD patch from Debichem
- Locally installed automake is patched to work with newer perl versions
- MPI-PR increased limit on number of possible comex_malloc invocations
Closed Pull Requests
- [#92] eliop FreeBSD patch from Debian maintainers of the NWChem Package
Closed Issues
- [#82] Fortran failure on theta
- [#88] Automake regex expression broken for Perl versions >=5.26.0
- [#89] autogen fails on Mac 10.12
- [#90] configure script fails when using clang-4/5 + gfortran 6.3 compilers on Linux
- [#95] comex/src-mpi-pr/comex.c:996: _generate_shm_name: Assertion 'snprintf_retval < (int)31' failed

09 Dec 01:04

jeffdaily

v5.6.3

224c371

v5.6.3

Known Bugs
- [#48] Message sizes exceeding 2GB may not work correctly
Fixed
- Critical bug: incorrect use of MPI_Comm_split() might prevent startup
  in the following ComEx ports:
  - MPI-PR
  - MPI-PT
  - MPI-MT

29 Sep 22:08

jeffdaily

v5.6.2

a73c92d

v5.6.2

Known Bugs
- [#48] Message sizes exceeding 2GB may not work correctly
Fixed
- Bug in MPI-PT comex_malloc().
- Revert ARMCI contiguous check due to regression.
- ELPA updates.
- ScaLAPACK updates, including case for large matrices.
- ComEx OFI updates from Intel.
- Improved configure tests for LAPACK.
- Improved travis tests.
