Arm sve ops #14

Open
wants to merge 13 commits into
base: main
from
30 changes: 30 additions & 0 deletions ARM_SVE_README
@@ -0,0 +1,30 @@
For configuration and installation details, see the script arm_install.sh.

Run the test (the test program takes 4 arguments):
arg1: number of elements in the operation
arg2: element type, one of: i (integer), f (float), d (double)
arg3: type size in bits; only applies when arg2 is i, e.g. "i 8" selects int8 and "i 16" selects int16
arg4: operation type, one of: max, min, sum, mul, band, bor, bxor
To use the SVE module for MPI ops, pass these MCA parameters: -mca op sve -mca op_sve_hardware_available 1
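
The four arguments listed above could be validated along these lines. This is only a sketch: the names test_args and parse_test_args are hypothetical and do not come from Reduce_local_float.c, and accepting 32 and 64 as integer widths is an assumption, since the README only shows 8 and 16.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the four positional arguments. */
typedef struct {
    long count;      /* arg1: number of elements */
    char type;       /* arg2: 'i', 'f' or 'd' */
    int bits;        /* arg3: integer width in bits, checked only for 'i' */
    const char *op;  /* arg4: max, min, sum, mul, band, bor, bxor */
} test_args;

/* Returns 0 on success, -1 if an argument is not one the README lists. */
static int parse_test_args(int argc, char **argv, test_args *out)
{
    static const char *ops[] = { "max", "min", "sum", "mul", "band", "bor", "bxor" };
    char *end;
    size_t i;

    assert(out != NULL);
    if (argc != 5) return -1;

    out->count = strtol(argv[1], &end, 10);
    if (end == argv[1] || *end != '\0' || out->count <= 0) return -1;

    if (strlen(argv[2]) != 1 || strchr("ifd", argv[2][0]) == NULL) return -1;
    out->type = argv[2][0];

    out->bits = (int) strtol(argv[3], &end, 10);
    if (end == argv[3] || *end != '\0') return -1;
    /* Assumption: 32 and 64 are accepted in addition to 8 and 16. */
    if (out->type == 'i' && out->bits != 8 && out->bits != 16 &&
        out->bits != 32 && out->bits != 64) return -1;

    for (i = 0; i < sizeof(ops) / sizeof(ops[0]); i++) {
        if (strcmp(argv[4], ops[i]) == 0) {
            out->op = ops[i];
            return 0;
        }
    }
    return -1;
}
```

With the argument list "33 i 8 min" this yields count 33, type 'i', bits 8 and op "min"; an unknown type letter or operation name is rejected.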
Example for test:
=======
$PATH_To_BIN/mpirun -mca op sve -mca op_sve_hardware_available 1 -mca pml ob1 -np 1 armie -msve-vector-bits=256 --iclient libinscount_emulated.so --unsafe-ldstex -- /ccsopen/home/dzhong/Downloads/github/intel_to_arm/ompi/test/datatype/Reduce_local_float 33 i 8 min

If you don't need ArmIE, drop the armie part of the command line:
$PATH_To_BIN/mpirun -mca op sve -mca op_sve_hardware_available 1 -mca pml ob1 -np 1 /ompi/test/datatype/Reduce_local_float 33 i 8 min

How do we evaluate performance?
======
Logic:

Start_time;
MPI_Reduce_local(...);
End_time;

Reduce_time = End_time - Start_time;
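
A minimal, MPI-free sketch of this timing logic follows. The function reduce_local_sum is a hypothetical scalar stand-in for MPI_Reduce_local with an integer sum (it is not the SVE code path), and now_seconds and time_reduce are illustrative names; POSIX clock_gettime is assumed to be available.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for MPI_Reduce_local(in, inout, count, MPI_INT, MPI_SUM):
   computes inout[i] = in[i] + inout[i]. */
static void reduce_local_sum(const int *in, int *inout, size_t count)
{
    assert(in != NULL && inout != NULL);
    for (size_t i = 0; i < count; i++) {
        inout[i] += in[i];
    }
}

/* Monotonic wall-clock time in seconds. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double) ts.tv_sec + (double) ts.tv_nsec * 1e-9;
}

/* Times one reduction. The elapsed time is end minus start. */
static double time_reduce(const int *in, int *inout, size_t count)
{
    double start_time = now_seconds();
    reduce_local_sum(in, inout, count);
    double end_time = now_seconds();
    return end_time - start_time;
}
```

In an MPI program the same shape applies with MPI_Wtime() taken before and after the MPI_Reduce_local call. A single call on 33 elements is very short, so averaging over many repetitions is likely to give a steadier number.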

Possible issues (observed on the thunder2 machine):
======
Reason for "-mca pml ob1": on the Arm machine, the default pml module fails under armie with an "instruction not supported" error (cause unknown), but the ob1 pml works.


11 changes: 11 additions & 0 deletions arm_install.sh
@@ -0,0 +1,11 @@
mkdir build

./autogen.pl >/dev/null

./configure --prefix=$PWD/build --enable-mpirun-prefix-by-default --enable-debug CC=armclang CFLAGS="-march=armv8-a+sve" CXX=armclang++ FC=armflang >/dev/null

./config.status >/dev/null
make -j 128 install >/dev/null

## Compile the test code; the source is at ompi/test/datatype/Reduce_local_float.c
./build/bin/mpicc -g -O3 -march=armv8-a+sve -o ./test/datatype/Reduce_local_float ./test/datatype/Reduce_local_float.c
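
The -march=armv8-a+sve flag used above is what enables SVE code generation. Compilers that follow the Arm C Language Extensions define the macro __ARM_FEATURE_SVE in that case, so a build can check for it. The helper name built_with_sve is hypothetical and is not part of this PR.

```c
#include <assert.h>
#include <stdbool.h>

/* Reports whether this translation unit was compiled with SVE enabled.
   __ARM_FEATURE_SVE is the ACLE feature macro; it says nothing about
   whether the CPU the program later runs on actually supports SVE. */
static bool built_with_sve(void)
{
#if defined(__ARM_FEATURE_SVE)
    return true;
#else
    return false;
#endif
}
```

This is a compile-time property only, which is presumably why the component also exposes the op_sve_hardware_available parameter at run time.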
4 changes: 3 additions & 1 deletion ompi/mca/op/Makefile.am
@@ -2,7 +2,7 @@
# Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
# University Research and Technology
# Corporation. All rights reserved.
-# Copyright (c) 2004-2005 The University of Tennessee and The University
+# Copyright (c) 2004-2020 The University of Tennessee and The University
# of Tennessee Research Foundation. All rights
# reserved.
# Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
@@ -17,6 +17,8 @@
# $HEADER$
#

AM_CPPFLAGS = $(LTDLINCL)

# main library setup
noinst_LTLIBRARIES = libmca_op.la
libmca_op_la_SOURCES =
70 changes: 70 additions & 0 deletions ompi/mca/op/sve/Makefile.am
@@ -0,0 +1,70 @@
#
# Copyright (c) 2019 The University of Tennessee and The University
# of Tennessee Research Foundation. All rights
# reserved.
# $COPYRIGHT$
#
# Additional copyrights may follow
#
# $HEADER$
#

# This is the sve op component. This Makefile.am is a typical
# example of how to integrate into Open MPI's Automake-based build
# system.
#
# See https://github.com/open-mpi/ompi/wiki/devel-CreateComponent
# for more details on how to make Open MPI components.

# First, list all .h and .c sources. It is necessary to list all .h
# files so that they will be picked up in the distribution tarball.

sources = \
op_sve.h \
op_sve_component.c \
op_sve_functions.h \
op_sve_functions.c

# Open MPI components can be compiled two ways:
#
# 1. As a standalone dynamic shared object (DSO), sometimes called a
# dynamically loadable library (DLL).
#
# 2. As a static library that is slurped up into the upper-level
# libmpi library (regardless of whether libmpi is a static or dynamic
# library). This is called a "Libtool convenience library".
#
# The component needs to create an output library in this top-level
# component directory, and named either mca_<type>_<name>.la (for DSO
# builds) or libmca_<type>_<name>.la (for static builds). The OMPI
# build system will have set the
# MCA_BUILD_ompi_<framework>_<component>_DSO AM_CONDITIONAL to indicate
# which way this component should be built.

if MCA_BUILD_ompi_op_sve_DSO
component_noinst =
component_install = mca_op_sve.la
else
component_install =
component_noinst = libmca_op_sve.la
endif

# Specific information for DSO builds.
#
# The DSO should install itself in $(ompilibdir) (by default,
# $prefix/lib/openmpi).

mcacomponentdir = $(ompilibdir)
mcacomponent_LTLIBRARIES = $(component_install)
mca_op_sve_la_SOURCES = $(sources)
mca_op_sve_la_LDFLAGS = -module -avoid-version
mca_op_sve_la_LIBADD = $(top_builddir)/ompi/lib@OMPI_LIBMPI_NAME@.la

# Specific information for static builds.
#
# Note that we *must* "noinst"; the upper-layer Makefile.am's will
# slurp in the resulting .la library into libmpi.

noinst_LTLIBRARIES = $(component_noinst)
libmca_op_sve_la_SOURCES = $(sources)
libmca_op_sve_la_LDFLAGS = -module -avoid-version
21 changes: 21 additions & 0 deletions ompi/mca/op/sve/configure.m4
@@ -0,0 +1,21 @@
# -*- shell-script -*-
#
# Copyright (c) 2019-2020 The University of Tennessee and The University
# of Tennessee Research Foundation. All rights
# reserved.
#
# $COPYRIGHT$
#
# Additional copyrights may follow
#
# $HEADER$
#

# MCA_ompi_op_sve_CONFIG([action-if-can-compile],
# [action-if-cant-compile])
# ------------------------------------------------
# We can always build, unless we were explicitly disabled.
AC_DEFUN([MCA_ompi_op_sve_CONFIG],[
AC_CONFIG_FILES([ompi/mca/op/sve/Makefile])
[$1]
])dnl
64 changes: 64 additions & 0 deletions ompi/mca/op/sve/op_sve.h
@@ -0,0 +1,64 @@
/*
* Copyright (c) 2019 The University of Tennessee and The University
* of Tennessee Research Foundation. All rights
* reserved.
*
* Copyright (c) 2019 Arm Ltd. All rights reserved.
*
* $COPYRIGHT$
*
* Additional copyrights may follow
*
* $HEADER$
*/

#ifndef MCA_OP_SVE_EXPORT_H
#define MCA_OP_SVE_EXPORT_H

#include "ompi_config.h"

#include "ompi/mca/mca.h"
#include "opal/class/opal_object.h"

#include "ompi/mca/op/op.h"

BEGIN_C_DECLS

/**
* Derive a struct from the base op component struct, allowing us to
* cache some component-specific information on our well-known
* component struct.
*/
typedef struct {
/** The base op component struct */
ompi_op_base_component_1_0_0_t super;

/* What follows is sve-component-specific cached information. We
tend to use this scheme (caching information on the sve
component itself) instead of lots of individual global
variables for the component. The following data fields are
examples; replace them with whatever is relevant for your
component. */

/** A simple boolean indicating that the hardware is available. */
bool hardware_available;

/** A simple boolean indicating whether double precision is
supported. */
bool double_supported;
} ompi_op_sve_component_t;

/**
* Globally exported variable. Note that it is an *sve* component
* (defined above), which has the ompi_op_base_component_t as its
* first member. Hence, the MCA/op framework will find the data that
* it expects in the first memory locations, but the component
* itself can cache additional information after that, which can be
* used by both the component and modules.
*/
OMPI_DECLSPEC extern ompi_op_sve_component_t
mca_op_sve_component;

END_C_DECLS

#endif /* MCA_OP_SVE_EXPORT_H */
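
The "derive a struct from the base component" pattern used in op_sve.h can be illustrated without any Open MPI headers. In this sketch base_component_t is a hypothetical stand-in for ompi_op_base_component_1_0_0_t, and derived_component_t, example_component, framework_get_name and component_from_base are illustrative names only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the framework's base component struct. */
typedef struct {
    const char *name;
} base_component_t;

/* Derived component: the base struct must be the first member. */
typedef struct {
    base_component_t super;
    bool hardware_available;
    bool double_supported;
} derived_component_t;

static derived_component_t example_component = {
    .super = { .name = "sve" },
    .hardware_available = true,
    .double_supported = true,
};

/* Framework side: only the base type is known here. */
static const char *framework_get_name(const base_component_t *base)
{
    return base->name;
}

/* Component side: recover the derived type from a base pointer.
   This is valid because a pointer to a struct, suitably converted,
   points to its first member and vice versa. */
static derived_component_t *component_from_base(base_component_t *base)
{
    return (derived_component_t *) base;
}
```

Because super sits at offset 0, the framework can treat &example_component.super as a plain base component while the component itself casts back to reach hardware_available and double_supported.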