SOSP 2019 Experiments

This document describes how to run the main experiments in the SOSP 2019 paper. The goal of this document is to satisfy the ACM "Artifact Functional" requirements.

Most of the setup required in this document is already done in the AWS AMI described below. In particular, the steps that are pre-installed on this AMI are marked as "(preinstalled)."

System Requirements

We have only tested the experiments on an m4.10xlarge Amazon EC2 instance. We have made a public AMI that can be used to run everything, with all necessary packages etc. pre-installed.

Field	Value
Cloud Provider	AWS
Region	us-west-2
AMI ID	ami-0312f629d1551e6b2
AMI Name	split-annotations-public-sosp19
Instance Type	m4.10xlarge

See this link for how to find and launch a public AMI (this assumes you have a valid, billable AWS account set up).

For anyone running outside of this environment, the assumed system requirements are:

At least 150GB of RAM
At least 200GB of disk space
At least 16 cores that can compile an optimized version of Intel MKL
Running Ubuntu 16.04 with a recent Linux kernel (we've only tested it on 4.4.0)
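
If you are running outside the AMI, a quick preflight check along the following lines may save time. This is a minimal sketch, not part of the repository: the function names (`total_ram_gb`, `free_disk_gb`, `check_min`) are hypothetical, it assumes a Linux system with `/proc/meminfo` and `nproc`, and it only checks the RAM, disk, and core counts listed above (it does not verify that MKL will compile).

```shell
# Hypothetical preflight check; the thresholds come from the list above.

# Total RAM in GiB, read from /proc/meminfo (Linux only).
total_ram_gb() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 }' /proc/meminfo
}

# Free disk space in GiB on the filesystem that holds the given path.
free_disk_gb() {
    df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 / 1024 / 1024 }'
}

# Print OK or WARN depending on whether actual >= required.
check_min() {
    local name="$1" actual="$2" required="$3"
    if [ "$actual" -ge "$required" ]; then
        echo "OK: $name = $actual (need >= $required)"
    else
        echo "WARN: $name = $actual (need >= $required)"
    fi
}

check_min "RAM_GB" "$(total_ram_gb)" 150
check_min "DISK_GB" "$(free_disk_gb "$HOME")" 200
check_min "CORES" "$(nproc)" 16
```

The reported RAM is slightly below the nominal instance size because the kernel reserves some memory, so treat a near miss as a prompt to look closer rather than a hard failure.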

C Experiments

(preinstalled) Follow the build instructions in the README in this directory. Add the following to your rc file:

export WELD_HOME=$HOME/weld/ # We will install Weld here for the Weld baselines
export SA_HOME=<path-to-this-repo> # this directory should live in $HOME.
export PATH=$SA_HOME/c/target/release:$PATH
export LD_LIBRARY_PATH=$SA_HOME/c/target/release:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$SA_HOME/c/lib/composer_mkl:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$SA_HOME/c/lib/ImageMagick:$LD_LIBRARY_PATH

And then:

# Or whatever your rc file is
source ~/.bashrc
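
After sourcing the rc file, it can help to confirm that the variables took effect. The helper below is a hypothetical sketch (`check_dir_var` is not part of this repository); it only reports whether a variable is set and points at an existing directory. `WELD_HOME` is expected to be missing until Weld is cloned in the Weld Baselines section.

```shell
# Report whether the named variable is set and points at an existing directory.
check_dir_var() {
    local name="$1" value
    # Indirect lookup of the variable whose name is in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
        echo "$name is not set"
        return 1
    elif [ ! -d "$value" ]; then
        echo "$name=$value (directory not found)"
        return 1
    fi
    echo "$name=$value"
}

check_dir_var SA_HOME || true
check_dir_var WELD_HOME || true
```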

(preinstalled) Install Intel MKL and ImageMagick, as described below. If these are preinstalled, skip ahead to step 3 (building the annotated libraries).

Installing MKL (skip to Step 3 if preinstalled)

We tested our code with MKL 2018 (Update 2). To install, try the following:

wget http://registrationcenter-download.intel.com/akdlm/irc_nas/tec/12725/l_mkl_2018.2.199.tgz
tar zxvf l_mkl_2018.2.199.tgz
cd l_mkl_2018.2.199
./install.sh

and follow the on-screen instructions. If using EC2, we suggest using the second installation option, "Install using sudo privileges."

If the wget doesn't work, visit this link and follow the instructions below.

Fill out the information requested in the form and click "Submit"
In the dropdown menu stating "Please select a Product" choose "Intel Math Kernel Library for Linux"
Under "Choose a Version" choose "Intel MKL 2018 (Update 2)"
Right-click "Full Package" and copy the link. Run wget on that link as above, and then continue below.

Once MKL is set up, make sure that the $MKLROOT environment variable is set to the correct value. On our system, it is set to the following:

/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl

We suggest adding it to your rc file:

export MKLROOT=/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl
source /opt/intel/bin/compilervars.sh intel64

And then source ~/.bashrc (or whatever your rc file is for your shell).
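
A simple way to sanity-check the value is to look for the main MKL header under it. The sketch below is illustrative only: `looks_like_mklroot` is a hypothetical helper, and it assumes the standard MKL layout in which `mkl.h` lives in `$MKLROOT/include`.

```shell
# Succeeds if the given directory looks like an MKL root,
# i.e. it is non-empty and contains include/mkl.h.
looks_like_mklroot() {
    [ -n "$1" ] && [ -f "$1/include/mkl.h" ]
}

if looks_like_mklroot "${MKLROOT:-}"; then
    echo "MKLROOT looks valid: $MKLROOT"
else
    echo "MKLROOT is unset or does not contain include/mkl.h"
fi
```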

Installing ImageMagick (skip to Step 3 if preinstalled)

We use ImageMagick-7 in our benchmarks. To install:

Make sure build tools are available and up to date, and install the libtiff5 development package:

sudo apt-get update
sudo apt-get install build-essential libtiff5-dev
sudo ldconfig

Install ImageMagick from source:

cd $HOME
wget https://www.imagemagick.org/download/ImageMagick.tar.gz
tar xvzf ImageMagick.tar.gz
# Your minor version may be different, but the major version should be 7
cd ImageMagick-7.0.8-59

Configure, build and install:

./configure --with-tiff=yes
# Set to number of cores on your machine
make -j 40
sudo make install
sudo ldconfig

Make sure everything worked:

magick -version | head -1

You should see ImageMagick 7.xxxx.
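
To make the version check scriptable, the major version can be extracted from the first line of the output. This is a sketch under the assumption that the line has the usual form `Version: ImageMagick 7.0.8-59 Q16 ...`; `magick_major` is a hypothetical helper name.

```shell
# Read `magick -version` output on stdin and print the major version number.
magick_major() {
    sed -n 's/^Version: ImageMagick \([0-9][0-9]*\)\..*/\1/p' | head -1
}

if command -v magick >/dev/null 2>&1; then
    major=$(magick -version | magick_major)
    if [ "$major" = "7" ]; then
        echo "ImageMagick 7 found"
    else
        echo "Unexpected ImageMagick major version: ${major:-unknown}"
    fi
else
    echo "magick is not on PATH"
fi
```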

Build the annotated libraries. Assuming $SA_HOME is the root directory:

cd $SA_HOME/c/lib/composer_mkl
make
cd $SA_HOME/c/lib/ImageMagick
make

Run the benchmarks using the provided script. We suggest doing this in a tmux session, since it will take some time to complete. The script also downloads all the data needed to run the benchmarks. Make sure you change to the correct directory first, because the script relies on relative paths:

cd $SA_HOME/c/benchmarks/
./run-all.sh

The results will be in the $SA_HOME/c/benchmarks/results directory.
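
Because the run is long, it may be useful to keep a copy of the console output alongside the results. The wrapper below is a hypothetical sketch (`run_logged` and the log directory are not part of this repository); it runs any command and copies its combined output to a timestamped log file.

```shell
# Run a command, copying its stdout and stderr to a timestamped log file.
# Note: the exit status is that of tee unless `set -o pipefail` is in effect.
run_logged() {
    local log_dir="$1"
    shift
    mkdir -p "$log_dir"
    local log_file
    log_file="$log_dir/run-$(date +%Y%m%d-%H%M%S).log"
    "$@" 2>&1 | tee "$log_file"
}

# Example usage inside a tmux session (start one with: tmux new -s sosp19):
#   cd "$SA_HOME/c/benchmarks"
#   run_logged "$HOME/benchmark-logs" ./run-all.sh
```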

Python Experiments (Mozart and Numba, Native Library, and Bohrium baselines)

(preinstalled) Install the necessary packages:

sudo apt-get install python2.7-dev python3.5-dev unzip virtualenv
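
To confirm the packages are in place before a long run, one option is to check that the corresponding executables are on the PATH. This is a minimal sketch: `missing_tools` is a hypothetical helper, and it checks only the executables, not the development headers that the `-dev` packages provide.

```shell
# Print the name of each given tool that is not found on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing=$(missing_tools python2.7 python3.5 unzip virtualenv)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
else
    echo "All required tools found"
fi
```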

To run the Python experiments, go to the benchmark directory and run the provided run-all.sh script. The script sets up an environment, downloads the necessary data, and runs every benchmark:

cd $SA_HOME/python/benchmarks
./run-all.sh

The results will be in the $SA_HOME/python/benchmarks/results directory.

Weld Baselines

Since Weld has slightly different Python distribution requirements and other dependencies, we run the Weld baselines in a separate virtual environment. Make sure everything is run from the appropriate directory (e.g., $HOME if cd $HOME is specified):

(preinstalled) Clone the Weld repo. Make sure you are on the v0.2.0 branch, which supports multi-threading.

cd $HOME
git clone -b v0.2.0 https://github.com/weld-project/weld.git

(preinstalled) Make sure LLVM is installed and that everything is configured properly. In particular, you should be able to run llvm-config --version and see 6.x.x. If you don't have LLVM, run the following, which downloads all the Weld requirements:

wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | sudo apt-key add -
sudo apt-add-repository "deb http://apt.llvm.org/xenial/ llvm-toolchain-xenial-6.0 main"
sudo apt-get update

Then install:

sudo apt-get install llvm-6.0-dev clang-6.0 zlib1g-dev

And link:

sudo ln -s /usr/bin/llvm-config-6.0 /usr/local/bin/llvm-config
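
After linking, the version requirement stated above (6.x.x) can be checked with a small script. This is a sketch only; `is_llvm6` is a hypothetical helper that matches any version string beginning with `6.`.

```shell
# Succeeds if the given version string starts with "6.".
is_llvm6() {
    case "$1" in
        6.*) return 0 ;;
        *)   return 1 ;;
    esac
}

if command -v llvm-config >/dev/null 2>&1; then
    version=$(llvm-config --version)
    if is_llvm6 "$version"; then
        echo "LLVM $version found"
    else
        echo "Expected LLVM 6.x.x but found $version"
    fi
else
    echo "llvm-config is not on PATH"
fi
```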

(preinstalled) Build Weld. You should already have Rust installed for Mozart:

cd $HOME/weld
cargo build --release

(preinstalled) Clone the Weld experiments.

cd $HOME
git clone https://github.com/sppalkia/weld-experiments-mozart.git

Run the experiments. This will build the Weld versions of Pandas and NumPy, set up an environment, install the requirements, and run each experiment. NOTE: these should be run after running the experiments in the main repository, because the run-all.sh script there generates data that this script accesses.

cd weld-experiments-mozart
# Run all the benchmarks
./run-all.sh
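
Since the Weld script depends on data produced by the main repository's experiments, a guard like the following can be run first. This is a rough sketch under a stated assumption: it treats non-empty results directories in the main repository as a proxy for "the main experiments have run". The exact files the Weld script reads are not listed here, so this check may pass even if some data is missing. `dir_nonempty` is a hypothetical helper.

```shell
# Succeeds if the path is a directory containing at least one entry.
dir_nonempty() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Assumption: results directories named in the sections above act as a proxy.
for d in "${SA_HOME:-}/c/benchmarks/results" "${SA_HOME:-}/python/benchmarks/results"; do
    if dir_nonempty "$d"; then
        echo "Found results in $d"
    else
        echo "No results in $d; run the main experiments first"
    fi
done
```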

This concludes the steps needed to reproduce the main results of the paper. Please email shoumik@cs.stanford.edu with any questions.
