To work with the required numeric bindings library, download the following version (other versions may not be compatible):
wget https://mathema.tician.de/dl/software/boost-numeric-bindings/boost-numeric-bindings-20081116.tar.gz
tar -xvf boost-numeric-bindings-20081116.tar.gz
Retrieve the Chaos graph processing system source code:
git clone https://github.com/bindscha/chaos.git
It helps to own the /usr/local/ directory, because Boost is installed there:
sudo chown -R ubuntu:ubuntu /usr/local/
Download, build, and install Boost 1.54.0:
wget https://sourceforge.net/projects/boost/files/boost/1.54.0/boost_1_54_0.tar.gz
tar -xvf boost_1_54_0.tar.gz
cd boost_1_54_0/
./bootstrap.sh
./b2 install
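To confirm which Boost version ended up under /usr/local/, one option is to read the version string from the installed header. This is a minimal sketch: boost_lib_version is a hypothetical helper name, and the path in the usage comment assumes the default install prefix.

```shell
# Hypothetical helper: print the BOOST_LIB_VERSION string from a version.hpp file.
boost_lib_version() {
    sed -n 's/^#define BOOST_LIB_VERSION "\([^"]*\)".*/\1/p' "$1"
}

# Usage after ./b2 install (assumes the default /usr/local prefix):
# boost_lib_version /usr/local/include/boost/version.hpp
```

If the printed version is not the 1.54 release, an older system-wide Boost may be shadowing the new install.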
Install ZeroMQ, which Chaos requires for distributed communication:
sudo apt-get install libzmq3-dev
Try compiling a test program to confirm that Boost is installed correctly:
vi test.cpp
g++ test.cpp -o test -L/usr/local/lib/ -lboost_thread
rm test.cpp
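The contents of test.cpp are not shown above, so the following is only a sketch of what such a test could look like. The program itself is hypothetical. The extra -lboost_system and -pthread flags, and the LD_LIBRARY_PATH setting, are assumptions that Boost.Thread builds of this era commonly need; drop them if the original command already links for you.

```shell
# Write a hypothetical minimal Boost.Thread program (not the author's test.cpp).
cat > test.cpp <<'EOF'
#include <boost/thread.hpp>
#include <iostream>

void hello() { std::cout << "hello from a Boost thread" << std::endl; }

int main() {
    boost::thread t(hello);  // start a thread running hello()
    t.join();                // wait for it to finish
    return 0;
}
EOF

# Compile and run only when g++ and the Boost headers are actually present.
if command -v g++ >/dev/null 2>&1 && [ -f /usr/local/include/boost/thread.hpp ]; then
    g++ test.cpp -o test -L/usr/local/lib/ -lboost_thread -lboost_system -pthread \
        && LD_LIBRARY_PATH=/usr/local/lib ./test
else
    echo "g++ or Boost headers not found; skipping the compile step"
fi
```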
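Create the output and object-file directories inside the Chaos checkout (start from the directory where you cloned it):
cd chaos/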
mkdir outputs
mkdir -p object_files
chmod +w object_files/
Build Chaos (skip the cd if you are already inside the checkout):
cd chaos
make clean
make
Copy the numeric-bindings into a location accessible by Chaos:
cd boost-numeric-bindings/
sudo cp -r boost/ /usr/include/
Expect errors during compilation. When they occur, inspect the source files named in the error messages, fix the issues, and recompile. Several files may need editing; a common example is:
vi benchmarks/../algorithms/hyper-anf/hyper-anf.hpp
make clean
make
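Since several files may need editing, it can help to collect the list of failing files from the compiler output in one pass. This is a rough sketch, not part of Chaos: files_with_errors and build.log are hypothetical names, and the pattern assumes gcc-style "file:line[:col]: error:" messages.

```shell
# Hypothetical helper: list the unique source files that gcc reported errors in.
# Reads compiler output on stdin and matches "file:line[:col]: error:" lines.
files_with_errors() {
    grep -E '^[^:]+:[0-9]+(:[0-9]+)?: (fatal )?error:' | cut -d: -f1 | sort -u
}

# Usage: capture the build output, then list the files to inspect.
# make 2>&1 | tee build.log
# files_with_errors < build.log
```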
Install LAPACK, which is needed for linear algebra computations:
sudo apt-get install liblapack-dev
Edit slipstore.ini on each machine and make sure the node setting identifies that machine (for me, master is 0 and workers are 1 and 2):
vi slipstore.ini
The network interface in slipstore.ini must also be set correctly. For me it is ens5; you can list the available interfaces with:
ip link show
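If several interfaces are listed, the one carrying the default route is often the right choice. The sketch below extracts that name from the output of ip route. It assumes cluster traffic uses the default-route interface, which may not hold on multi-homed machines; default_iface is a hypothetical helper name.

```shell
# Hypothetical helper: read "ip route" output on stdin and print the
# interface name that follows "dev" on the first default route.
default_iface() {
    awk '$1 == "default" { for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Usage:
# ip route show default | default_iface
```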
Generate graph partitions with RMAT. Make sure --xscale_node matches this machine's node number (for me, master is 0 and workers are 1 and 2):
To create graph partitions for a specific node (e.g., master node 0):
rmat --name test --scale 20 --edges 16777216 --xscale_interval 3 --xscale_node 0
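As a sketch of how the command above varies across a three-machine setup, the snippet below prints one rmat line per node. It assumes that only --xscale_node changes between machines and that nodes are numbered 0 to 2, as in the setup described here. rmat_cmd is a hypothetical helper. Note that 16777216 is 2^24, which is 16 edges per vertex if --scale 20 means 2^20 vertices (an assumption based on the usual RMAT convention).

```shell
# Hypothetical helper: print the rmat command for a given node number.
# All flags and values are copied from the command above; only the node varies.
rmat_cmd() {
    echo "rmat --name test --scale 20 --edges 16777216 --xscale_interval 3 --xscale_node $1"
}

# Print the command for each node; run the matching line on that machine.
for node in 0 1 2; do
    rmat_cmd "$node"
done
```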
Run PageRank with 10 iterations and 16 processing threads, adjusting the memory allocation as needed:
./bin/benchmark_driver -g test -b pagerank --pagerank::niters 10 -a -p 16 --physical_memory 268435456
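The value 268435456 equals 256 MiB, which suggests --physical_memory is given in bytes (an assumption; check the Chaos options if in doubt). A small helper like the hypothetical mib_to_bytes below makes it easier to try other sizes.

```shell
# Hypothetical helper: convert a size in MiB to bytes.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# 256 MiB is the size used in the command above.
mib_to_bytes 256

# Usage with a larger budget, e.g. 1024 MiB:
# ./bin/benchmark_driver -g test -b pagerank --pagerank::niters 10 -a -p 16 --physical_memory "$(mib_to_bytes 1024)"
```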
Chaos uses random ports in the range 5000-5024 (as per my observation), so all of these have to be open. One way is to disable the firewall:
sudo ufw disable
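A narrower alternative to disabling the firewall is to allow only the observed port range. The sketch below just prints the ufw rule so it can be reviewed before running it. It assumes Chaos traffic is TCP, which these notes do not confirm, and ufw_rule is a hypothetical helper name.

```shell
# Hypothetical helper: build a ufw rule that allows a TCP port range.
# Assumption: Chaos traffic on these ports is TCP.
ufw_rule() {
    echo "ufw allow $1:$2/tcp"
}

# Print the rule for the observed Chaos port range.
ufw_rule 5000 5024

# To apply it on each node:
# sudo $(ufw_rule 5000 5024)
```

On AWS the same range would also have to be allowed by whatever network rules sit in front of the instances.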
- Pinging machines is not a good connectivity check because AWS blocks ping by default. Even if the target is in your VPC and subnet, you still have to enable ICMP before ping will work.
- I tried to run the S3 dataset directly but have not had any luck so far. We might have to download the 30 GB dataset and partition it manually. The command I tried: ./bin/chaos -algo pagerank -input s3://data-graph-benchmarking/com-friendster.ungraph.txt -output /path/to/output -iters 20 -machines 3
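Regarding the connectivity note above, a TCP probe can stand in for ping. The sketch below is an assumption-laden example, not part of Chaos: port_open is a hypothetical helper, MASTER_IP is a placeholder, and it needs bash (for /dev/tcp) plus the coreutils timeout command. A port only shows as reachable while something is listening on it, so run the probe while Chaos is running on the target.

```shell
# Hypothetical helper: succeed if a TCP connection to host $1, port $2 can be opened.
# Requires bash (for /dev/tcp) and coreutils timeout.
port_open() {
    timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Example: from a worker, probe the observed port range on the master.
# MASTER_IP is a placeholder for the master's private address.
# for port in $(seq 5000 5024); do
#     port_open "$MASTER_IP" "$port" && echo "port $port reachable"
# done
```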