C++ framework for implementing shared-memory parallel SGD for Deep Neural Network training
A framework for implementing parallel shared-memory Artificial Neural Network (ANN) training in C++ with SGD, supporting various synchronization mechanisms and degrees of consistency. The code builds upon the MiniDNN implementation and relies on Eigen and OpenMP. In particular, the project includes the implementation of LEASHED-SGD, which guarantees both consistency and lock-freedom.
For technical details of LEASHED-SGD and ASAP.SGD please see the original papers:
Bäckström, K., Walulya, I., Papatriantafilou, M., & Tsigas, P. (2021, February). Consistent Lock-free Parallel Stochastic Gradient Descent for Fast and Stable Convergence. In Proceedings of the 35th IEEE International Parallel & Distributed Processing Symposium. Full version.
Bäckström, K., Papatriantafilou, M., & Tsigas, P. (2022, July). ASAP.SGD: Instance-based Adaptiveness to Staleness in Asynchronous SGD. In Proceedings of the 39th International Conference on Machine Learning (to appear).
The following shared-memory parallel SGD algorithms are implemented:
- Lock-based consistent asynchronous SGD
- LEASHED - Lock-free implementation of consistent asynchronous SGD
- Hogwild! - Lock-free asynchronous SGD without consistency
- Synchronous parallel SGD
The following asynchrony-aware step size options are implemented:
- The TAIL-TAU Staleness-adaptive step size
- The FLeet staleness-adaptive step size [Damaskinos, G, et al. Middleware '20].
- Standard 1/staleness inverse step size scaling/dampening
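To make the inverse option concrete: if $\eta$ denotes the base step size and $\tau$ the staleness of a gradient (the number of updates applied to the shared parameters between the start of the gradient computation and the moment the gradient is applied), the dampened step size takes a form along the lines of

$$\eta_{\mathrm{eff}} = \frac{\eta}{\max(1,\ \tau)}$$

where the $\max$ merely guards against division by zero for gradients with zero staleness; the exact variant implemented in the code may differ.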
To get a local copy up and running, follow these steps.
- Clone the repo
git clone https://github.com/dcs-chalmers/shared-memory-sgd.git
- Build project
bash build.sh
- Compile
bash compile.sh
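Putting the steps together, an end-to-end setup could look like the sketch below (it assumes the compile step places the binary at `./cmake-build-debug/mininn`, the path used by the examples further down):

```sh
# Fetch the sources and run the project's build and compile scripts.
git clone https://github.com/dcs-chalmers/shared-memory-sgd.git
cd shared-memory-sgd
bash build.sh
bash compile.sh

# Sanity check: print the full list of command-line options.
./cmake-build-debug/mininn --help
```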
Arguments and options - reference list:
Flag | Meaning | Values |
---|---|---|
a | Algorithm | ['ASYNC', 'HOG', 'LSH', 'SYNC'] |
n | Number of threads | Integer |
A | Architecture | ['MLP', 'CNN', 'LENET'] |
L | Number of hidden layers | Integer (applies to MLP only) |
U | Number of hidden neurons per layer | Integer (applies to MLP only) |
B | Persistence bound | Integer (applies to LEASHED only) |
e | Number of epochs | Integer |
r | Number of rounds per epoch | Integer |
b | Mini-batch size | Integer |
l | Step size | Float |
D | Dataset | ['MNIST', 'FASHION-MNIST', 'CIFAR10'] |
t | Staleness-adaptive step size strategy | ['NONE', 'INVERSE', 'TAIL', 'FLEET'] |
To see all options:
./cmake-build-debug/mininn --help
Output is a JSON object containing the following data:
Field | Meaning |
---|---|
epoch_loss | List of loss values, one per epoch |
epoch_time | Wall-clock time measured upon completing the corresponding epoch |
staleness_dist | Distribution of staleness values |
numtriesdist | Distribution of the number of CAS attempts (applies to LSH only) |
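Assuming the JSON object is emitted on standard output (an assumption here; adjust to wherever your setup writes it), it can be captured to a file and inspected with a standard tool such as `jq`:

```sh
# Run a short experiment and capture the JSON output (redirection is an assumption).
./cmake-build-debug/mininn -a HOG -n 8 -A MLP -L 3 -U 128 -e 5 -r 469 -b 512 -l 0.005 > result.json

# Inspect individual fields.
jq '.epoch_loss' result.json       # loss after each epoch
jq '.staleness_dist' result.json   # staleness distribution
```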
Multi-layer perceptron (MLP) training for 5 epochs with mini-batch size 512 and step size 0.005, using 8 threads and LEASHED-SGD:
./cmake-build-debug/mininn -a LSH -n 8 -A MLP -L 3 -U 128 -e 5 -r 469 -b 512 -l 0.005
Multi-layer perceptron (MLP) training with 8 threads using Hogwild!:
./cmake-build-debug/mininn -a HOG -n 8 -A MLP -L 3 -U 128 -e 5 -r 469 -b 512 -l 0.005
Convolutional neural network (CNN) training with 8 threads using LEASHED-SGD:
./cmake-build-debug/mininn -a LSH -n 8 -A CNN -e 5 -r 469 -b 512 -l 0.005
Async-SGD LeNet training on CIFAR-10 with 16 threads, with and without TAIL-Tau:
./cmake-build-debug/mininn -a ASYNC -n 16 -A LENET -D 'CIFAR10' -e 100 -b 16 -l 0.005 -t TAIL
./cmake-build-debug/mininn -a ASYNC -n 16 -A LENET -D 'CIFAR10' -e 100 -b 16 -l 0.005 -t NONE
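For scalability experiments, the same invocations can be wrapped in a loop over thread counts. The sketch below is just a convenience built from the flags listed above; the per-run output files are an assumption about how results are collected:

```sh
# Thread-count sweep for LEASHED-SGD on the MLP workload;
# each run's JSON output is stored in its own file.
for n in 1 2 4 8 16; do
  ./cmake-build-debug/mininn -a LSH -n "$n" -A MLP -L 3 -U 128 \
    -e 5 -r 469 -b 512 -l 0.005 > "mlp_lsh_n${n}.json"
done
```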
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
- Fork the Project
- Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
- Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the Branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
If you use this framework in your work, please cite the repository and the LEASHED-SGD paper:
@misc{backstrom2021framework,
author = {Bäckström, Karl},
title = {shared-memory-sgd},
year = {2021},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/dcs-chalmers/shared-memory-sgd}},
commit = {XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX}
}
@inproceedings{backstrom2021consistent,
title={Consistent lock-free parallel stochastic gradient descent for fast and stable convergence},
author={B{\"a}ckstr{\"o}m, Karl and Walulya, Ivan and Papatriantafilou, Marina and Tsigas, Philippas},
booktitle={2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS)},
pages={423--432},
year={2021},
organization={IEEE}
}
Distributed under the AGPL-3.0 License. See LICENSE for more information.
Karl Bäckström - [email protected]
Project Link: https://github.com/dcs-chalmers/shared-memory-sgd
A big thanks to the Wallenberg AI, Autonomous Systems and Software Program (WASP) for funding this work.