By building and running main.cu, or by looking at the saved output (local_output.txt and server_output.txt), you can compare different implementations of matrix multiplication.

"native" - the naive (straightforward) implementation

"modified native" - the naive implementation with a modified traversal order and a small memory-access optimization

"with shared memory" - an implementation that uses CUDA shared memory
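To make the difference between the first and the last variant concrete, here is a minimal sketch of a naive kernel and a tiled shared-memory kernel for square row-major matrices. It is not the code from main.cu or src: the names `matmul_naive`, `matmul_shared`, `max_abs_diff` and the tile width `TILE = 16` are assumptions chosen for illustration, and error checking of the CUDA calls is left out for brevity.

```cuda
#include <cassert>
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

constexpr int TILE = 16;  // hypothetical tile width, not taken from the repository

// Naive variant: one thread per output element, all operands read from global memory.
__global__ void matmul_naive(const float* a, const float* b, float* c, int n) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= n || col >= n) return;
    float acc = 0.0f;
    for (int k = 0; k < n; ++k) acc += a[row * n + k] * b[k * n + col];
    c[row * n + col] = acc;
}

// Shared-memory variant: each block stages TILE x TILE tiles of A and B in
// shared memory, so each global element is loaded once per block, not once per thread.
__global__ void matmul_shared(const float* a, const float* b, float* c, int n) {
    __shared__ float as[TILE][TILE];
    __shared__ float bs[TILE][TILE];
    int row = blockIdx.y * TILE + threadIdx.y;
    int col = blockIdx.x * TILE + threadIdx.x;
    float acc = 0.0f;
    for (int t = 0; t < (n + TILE - 1) / TILE; ++t) {
        int ak = t * TILE + threadIdx.x;  // column of A loaded by this thread
        int bk = t * TILE + threadIdx.y;  // row of B loaded by this thread
        // Out-of-range elements are padded with zeros so n need not be a multiple of TILE.
        as[threadIdx.y][threadIdx.x] = (row < n && ak < n) ? a[row * n + ak] : 0.0f;
        bs[threadIdx.y][threadIdx.x] = (bk < n && col < n) ? b[bk * n + col] : 0.0f;
        __syncthreads();  // tiles fully loaded before anyone reads them
        for (int k = 0; k < TILE; ++k) acc += as[threadIdx.y][k] * bs[k][threadIdx.x];
        __syncthreads();  // everyone done reading before the next load overwrites
    }
    if (row < n && col < n) c[row * n + col] = acc;
}

// Runs both kernels on the same n x n inputs and returns the largest
// absolute difference between their results.
float max_abs_diff(int n) {
    assert(n > 0);
    size_t bytes = sizeof(float) * n * n;
    std::vector<float> ha(n * n), hb(n * n), h1(n * n), h2(n * n);
    for (int i = 0; i < n * n; ++i) { ha[i] = (i % 7) * 0.5f; hb[i] = (i % 5) - 2.0f; }
    float *da = nullptr, *db = nullptr, *dc = nullptr;
    cudaMalloc(&da, bytes); cudaMalloc(&db, bytes); cudaMalloc(&dc, bytes);
    cudaMemcpy(da, ha.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb.data(), bytes, cudaMemcpyHostToDevice);
    dim3 block(TILE, TILE), grid((n + TILE - 1) / TILE, (n + TILE - 1) / TILE);
    matmul_naive<<<grid, block>>>(da, db, dc, n);
    cudaMemcpy(h1.data(), dc, bytes, cudaMemcpyDeviceToHost);  // also waits for the kernel
    matmul_shared<<<grid, block>>>(da, db, dc, n);
    cudaMemcpy(h2.data(), dc, bytes, cudaMemcpyDeviceToHost);
    cudaFree(da); cudaFree(db); cudaFree(dc);
    float diff = 0.0f;
    for (int i = 0; i < n * n; ++i) diff = std::fmax(diff, std::fabs(h1[i] - h2[i]));
    return diff;
}
```

Calling `max_abs_diff(100)` from a `main` function is one way to check that the two kernels agree before timing them; a result near zero suggests they compute the same product. The size 100 is deliberately not a multiple of `TILE`, so the zero-padding path is exercised as well.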
