Commit
Showing 8 changed files with 285 additions and 1 deletion.
@@ -0,0 +1,47 @@
### Goto-if-block performance comparison, test 1

This test compares the (computed) `goto` with the `if` and `block` branching-flow constructs. The selector for the branching jump is computed pseudo-randomly, and the *work* done inside the *workers* called by each branch is not uniform.

This is a modification of the [goto-if elseif-select case](https://github.com/szaghi/DEFY/tree/master/src/goto_is_fastest/goto_if_select_comparison_1) test proposed by Ron Shepard (`select case` is not considered in this test; the `block` construct is used instead). Essentially, the branching flow is now *flushed*: the selector chooses *from which keyword* to start calling the workers, so that not only the worker corresponding to that keyword is called but also all subsequent workers, e.g.

```fortran
goto (1, 2, 3), keyword
1 call worker1(keyword)
2 call worker2(keyword)
3 call worker3(keyword)
```
If `keyword==1` all workers are called, if `keyword==2` only workers 2 and 3 are called, and if `keyword==3` only worker 3 is called. This is compared with

```fortran
! if-based selector flow
if (keyword<2) call worker1(keyword)
if (keyword<3) call worker2(keyword)
if (keyword<4) call worker3(keyword)
! block-based selector flow (implies that the order of execution does not matter)
selector: block
  call worker3(keyword) ; if ((keyword==3)) exit selector
  call worker2(keyword) ; if ((keyword>=2)) exit selector
  call worker1(keyword) ; exit selector
end block selector
```

In this case `goto` should actually have an advantage, yet the tests performed confirm again that the performance is almost identical.

### Run test

Four bash scripts are provided to run the test:

1. `run_gnu.sh`, runs the test with the GNU gfortran compiler with only debug-level optimizations (`-Og`);
2. `run_gnu_optimized.sh`, runs the test with the GNU gfortran compiler with optimizations (`-O3`);
3. `run_intel.sh`, runs the test with the Intel Fortran Compiler without optimizations (`-O0`);
4. `run_intel_optimized.sh`, runs the test with the Intel Fortran Compiler with optimizations (`-O3`).

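The four scripts can also be driven in one pass. The following is only a minimal sketch, assuming it is run from the test directory and that `gfortran` and `ifort` are the compiler commands (as in the scripts of this commit); `compiler_for` and `run_all` are hypothetical helper names, not part of the repository.

```shell
#!/bin/bash
# Sketch: run every DEFY script whose compiler is available.
# compiler_for and run_all are hypothetical helpers, not part of the repository.

# Map a script name to the compiler command it invokes.
compiler_for() {
  case "$1" in
    run_gnu*.sh)   echo gfortran ;;
    run_intel*.sh) echo ifort ;;
    *)             return 1 ;;
  esac
}

# Run each script if it exists and its compiler is on PATH; otherwise skip it.
run_all() {
  local script compiler
  for script in run_gnu.sh run_gnu_optimized.sh run_intel.sh run_intel_optimized.sh; do
    compiler=$(compiler_for "$script")
    if [ ! -f "$script" ]; then
      echo "skip $script: file not found"
    elif ! command -v "$compiler" >/dev/null 2>&1; then
      echo "skip $script: $compiler not on PATH"
    else
      bash "$script"
    fi
  done
}

run_all
```

Skipping instead of failing keeps the sketch usable on machines where only one of the two compilers is installed.
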
### Results obtained

Average time per test, in seconds:

|Compiler|Optimizations|Architecture                                         | goto (s)  | if (s)    | block (s) |
|--------|-------------|-----------------------------------------------------|-----------|-----------|-----------|
| GNU    | yes         |Intel Xeon [email protected], 24GB RAM, x86_64 Arch Linux|0.5480e-4  |0.5480e-4  |0.5480e-4  |
| GNU    | no          |Intel Xeon [email protected], 24GB RAM, x86_64 Arch Linux|0.7578e-3  |0.7578e-3  |0.7578e-3  |
| Intel  | yes         |Intel Xeon [email protected], 24GB RAM, x86_64 Arch Linux|0.5228e-4  |0.5237e-4  |0.5237e-4  |
| Intel  | no          |Intel Xeon [email protected], 24GB RAM, x86_64 Arch Linux|0.9449e-3  |0.9550e-3  |0.9550e-3  |
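
To put a number on how close the timings are, the relative gap between `goto` and `if` can be computed from the Intel rows of the table (the GNU rows are identical to four digits). This is only a small sketch using `awk`; `rel_diff` is a hypothetical helper name and the inputs are simply the table values.

```shell
#!/bin/bash
# rel_diff BASE OTHER: percentage by which OTHER exceeds BASE, two decimals.
rel_diff() {
  awk -v base="$1" -v other="$2" 'BEGIN { printf "%.2f\n", 100 * (other - base) / base }'
}

# goto is the baseline, if is the value compared against it.
echo "Intel, optimized:   $(rel_diff 0.5228e-4 0.5237e-4) %"
echo "Intel, unoptimized: $(rel_diff 0.9449e-3 0.9550e-3) %"
```

With the values above this gives 0.17 % for the optimized build and 1.07 % for the unoptimized one.
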
191 changes: 191 additions & 0 deletions
src/goto_is_fastest/goto_if_block_comparison_1/defy.f90
@@ -0,0 +1,191 @@
! A DEFY (DEmystyfy Fortran mYths) test.
! Author: Stefano Zaghi & Ron Shepard & FortranFan
! Date: 2016-10-19
!
! License: this file is licensed under the Creative Commons Attribution 4.0 license,
! see http://creativecommons.org/licenses/by/4.0/ .

program defy
  use iso_fortran_env
  implicit none
  integer(int32), parameter :: tests_number = 4000
  integer(int32) :: keyword
  real(real64), allocatable :: key_work(:)
  real(real64) :: random
  integer(int64) :: profiling(1:2)
  integer(int64) :: count_rate
  real(real64) :: system_clocks(1:3)
  integer(int32) :: key_registers(1:9)
  integer(int32) :: i

  key_registers = 0
  system_clocks = 0._real64
  do i=1, tests_number
    call random_number(random)
    ! nint(random*9) yields 0..9: keyword 0 is not counted in key_registers and
    ! makes every construct below call all nine workers (with zero-sized arrays).
    keyword = nint(random*9, int32)
    if (keyword==1) key_registers(1) = key_registers(1) + 1
    if (keyword==2) key_registers(2) = key_registers(2) + 1
    if (keyword==3) key_registers(3) = key_registers(3) + 1
    if (keyword==4) key_registers(4) = key_registers(4) + 1
    if (keyword==5) key_registers(5) = key_registers(5) + 1
    if (keyword==6) key_registers(6) = key_registers(6) + 1
    if (keyword==7) key_registers(7) = key_registers(7) + 1
    if (keyword==8) key_registers(8) = key_registers(8) + 1
    if (keyword==9) key_registers(9) = key_registers(9) + 1

    call system_clock(profiling(1), count_rate)
    selector: block
      call worker9(key=keyword, array=key_work) ; if ((keyword==9)) exit selector
      call worker8(key=keyword, array=key_work) ; if ((keyword>=8)) exit selector
      call worker7(key=keyword, array=key_work) ; if ((keyword>=7)) exit selector
      call worker6(key=keyword, array=key_work) ; if ((keyword>=6)) exit selector
      call worker5(key=keyword, array=key_work) ; if ((keyword>=5)) exit selector
      call worker4(key=keyword, array=key_work) ; if ((keyword>=4)) exit selector
      call worker3(key=keyword, array=key_work) ; if ((keyword>=3)) exit selector
      call worker2(key=keyword, array=key_work) ; if ((keyword>=2)) exit selector
      call worker1(key=keyword, array=key_work) ; exit selector
    end block selector
    call system_clock(profiling(2), count_rate)
    system_clocks(1) = system_clocks(1) + real(profiling(2) - profiling(1), kind=real64)/count_rate

    call system_clock(profiling(1), count_rate)
    if (keyword<2) call worker1(key=keyword, array=key_work)
    if (keyword<3) call worker2(key=keyword, array=key_work)
    if (keyword<4) call worker3(key=keyword, array=key_work)
    if (keyword<5) call worker4(key=keyword, array=key_work)
    if (keyword<6) call worker5(key=keyword, array=key_work)
    if (keyword<7) call worker6(key=keyword, array=key_work)
    if (keyword<8) call worker7(key=keyword, array=key_work)
    if (keyword<9) call worker8(key=keyword, array=key_work)
    if (keyword<10) call worker9(key=keyword, array=key_work)
    call system_clock(profiling(2), count_rate)
    system_clocks(2) = system_clocks(2) + real(profiling(2) - profiling(1), kind=real64)/count_rate

    call system_clock(profiling(1), count_rate)
    goto (10, 20, 30, 40, 50, 60, 70, 80, 90), keyword
    10 call worker1(key=keyword, array=key_work)
    20 call worker2(key=keyword, array=key_work)
    30 call worker3(key=keyword, array=key_work)
    40 call worker4(key=keyword, array=key_work)
    50 call worker5(key=keyword, array=key_work)
    60 call worker6(key=keyword, array=key_work)
    70 call worker7(key=keyword, array=key_work)
    80 call worker8(key=keyword, array=key_work)
    90 call worker9(key=keyword, array=key_work)
    call system_clock(profiling(2), count_rate)
    system_clocks(3) = system_clocks(3) + real(profiling(2) - profiling(1), kind=real64)/count_rate
  enddo
  print '(A,9F12.5)', ' keywords distribution (1..9): ', key_registers*1._real32/tests_number
  ! system_clocks(1) holds the block timing, (2) the if timing, (3) the goto timing
  print '(A,E23.15)', ' block average performance: ', system_clocks(1)/tests_number
  print '(A,E23.15)', ' if average performance: ', system_clocks(2)/tests_number
  print '(A,E23.15)', ' goto average performance: ', system_clocks(3)/tests_number

contains
  pure subroutine worker1(key, array)
    integer(int32), intent(in) :: key
    real(real64), allocatable, intent(out) :: array(:)
    integer(int32) :: j

    allocate(array(1:key*tests_number))
    array = 0._real64
    do j=1, key*tests_number
      array(j) = key**2._real64 * tests_number * j
    enddo
  endsubroutine worker1

  pure subroutine worker2(key, array)
    integer(int32), intent(in) :: key
    real(real64), allocatable, intent(out) :: array(:)
    integer(int32) :: j

    allocate(array(1:key*tests_number))
    array = 0._real64
    do j=1, key*tests_number
      array(j) = key**2._real64 * tests_number * j
    enddo
  endsubroutine worker2

  pure subroutine worker3(key, array)
    integer(int32), intent(in) :: key
    real(real64), allocatable, intent(out) :: array(:)
    integer(int32) :: j

    allocate(array(1:key*tests_number))
    array = 0._real64
    do j=1, key*tests_number
      array(j) = key**2._real64 * tests_number * j
    enddo
  endsubroutine worker3

  pure subroutine worker4(key, array)
    integer(int32), intent(in) :: key
    real(real64), allocatable, intent(out) :: array(:)
    integer(int32) :: j

    allocate(array(1:key*tests_number))
    array = 0._real64
    do j=1, key*tests_number
      array(j) = key**2._real64 * tests_number * j
    enddo
  endsubroutine worker4

  pure subroutine worker5(key, array)
    integer(int32), intent(in) :: key
    real(real64), allocatable, intent(out) :: array(:)
    integer(int32) :: j

    allocate(array(1:key*tests_number))
    array = 0._real64
    do j=1, key*tests_number
      array(j) = key**2._real64 * tests_number * j
    enddo
  endsubroutine worker5

  pure subroutine worker6(key, array)
    integer(int32), intent(in) :: key
    real(real64), allocatable, intent(out) :: array(:)
    integer(int32) :: j

    allocate(array(1:key*tests_number))
    array = 0._real64
    do j=1, key*tests_number
      array(j) = key**2._real64 * tests_number * j
    enddo
  endsubroutine worker6

  pure subroutine worker7(key, array)
    integer(int32), intent(in) :: key
    real(real64), allocatable, intent(out) :: array(:)
    integer(int32) :: j

    allocate(array(1:key*tests_number))
    array = 0._real64
    do j=1, key*tests_number
      array(j) = key**2._real64 * tests_number * j
    enddo
  endsubroutine worker7

  pure subroutine worker8(key, array)
    integer(int32), intent(in) :: key
    real(real64), allocatable, intent(out) :: array(:)
    integer(int32) :: j

    allocate(array(1:key*tests_number))
    array = 0._real64
    do j=1, key*tests_number
      array(j) = key**2._real64 * tests_number * j
    enddo
  endsubroutine worker8

  pure subroutine worker9(key, array)
    integer(int32), intent(in) :: key
    real(real64), allocatable, intent(out) :: array(:)
    integer(int32) :: j

    allocate(array(1:key*tests_number))
    array = 0._real64
    do j=1, key*tests_number
      array(j) = key**2._real64 * tests_number * j
    enddo
  endsubroutine worker9
endprogram defy
@@ -0,0 +1,11 @@
#!/bin/bash
# script to build and run DEFY tests.
#
# License: this file is licensed under the Creative Commons Attribution 4.0 license,
# see http://creativecommons.org/licenses/by/4.0/ .

test="$(basename "$(pwd)")/defy.f90"
echo "Build and run $test by means of 'gfortran -Og'"
gfortran -Og defy.f90 -o defy
./defy
rm -f defy
11 changes: 11 additions & 0 deletions
src/goto_is_fastest/goto_if_block_comparison_1/run_gnu_optimized.sh
@@ -0,0 +1,11 @@
#!/bin/bash
# script to build and run DEFY tests.
#
# License: this file is licensed under the Creative Commons Attribution 4.0 license,
# see http://creativecommons.org/licenses/by/4.0/ .

test="$(basename "$(pwd)")/defy.f90"
echo "Build and run $test by means of 'gfortran -O3'"
gfortran -O3 defy.f90 -o defy
./defy
rm -f defy
11 changes: 11 additions & 0 deletions
src/goto_is_fastest/goto_if_block_comparison_1/run_intel.sh
@@ -0,0 +1,11 @@
#!/bin/bash
# script to build and run DEFY tests.
#
# License: this file is licensed under the Creative Commons Attribution 4.0 license,
# see http://creativecommons.org/licenses/by/4.0/ .

test="$(basename "$(pwd)")/defy.f90"
echo "Build and run $test by means of 'ifort -O0'"
ifort -O0 defy.f90 -o defy
./defy
rm -f defy
11 changes: 11 additions & 0 deletions
src/goto_is_fastest/goto_if_block_comparison_1/run_intel_optimized.sh
@@ -0,0 +1,11 @@
#!/bin/bash
# script to build and run DEFY tests.
#
# License: this file is licensed under the Creative Commons Attribution 4.0 license,
# see http://creativecommons.org/licenses/by/4.0/ .

test="$(basename "$(pwd)")/defy.f90"
echo "Build and run $test by means of 'ifort -O3'"
ifort -O3 defy.f90 -o defy
./defy
rm -f defy