fig_2_14 parallel_scan example code fails for non_zero v[0] #9

Open
jnorwood opened this issue Sep 7, 2020 · 0 comments

Using the book's example code, I changed the initial values of the input vector v from i to i+2. The serial sum then returns 44 as the last value, while the parallel_scan sum returns 42 (the difference, 2, is exactly v[0]).

The fix for the fig_2_14 function, which contains the parallel_scan call, is to remove the rsum[0] = v[0] assignment and change the blocked_range to (0, N).

With those changes I get matching results for the serial_rsum and parallel_rsum vectors.
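To make the off-by-v[0] behaviour concrete, here is a minimal serial sketch of the two variants. It does not use TBB and is not the book's code: `scan_chunk`, `scan_from_one`, `scan_from_zero`, and `make_input` are hypothetical names, and the "published" shape (seed rsum[0], then scan from index 1 with identity 0) is an assumption inferred from the fix described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Serial stand-in for a scan body (hypothetical helper): accumulates
// v[begin..end) on top of `sum` and stores each running total in rsum.
int scan_chunk(const std::vector<int>& v, std::vector<int>& rsum,
               std::size_t begin, std::size_t end, int sum) {
  assert(rsum.size() >= v.size());
  for (std::size_t i = begin; i < end; ++i) {
    sum += v[i];
    rsum[i] = sum;
  }
  return sum;
}

// Assumed shape of the failing version: rsum[0] is seeded by hand and the
// scan covers [1, N) starting from identity 0, so v[0] never enters the sum.
int scan_from_one(const std::vector<int>& v, std::vector<int>& rsum) {
  if (v.empty()) return 0;
  rsum[0] = v[0];
  return scan_chunk(v, rsum, 1, v.size(), 0);
}

// Proposed fix: no manual seeding, scan the full range [0, N).
int scan_from_zero(const std::vector<int>& v, std::vector<int>& rsum) {
  return scan_chunk(v, rsum, 0, v.size(), 0);
}

// Builds the input used in this issue: v[i] = i + 2.
std::vector<int> make_input(std::size_t n) {
  std::vector<int> v(n);
  for (std::size_t i = 0; i < n; ++i) v[i] = static_cast<int>(i) + 2;
  return v;
}
```

For `make_input(8)`, `scan_from_one` returns 42 and `scan_from_zero` returns 44, with the running sums 2,5,9,14,20,27,35,44. With the original input v[i] = i, v[0] is 0, so both variants agree and the problem stays hidden.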

`
For N=8
input 2,3,4,5,6,7,8,9

combine (x=0,y=0)
sum[0..1), fs=True, 2
sum[2..3), fs=False, 4
sum[3..4), fs=False, 5
sum[6..7), fs=False, 8
combine(x=4,y=5)
sum[5..6), fs=False, 7
sum[1..2), fs=False, 3
sum[4..5), fs=False, 6
combine(x=2,y=3)
combine(x=6,y=7)
combine(x=5,y=9)
combine(x=14,y=13)
sum[1..2), fs=True, 5
combine(x=5,y=4)
sum[3..4), fs=True, 14
combine(x=14,y=6)
sum[2..3), fs=True, 9
combine(x=27,y=8)
sum[4..5), fs=True, 20
sum[7..8), fs=True, 44
sum[5..6), fs=True, 27
sum[6..7), fs=True, 35

parallel_sum = 2,5,9,14,20,27,35,44
`
