CG

Mihail Popov and pablooliveira

Initial Release

Nov 13, 2014

9681631 · Nov 13, 2014

This branch is 1 commit ahead of benchmark-subsetting/NPB3.0-omp-C:master.

Name	Last commit message	Last commit date
Makefile	Initial Release	Nov 13, 2014
README.carefully	Initial Release	Nov 13, 2014
cg.c	Initial Release	Nov 13, 2014

README.carefully

Note: the routine conj_grad supplies three implementations
of the sparse matrix-vector multiply.  The default
matrix-vector multiply is not loop unrolled.  The alternate
implementations are unrolled to a depth of 2 and to a depth
of 8.  Please experiment with these to find the fastest for
your particular architecture.  When reporting timing results,
any of the three may be used without penalty.
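
For illustration only, the difference between the default and the
unrolled-by-2 multiply can be sketched as follows.  This is not the
code from cg.c: the function names spmv_plain and spmv_unroll2, the
0-based indexing, and the argument layout (a compressed-row matrix
with row pointers rowstr, column indices colidx, and values a) are
assumptions made for this sketch.

```c
/* Illustrative sketch, not the cg.c source.
 * Computes w = A * p for an n-row matrix stored in compressed-row form:
 * row j occupies a[rowstr[j]] .. a[rowstr[j + 1] - 1], and colidx[k]
 * is the column of a[k].  rowstr has n + 1 entries. */
void spmv_plain(int n, const int *rowstr, const int *colidx,
                const double *a, const double *p, double *w)
{
    for (int j = 0; j < n; j++) {
        double sum = 0.0;
        for (int k = rowstr[j]; k < rowstr[j + 1]; k++) {
            sum += a[k] * p[colidx[k]];
        }
        w[j] = sum;
    }
}

/* Same product with the inner loop unrolled to a depth of 2. */
void spmv_unroll2(int n, const int *rowstr, const int *colidx,
                  const double *a, const double *p, double *w)
{
    for (int j = 0; j < n; j++) {
        int start = rowstr[j];
        int end = rowstr[j + 1];
        double sum1 = 0.0;
        double sum2 = 0.0;

        /* Peel one element when the row has an odd number of nonzeros,
         * so the unrolled loop always sees an even count. */
        if ((end - start) % 2 == 1) {
            sum1 = a[start] * p[colidx[start]];
            start++;
        }
        /* Two independent partial sums per iteration. */
        for (int k = start; k < end; k += 2) {
            sum1 += a[k] * p[colidx[k]];
            sum2 += a[k + 1] * p[colidx[k + 1]];
        }
        w[j] = sum1 + sum2;
    }
}
```

Both functions compute the same product, but the unrolled one adds
the terms in a different order, so results may differ in the last
bits of floating-point precision.  An unrolled-by-8 variant would
follow the same pattern with a remainder of up to 7 elements.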

Performance examples:
The non-unrolled version of the multiply is actually slightly
(perhaps 5%) faster than the unrolled-by-2 version on the
sp2-66MHz-WN on 16 nodes.  On the Cray T3D the reverse is true,
i.e., the unrolled-by-2 version is some 10% faster.
The unrolled-by-8 version is significantly faster on the
Cray T3D: the code as a whole runs 1.5 times faster.
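
Since the fastest variant depends on the machine, a small timing
helper can make the comparison concrete.  The sketch below is an
assumption for illustration, not part of the benchmark: time_spmv,
spmv_fn, and spmv_ref are hypothetical names, and the benchmark has
its own timing routines.

```c
#include <time.h>

/* Hypothetical signature shared by the multiply variants to compare. */
typedef void (*spmv_fn)(int n, const int *rowstr, const int *colidx,
                        const double *a, const double *p, double *w);

/* Reference kernel: plain compressed-row product w = A * p. */
void spmv_ref(int n, const int *rowstr, const int *colidx,
              const double *a, const double *p, double *w)
{
    for (int j = 0; j < n; j++) {
        double sum = 0.0;
        for (int k = rowstr[j]; k < rowstr[j + 1]; k++) {
            sum += a[k] * p[colidx[k]];
        }
        w[j] = sum;
    }
}

/* Runs the kernel reps times and returns the total processor time
 * in seconds, as reported by clock(). */
double time_spmv(spmv_fn kernel, int reps, int n, const int *rowstr,
                 const int *colidx, const double *a, const double *p,
                 double *w)
{
    clock_t start = clock();
    for (int r = 0; r < reps; r++) {
        kernel(n, rowstr, colidx, a, p, w);
    }
    clock_t stop = clock();
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Calling time_spmv once per variant on the same matrix gives numbers
that can be compared directly.  Note that clock() reports processor
time, which in a multithreaded OpenMP run can exceed elapsed time;
omp_get_wtime() is the usual choice for wall-clock measurements there.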
