CCSD_T2_8 DGEMM w/ CUBLAS #1027

Merged (43 commits) on Oct 16, 2024
Commits
43ebd7f  this works (jeffhammond, Oct 5, 2023)
2fa9876  add DGEMM version too (jeffhammond, Oct 5, 2023)
8062b6d  straight DGEMM works (jeffhammond, Oct 5, 2023)
e80bac5  remove the loops - DGEMM will always be better (jeffhammond, Oct 6, 2023)
fc17d91  removing loops (jeffhammond, Oct 6, 2023)
b3ce4c1  cleanup (jeffhammond, Oct 6, 2023)
567fd44  do the pure DGEMM T2_8 in ICSD/NTS too (jeffhammond, Oct 6, 2023)
0a1b133  move makefile include to the top so we can use its vars (jeffhammond, Oct 6, 2023)
6de9fbe  still debugging (jeffhammond, Oct 10, 2023)
e78fb8e  so far, so good (jeffhammond, Oct 10, 2023)
9f12e90  so far, so good (jeffhammond, Oct 10, 2023)
409ba37  okay, it works correctly now (jeffhammond, Oct 10, 2023)
360e2e7  okay, it works correctly now (jeffhammond, Oct 10, 2023)
355a2c8  now time for double buffering (jeffhammond, Oct 10, 2023)
74ab406  clean up (jeffhammond, Oct 10, 2023)
34242be  arrays are column major. wow. (jeffhammond, Oct 10, 2023)
9335011  n stream version using n=1 (jeffhammond, Oct 10, 2023)
2b18f0c  n stream version using n=1 (jeffhammond, Oct 10, 2023)
12f47c9  2 phase version is correct (jeffhammond, Oct 10, 2023)
7cf47de  comment syntax (jeffhammond, Oct 10, 2023)
aec8780  move T2_7 into separate file (jeffhammond, Oct 10, 2023)
5d3d3da  fix non-F90 case (jeffhammond, Oct 11, 2023)
d18f3c8  move makefile include to the top so we can use its vars (jeffhammond, Oct 6, 2023)
00eed8c  still debugging (jeffhammond, Oct 10, 2023)
3258902  so far, so good (jeffhammond, Oct 10, 2023)
59475ec  so far, so good (jeffhammond, Oct 10, 2023)
cc742c8  okay, it works correctly now (jeffhammond, Oct 10, 2023)
8889f36  okay, it works correctly now (jeffhammond, Oct 10, 2023)
4feff45  now time for double buffering (jeffhammond, Oct 10, 2023)
5805b06  clean up (jeffhammond, Oct 10, 2023)
7113377  arrays are column major. wow. (jeffhammond, Oct 10, 2023)
a357f09  n stream version using n=1 (jeffhammond, Oct 10, 2023)
c3ec460  n stream version using n=1 (jeffhammond, Oct 10, 2023)
4faf45c  2 phase version is correct (jeffhammond, Oct 10, 2023)
e38ca4a  comment syntax (jeffhammond, Oct 10, 2023)
2abce96  move T2_7 into separate file (jeffhammond, Oct 10, 2023)
6affb37  reset generic input file (jeffhammond, Oct 15, 2024)
319b548  allow to pass 64_to_32 CI check (jeffhammond, Oct 15, 2024)
ced49c8  Merge branch 'ccsd_t2_dgemm_cublas' of https://github.com/jeffhammond… (jeffhammond, Oct 15, 2024)
ae928c7  fix 64_to_32 check (jeffhammond, Oct 15, 2024)
07beece  reset (jeffhammond, Oct 15, 2024)
707d974  Merge branch 'ccsd_t2_dgemm_cublas' of https://github.com/jeffhammond… (jeffhammond, Oct 15, 2024)
b1af9f9  fix 64_to_32 check again (jeffhammond, Oct 15, 2024)
22 changes: 15 additions & 7 deletions src/tce/ccsd/GNUmakefile
@@ -1,25 +1,34 @@
 #$Id$

+include ../../config/makefile.h
+
 OBJ_OPTIMIZE = ccsd_e.o ccsd_t1.o ccsd_t2.o cc2_t1.o cc2_t2.o \
 ccsd_1prdm.o ccsd_1prdm_hh.o ccsd_1prdm_hp.o \
 ccsd_1prdm_ph.o ccsd_1prdm_pp.o \
 icsd_t1.o icsd_t2.o \
-ccsd_kernels.o ccsd_t2_8.o tce_1b_dens_print.o
+ccsd_kernels.o ccsd_t2_7.o ccsd_t2_8.o tce_1b_dens_print.o

 LIB_INCLUDES = -I../include

 LIBRARY = libtce.a

 USES_BLAS = ccsd_e.F ccsd_t1.F ccsd_t2.F cc2_t1.F cc2_t2.F \
 ccsd_1prdm_hh.F ccsd_1prdm_hp.F ccsd_1prdm_ph.F \
-ccsd_1prdm_pp.F ccsd_1prdm.F \
-icsd_t1.F icsd_t2.F ccsd_t2_8.F ccsd_kernels.F sd_t2_8_loops.F
-
+ccsd_1prdm_pp.F ccsd_1prdm.F ccsd_kernels.F sd_t2_8_loops.F \
+icsd_t1.F icsd_t2.F ccsd_t2_7.F ccsd_t2_8.F

 LIB_DEFINES = -DDEBUG_PRINT

-# This replaces 3*TCE_SORT4+DGEMM with 6D loops (ccsd_kernels.F).
-#LIB_DEFINES += -DUSE_LOOPS_NOT_DGEMM
+# replace this with something better later
+ifdef USE_OPENACC_TRPDRV
+FOPTIONS += -DUSE_TCE_CUBLAS
+ifeq ($(_FC),pgf90)
+FOPTIONS += -Mextend -acc -cuda -cudalib=cublas
+endif
+ifeq ($(_FC),gfortran)
+FOPTIONS += -ffree-form -fopenacc -lcublas
+endif
+endif

 #
 # Possible #defines
@@ -32,6 +41,5 @@ LIB_DEFINES = -DDEBUG_PRINT

 HEADERS =

-include ../../config/makefile.h
 include ../../config/makelib.h
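
A note on why `include ../../config/makefile.h` moves to the top of the file: GNU make evaluates `ifdef` and `ifeq` while it reads the makefile, so `$(_FC)` must already have a value when the compiler test is parsed. The standalone sketch below mirrors the new conditional to show which flags get appended. It is not part of this PR: `demo.mk` is a hypothetical file name, and `_FC` is set by hand on the assumption that `makefile.h` normally defines it.

```make
# demo.mk: hypothetical standalone file, not part of the NWChem tree.
# Assumption: in the real build, _FC is set by config/makefile.h.
_FC ?= gfortran

# Same conditional as in the GNUmakefile change; it is evaluated at parse time,
# so _FC has to be defined before this point.
ifdef USE_OPENACC_TRPDRV
FOPTIONS += -DUSE_TCE_CUBLAS
ifeq ($(_FC),pgf90)
FOPTIONS += -Mextend -acc -cuda -cudalib=cublas
endif
ifeq ($(_FC),gfortran)
FOPTIONS += -ffree-form -fopenacc -lcublas
endif
endif

# Print the resulting flags while the makefile is being read.
$(info FOPTIONS = $(FOPTIONS))

# No-op default target so that make has a goal.
all: ;
```

Running `make -f demo.mk USE_OPENACC_TRPDRV=1` should report the gfortran branch, adding `_FC=pgf90` should select the other one, and leaving `USE_OPENACC_TRPDRV` unset should leave `FOPTIONS` as it was.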
