Add benchmark work estimate for simple Cg solve #1788

Open
wants to merge 3 commits into
base: benchmark_work_estimate_logger

@upsj (Member) commented Feb 17, 2025

As a starting point and example for adding work estimates to kernels, this PR adds the necessary operations to all non-trivial kernels used in a simple unpreconditioned Cg solve.
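To illustrate what a work estimate for a memory-bound kernel could look like, here is a minimal sketch for a CSR SpMV with a single right-hand side. The type `memory_bound_work_estimate` and both functions are hypothetical names chosen for illustration, not the API added in this PR, and the byte count assumes each array is read or written exactly once.

```cpp
#include <cstddef>

// Hypothetical type for illustration only; not Ginkgo's actual API.
struct memory_bound_work_estimate {
    std::size_t bytes_read;
    std::size_t bytes_written;
};

// Estimated memory traffic of y = A * x for a CSR matrix A and one
// right-hand side, assuming every array is touched exactly once.
template <typename ValueType, typename IndexType>
memory_bound_work_estimate estimate_csr_spmv(std::size_t num_rows,
                                             std::size_t num_cols,
                                             std::size_t num_nonzeros)
{
    // values and column indices: one entry each per stored nonzero
    const auto nonzero_bytes =
        num_nonzeros * (sizeof(ValueType) + sizeof(IndexType));
    // row pointers: num_rows + 1 entries
    const auto row_ptr_bytes = (num_rows + 1) * sizeof(IndexType);
    // input vector x is read, output vector y is written
    const auto x_bytes = num_cols * sizeof(ValueType);
    const auto y_bytes = num_rows * sizeof(ValueType);
    return {nonzero_bytes + row_ptr_bytes + x_bytes, y_bytes};
}

// Bandwidth in MB/s (1 MB = 10^6 bytes) for a runtime given in
// microseconds: bytes per microsecond equals MB per second.
inline double bandwidth_mb_per_s(const memory_bound_work_estimate& work,
                                 double runtime_us)
{
    return static_cast<double>(work.bytes_read + work.bytes_written) /
           runtime_us;
}
```

Dividing such an estimate by the measured kernel runtime would give a figure comparable to the performance column in the output below, under the assumption that the column reports bytes moved per unit time.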

Example output for the simple-solver example in a debug build:

Runtime summary

| name | total | total (self) | count | avg | avg (self) | performance |
|---|---|---|---|---|---|---|
| total | 2.5 ms | 832.3 us | 1 | 2.5 ms | 832.3 us | |
| apply(gko::solver::Cg) | 1.6 ms | 17.9 us | 1 | 1.6 ms | 17.9 us | |
| iteration | 1.5 ms | 508.1 us | 19 | 81.2 us | 26.7 us | |
| check(gko::stop::Combined) | 305.9 us | 91.0 us | 20 | 15.3 us | 4.6 us | |
| apply(gko::matrix::Identity) | 268.0 us | 101.2 us | 20 | 13.4 us | 5.1 us | |
| apply(gko::matrix::Csr<double, int>) | 235.1 us | 122.9 us | 19 | 12.4 us | 6.5 us | |
| check(gko::stop::ResidualNorm) | 193.1 us | 148.3 us | 20 | 9.7 us | 7.4 us | |
| copy(gko::matrix::Dense,gko::matrix::Dense) | 166.8 us | 142.7 us | 20 | 8.3 us | 7.1 us | |
| csr::spmv | 112.2 us | 112.2 us | 19 | 5.9 us | 5.9 us | 363.2 MB/s |
| advanced_apply(gko::matrix::Csr<double, int>) | 67.1 us | 38.8 us | 2 | 33.5 us | 19.4 us | |
| dense::compute_conj_dot_dispatch | 58.1 us | 58.1 us | 39 | 1.5 us | 1.5 us | 204.1 MB/s |
| generate(gko::solver::Cg::Factory) | 50.3 us | 50.3 us | 1 | 50.3 us | 50.3 us | |
| cg::step_2 | 46.8 us | 46.8 us | 19 | 2.5 us | 2.5 us | 370.7 MB/s |
| cg::step_1 | 43.6 us | 43.6 us | 19 | 2.3 us | 2.3 us | 198.6 MB/s |
| dense::compute_norm2_dispatch | 36.7 us | 36.7 us | 22 | 1.7 us | 1.7 us | 91.1 MB/s |
| csr::advanced_spmv | 28.3 us | 28.3 us | 2 | 14.1 us | 14.1 us | 162.3 MB/s |
| dense::copy | 24.1 us | 24.1 us | 20 | 1.2 us | 1.2 us | 252.0 MB/s |
| check(gko::stop::Iteration) | 21.7 us | 21.7 us | 20 | 1.1 us | 1.1 us | |
| allocate | 18.5 us | 18.5 us | 31 | 598.0 ns | 598.0 ns | |
| residual_norm::residual_norm | 17.2 us | 17.2 us | 20 | 858.0 ns | 858.0 ns | |
| components::aos_to_soa | 16.9 us | 16.9 us | 3 | 5.6 us | 5.6 us | |
| cg::initialize | 15.0 us | 15.0 us | 1 | 15.0 us | 15.0 us | 70.7 MB/s |
| components::convert_idxs_to_ptrs | 13.5 us | 13.5 us | 1 | 13.5 us | 13.5 us | |
| free | 13.3 us | 13.3 us | 31 | 430.0 ns | 430.0 ns | |
| dense::fill | 13.2 us | 13.2 us | 4 | 3.3 us | 3.3 us | 24.3 MB/s |
| components::fill_array | 6.3 us | 6.3 us | 1 | 6.3 us | 6.3 us | |
| dense::fill_in_matrix_data | 3.9 us | 3.9 us | 2 | 1.9 us | 1.9 us | |

Overhead estimate 482.5 us

Work estimates available for 14.9 % of runtime
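One plausible reading of the coverage figure, stated here as an assumption rather than the logger's confirmed behaviour, is the share of total runtime spent in kernels that carry a work estimate. Summing the self times of the nine rows with a performance entry gives 378.0 us, roughly 15 % of the rounded 2.5 ms total, which is consistent with the reported 14.9 %. A minimal sketch of that calculation, with the hypothetical names `kernel_record` and `work_estimate_coverage_percent`:

```cpp
#include <vector>

// Hypothetical record for illustration only; not the logger's data layout.
struct kernel_record {
    double self_time_us;     // time spent in the kernel itself
    bool has_work_estimate;  // whether the kernel reported a work estimate
};

// Percentage of the total runtime covered by kernels with a work estimate.
inline double work_estimate_coverage_percent(
    const std::vector<kernel_record>& records, double total_runtime_us)
{
    double covered_us = 0.0;
    for (const auto& record : records) {
        if (record.has_work_estimate) {
            covered_us += record.self_time_us;
        }
    }
    // guard against an empty or zero-length profile
    return total_runtime_us > 0.0 ? 100.0 * covered_us / total_runtime_us
                                  : 0.0;
}
```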

@upsj upsj added the 1:ST:ready-for-review This PR is ready for review label Feb 17, 2025
@upsj upsj requested a review from a team February 17, 2025 17:28
@upsj upsj self-assigned this Feb 17, 2025
@ginkgo-bot ginkgo-bot added mod:core This is related to the core module. type:solver This is related to the solvers type:matrix-format This is related to the Matrix formats labels Feb 17, 2025
@upsj upsj force-pushed the benchmark_work_estimate_logger branch from eca15d0 to 9ac9b8f Compare February 20, 2025 11:34
@upsj upsj force-pushed the benchmark_work_estimate_cg_csr_spmv branch from 48ab867 to c24ccca Compare February 20, 2025 11:35