[BUG] Certain performance counters are not measured and are always zero #1896
Comments
Can you identify what in the benchmark code might be affecting this? We use a library for perf counting, and other than the configuration we pass, it's not obvious to me what could cause such a bug unless it's in the perf counting library itself.
@dmah42 actually I forgot to post an update to the issue, but a few days ago I realized that the issue is not with the specific counter, but rather with the order of arguments I pass in. Like in the issue description I'm passing in
I suspect it may be something on the reporting side rather than the configuration side. I can't see anything obviously wrong in the perf counters. Out of curiosity, if you request JSON output instead, do you get the results you expect?
No, the same issue exists with JSON output.
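Since the zero readings appear to depend on the order in which the counters are passed, one way to narrow this down is to run the benchmark binary once per ordering of the same counter list and compare the reports. The sketch below only builds the command-line flags; `--benchmark_perf_counters` is the flag Google Benchmark documents for its libpfm integration, while the helper name `CounterFlagOrderings` and any counter names fed to it are hypothetical placeholders, not taken from the original report.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns one --benchmark_perf_counters flag for every
// ordering of `counters`, so each ordering can be tried in a separate run.
std::vector<std::string> CounterFlagOrderings(std::vector<std::string> counters) {
    assert(!counters.empty() && "need at least one counter name");
    // std::next_permutation visits every ordering only when it starts from
    // the sorted sequence.
    std::sort(counters.begin(), counters.end());
    std::vector<std::string> flags;
    do {
        std::string flag = "--benchmark_perf_counters=";
        for (std::size_t i = 0; i < counters.size(); ++i) {
            if (i != 0) flag += ',';
            flag += counters[i];
        }
        flags.push_back(flag);
    } while (std::next_permutation(counters.begin(), counters.end()));
    return flags;
}
```

Each returned string can be appended to the `prog2.out` command line in turn. If a counter reads zero only in some positions, that points at how the counter group is set up or read back rather than at the event itself.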
@mtrofin have you seen any issues like this?
No. If you're passing three or fewer counter names, there should be no multiplexing either. Libpfm translates string counter names to their underlying IDs. To try to isolate the problem, could you try fetching the raw IDs from google/benchmark and feed those to
Describe the bug
Certain counters such as `frontend_retired.l2_miss` are always reported as zero using the performance counters integration with libpfm.
To confirm that this is indeed a bug I have made a small example where I flush a function from all caches using the `clflush` instruction and then call it, which should lead to an L2 miss. perf stat and perf record confirm that `frontend_retired.l2_miss` should be non-zero and that the miss is happening within the function; however, the counters reported by google benchmark are always zero. I have also noticed this issue with a few other events such as `frontend_retired.latency_ge_1`.
System
Which OS, compiler, and compiler version are you using:
To reproduce
Make a file `fun.S` with the following contents.
The function `fun` flushes `not_fun` from all the caches and then calls it, which will cause a cache miss as it tries to fetch the first instruction of the function.
Make two more files, `prog1.S` and `prog2.cpp`, with the following contents.
and a Makefile:
First, I run `prog1.out` along with perf to get the expected results without google benchmark's code interfering with everything. The code for `prog1.S` is simply a loop which calls `fun` 1e7 times.
As expected, we see L2 misses. Using perf record/sampling it can be confirmed that all the misses are inside `fun`.
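The flush-then-call pattern described for `fun` and the `prog1.S` loop can be approximated in C++ roughly as follows. This is only a sketch, not the original assembly: the names `NotFunSketch`, `FunSketch` and `RunLoopSketch` are hypothetical, `__attribute__((noinline))` assumes GCC or Clang, and the flush is only compiled in when SSE2 intrinsics are available. An optimizing compiler may still transform the call, which is presumably one reason the original report uses hand-written assembly.

```cpp
#include <cassert>
#include <cstdint>
#if defined(__SSE2__)
#include <emmintrin.h>  // _mm_clflush, _mm_mfence
#endif

// Hypothetical stand-in for not_fun: a small function kept out of line so
// that it has its own address and first cache line.
__attribute__((noinline)) int NotFunSketch(int x) { return x + 1; }

// Hypothetical stand-in for fun: flush the cache line holding the callee's
// first instruction, then call it so the instruction fetch has to miss.
int FunSketch(int x) {
#if defined(__SSE2__)
    _mm_clflush(reinterpret_cast<const void*>(&NotFunSketch));
    _mm_mfence();  // make sure the flush completes before the call
#endif
    return NotFunSketch(x);
}

// Mirrors the loop described for prog1.S: call the flushing function
// `iterations` times and return the accumulated result.
std::uint64_t RunLoopSketch(std::uint64_t iterations) {
    assert(iterations > 0);
    std::uint64_t sum = 0;
    for (std::uint64_t i = 0; i < iterations; ++i) {
        sum += static_cast<std::uint64_t>(FunSketch(1));
    }
    return sum;
}
```

Calling `RunLoopSketch(10000000)` from `main` and running the binary under perf stat with the event of interest would correspond to the `prog1.out` experiment above.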
Now, I run `prog2.out` to see what google benchmark reports the counters as.
and the L2 misses are zero.
For one final confirmation we run google benchmark under perf stat. I'm not sure if both programs trying to use the same counters will mess something up, but in any case google benchmark again reports the counter values as zero whereas perf stat gives non-zero values. It is unlikely that these non-zero values are caused by google benchmark's code instead of the test code, as I have shown above that the test code on its own indeed causes the event, as confirmed using perf stat + record.
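Another way to cross-check a counter without involving either perf stat or the libpfm name translation is to open the event directly with the Linux `perf_event_open` system call and count around the code under test. The sketch below assumes Linux with kernel headers installed. `MakeRawEventAttr` and `CountEvent` are hypothetical helper names, and the raw event code is left as a parameter because no encoding is given in this report. Some events may need additional attribute fields beyond `config`, which this sketch does not set.

```cpp
#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: describes a raw, PMU-specific event for the calling
// thread, counting user-space only. The caller supplies the raw code.
perf_event_attr MakeRawEventAttr(std::uint64_t raw_config) {
    perf_event_attr attr;
    std::memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_RAW;
    attr.size = sizeof(attr);
    attr.config = raw_config;
    attr.disabled = 1;        // start stopped; enabled explicitly below
    attr.exclude_kernel = 1;
    attr.exclude_hv = 1;
    return attr;
}

// Hypothetical helper: counts the event while `work` runs on this thread.
// Returns false (leaving *count_out untouched) if the kernel rejects the
// event or the read fails, e.g. due to permissions or a missing PMU.
template <typename Work>
bool CountEvent(perf_event_attr attr, Work&& work, std::uint64_t* count_out) {
    assert(count_out != nullptr);
    // Arguments: attr, pid = 0 (this thread), cpu = -1 (any), no group, no flags.
    const long fd = syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
    if (fd < 0) return false;
    const int event_fd = static_cast<int>(fd);
    ioctl(event_fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(event_fd, PERF_EVENT_IOC_ENABLE, 0);
    work();
    ioctl(event_fd, PERF_EVENT_IOC_DISABLE, 0);
    std::uint64_t value = 0;
    const bool ok = read(event_fd, &value, sizeof(value)) ==
                    static_cast<ssize_t>(sizeof(value));
    close(event_fd);
    if (ok) *count_out = value;
    return ok;
}
```

If a count obtained this way is non-zero for the same workload while the benchmark report shows zero, the discrepancy is more likely in how the counters are grouped or read than in the event being unavailable.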
Expected behavior
The counter values should be non-zero when it is clear that L2 cache misses are happening (confirmed with perf).