Instrumentation produces false measurements for NodeJS app using NAPI #96

Open
overlookmotel opened this issue Mar 16, 2024 · 4 comments

overlookmotel commented Mar 16, 2024

I'm trying to benchmark a NodeJS app which calls into a NodeJS native module (built from Rust via napi-rs).

CodSpeed's instrumentation appears to slow down the app very considerably, making it hard to accurately measure performance.

To get a fair comparison, I'm running the same benchmark on GitHub Actions with and without CodSpeed.

Without CodSpeed:

Running benchmarks with CodSpeed DISABLED
┌─────────┬────────────────────────────────────────────┬─────────┬────────────────────┬──────────┬─────────┐
│ (index) │ Task Name                                  │ ops/sec │ Average Time (ns)  │ Margin   │ Samples │
├─────────┼────────────────────────────────────────────┼─────────┼────────────────────┼──────────┼─────────┤
│ 0       │ 'parser(napi)[checker.ts]'                 │ '4'     │ 238333877.50000006 │ '±4.73%' │ 10      │
│ 1       │ 'parser(napi)[cal.com.tsx]'                │ '8'     │ 118107410.40000008 │ '±1.21%' │ 10      │
│ 2       │ 'parser(napi)[RadixUIAdoptionSection.jsx]' │ '4,068' │ 245803.32628994508 │ '±0.68%' │ 2035    │
│ 3       │ 'parser(napi)[pdf.mjs]'                    │ '13'    │ 75663396.4999999   │ '±4.10%' │ 10      │
│ 4       │ 'parser(napi)[antd.js]'                    │ '1'     │ 545470710.6000001  │ '±3.60%' │ 10      │
└─────────┴────────────────────────────────────────────┴─────────┴────────────────────┴──────────┴─────────┘

With CodSpeed:

Running benchmarks with CodSpeed ENABLED
┌─────────┬────────────────────────────────────────────┬─────────┬────────────────────┬──────────┬─────────┐
│ (index) │ Task Name                                  │ ops/sec │ Average Time (ns)  │ Margin   │ Samples │
├─────────┼────────────────────────────────────────────┼─────────┼────────────────────┼──────────┼─────────┤
│ 0       │ 'parser(napi)[checker.ts]'                 │ '0'     │ 1676626826.099999  │ '±0.53%' │ 10      │
│ 1       │ 'parser(napi)[cal.com.tsx]'                │ '0'     │ 1028924682.7999946 │ '±0.45%' │ 10      │
│ 2       │ 'parser(napi)[RadixUIAdoptionSection.jsx]' │ '618'   │ 1615957.9129023065 │ '±0.81%' │ 310     │
│ 3       │ 'parser(napi)[pdf.mjs]'                    │ '1'     │ 568098107.800007   │ '±5.44%' │ 10      │
│ 4       │ 'parser(napi)[antd.js]'                    │ '0'     │ 3961387613.2999954 │ '±0.92%' │ 10      │
└─────────┴────────────────────────────────────────────┴─────────┴────────────────────┴──────────┴─────────┘

GitHub Actions results: https://github.com/oxc-project/oxc/actions/runs/8308255059
Benchmark task: .github/workflows/benchmark.yml
Benchmark code: napi/parser/parse.bench.mjs

NB: The 2nd set of results is not from within the withCodspeed()-wrapped Bench object, because CodSpeed's wrapper disables output of the results table. It appears that CodSpeed's instrumentation slows down everything in the process, not just the code which runs within the withCodspeed() wrapper.

These benchmarks use tinybench. I have also tried Vitest and benchmark.js, but they produce similar results.
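
For context, the benchmark wiring is roughly the standard CodSpeed tinybench setup. A minimal sketch, assuming the @codspeed/tinybench-plugin API rather than the exact oxc benchmark file:

import { Bench } from 'tinybench'
import { withCodSpeed } from '@codspeed/tinybench-plugin'

const bench = withCodSpeed(new Bench())
bench.add('parser(napi)[checker.ts]', () => {
  // parse the fixture via the napi binding here
})
await bench.run()
// As noted above, this table output is disabled when running under the wrapper
console.table(bench.table())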

Worse, CodSpeed's instrumentation appears to be preventing accurate assessment of the relative speed-up/slow-down of changes. When run locally (MacBook Pro M1) or in CI without CodSpeed, this benchmark shows "napi raw" to be about 3x faster than "napi". But under CodSpeed, the results show only a marginal improvement, or in some cases show what is in reality a 3x speed-up as a slight slow-down.

I suspect the overhead CodSpeed introduces (and the variance within it) is masking the actual performance gain.

Running benchmarks with CodSpeed DISABLED
┌─────────┬────────────────────────────────────────────────┬─────────┬────────────────────┬──────────┬─────────┐
│ (index) │ Task Name                                      │ ops/sec │ Average Time (ns)  │ Margin   │ Samples │
├─────────┼────────────────────────────────────────────────┼─────────┼────────────────────┼──────────┼─────────┤
│ 0       │ 'parser(napi)[checker.ts]'                     │ '4'     │ 236687380.7000001  │ '±4.51%' │ 10      │
│ 1       │ 'parser(napi raw)[checker.ts]'                 │ '9'     │ 105366553.09999992 │ '±8.60%' │ 10      │
│ 2       │ 'parser(napi)[cal.com.tsx]'                    │ '8'     │ 117207075.80000016 │ '±3.82%' │ 10      │
│ 3       │ 'parser(napi raw)[cal.com.tsx]'                │ '19'    │ 52539795.70000011  │ '±6.14%' │ 10      │
│ 4       │ 'parser(napi)[RadixUIAdoptionSection.jsx]'     │ '4,062' │ 246165.42568895963 │ '±2.25%' │ 2032    │
│ 5       │ 'parser(napi raw)[RadixUIAdoptionSection.jsx]' │ '8,890' │ 112482.59626630507 │ '±7.12%' │ 4446    │
│ 6       │ 'parser(napi)[pdf.mjs]'                        │ '13'    │ 71447427.5999998   │ '±5.49%' │ 10      │
│ 7       │ 'parser(napi raw)[pdf.mjs]'                    │ '30'    │ 32338048.750000097 │ '±5.52%' │ 16      │
│ 8       │ 'parser(napi)[antd.js]'                        │ '1'     │ 522781748.29999983 │ '±2.70%' │ 10      │
│ 9       │ 'parser(napi raw)[antd.js]'                    │ '4'     │ 203665995.90000004 │ '±3.86%' │ 10      │
└─────────┴────────────────────────────────────────────────┴─────────┴────────────────────┴──────────┴─────────┘
Running benchmarks with CodSpeed ENABLED
┌─────────┬────────────────────────────────────────────────┬─────────┬────────────────────┬──────────┬─────────┐
│ (index) │ Task Name                                      │ ops/sec │ Average Time (ns)  │ Margin   │ Samples │
├─────────┼────────────────────────────────────────────────┼─────────┼────────────────────┼──────────┼─────────┤
│ 0       │ 'parser(napi)[checker.ts]'                     │ '0'     │ 1712106154.2999873 │ '±0.56%' │ 10      │
│ 1       │ 'parser(napi raw)[checker.ts]'                 │ '0'     │ 2053894367.2000022 │ '±2.32%' │ 10      │
│ 2       │ 'parser(napi)[cal.com.tsx]'                    │ '0'     │ 1035820193.2000054 │ '±0.40%' │ 10      │
│ 3       │ 'parser(napi raw)[cal.com.tsx]'                │ '1'     │ 980151598.3000107  │ '±3.11%' │ 10      │
│ 4       │ 'parser(napi)[RadixUIAdoptionSection.jsx]'     │ '473'   │ 2112853.6624468486 │ '±9.21%' │ 237     │
│ 5       │ 'parser(napi raw)[RadixUIAdoptionSection.jsx]' │ '583'   │ 1713314.8047957036 │ '±2.26%' │ 292     │
│ 6       │ 'parser(napi)[pdf.mjs]'                        │ '1'     │ 570087846.4000125  │ '±5.78%' │ 10      │
│ 7       │ 'parser(napi raw)[pdf.mjs]'                    │ '1'     │ 709747156.7000029  │ '±3.52%' │ 10      │
│ 8       │ 'parser(napi)[antd.js]'                        │ '0'     │ 3989519725.7999973 │ '±0.86%' │ 10      │
│ 9       │ 'parser(napi raw)[antd.js]'                    │ '0'     │ 4119213271.400001  │ '±0.18%' │ 10      │
└─────────┴────────────────────────────────────────────────┴─────────┴────────────────────┴──────────┴─────────┘

https://github.com/oxc-project/oxc/actions/runs/8308560738

I don't know whether this problem is related to the use of native addons within the NodeJS code under test, or whether it's a more general problem.

I would be more than happy to help in any way I can to diagnose/fix this. CodSpeed is absolutely brilliant for Rust code, and we would very much like to expand our usage to NodeJS code too. Please just let me know what (if anything) I can do.

Boshen pushed a commit to oxc-project/oxc that referenced this issue Mar 16, 2024
#2724 added CodSpeed benchmarks for NodeJS `oxc-parser`.

Unfortunately it turns out CodSpeed's results are wildly inaccurate.
It's unclear why, but I have raised an issue with CodSpeed
(CodSpeedHQ/action#96). In the meantime it seems
best to remove the benchmarks as they're not useful at present.
@adriencaccia
Member

Hey @overlookmotel,

The slowdown of the whole code execution in the action is to be expected since the whole command is run under valgrind.

Regarding the problem of the relative performance gains/regressions, this might be linked with file system accesses and their handling in our measurement. I will take a look at the execution profiles to try and sort that out.

I will keep you posted here.

@overlookmotel
Author

Thanks loads for coming back. I did expect there'd be some overhead, but an 8x slowdown was more than I expected. I guess it doesn't matter much, as long as it affects everything equally, so comparisons can still be made.

The relative performance measure is more problematic.

The benchmark where I saw weird results does not involve any file system access.

It's basically comparing 2 methods of transferring data from Rust to JS.

  • In parseSync ("napi"), Rust outputs a JSON string and JS just calls JSON.parse() on it.
  • In parseSyncRaw ("napi raw"), Rust outputs a buffer and there's a deserializer written in JS which converts it to JS objects.

So the main differences between the two are:

  1. Passing string vs buffer via napi-rs.
  2. Just a single JSON.parse call vs a large amount of JS execution.
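
In code terms, the two paths look roughly like this. A sketch only, assuming placeholder names (nativeParseToJson, nativeParseToBuffer, deserialize) rather than the actual oxc-parser bindings:

// Placeholder declarations for the native bindings and the JS deserializer.
declare function nativeParseToJson(source: string): string
declare function nativeParseToBuffer(source: string): Uint8Array
declare function deserialize(buffer: Uint8Array): unknown

// parseSync ("napi"): one string transfer via napi-rs, then a single JSON.parse call.
function parseViaJson(source: string): unknown {
  return JSON.parse(nativeParseToJson(source))
}

// parseSyncRaw ("napi raw"): one buffer transfer via napi-rs, then a large amount
// of JS execution in the deserializer to build the JS objects.
function parseViaBuffer(source: string): unknown {
  return deserialize(nativeParseToBuffer(source))
}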

overlookmotel added a commit to oxc-project/oxc that referenced this issue Mar 20, 2024
Add NodeJS parser to benchmarks.

Previous attempt #2724 did not work due to CodSpeed producing very
inaccurate results (CodSpeedHQ/action#96).

This version runs the actual benchmarks without CodSpeed's
instrumentation. Then another faux-benchmark runs within CodSpeed's
instrumented action and just performs meaningless calculations in a loop
for as long as is required to take the same amount of time as the original
uninstrumented benchmarks took.

It's unfortunate that we therefore don't get flame graphs on CodSpeed,
but this seems to be the best we can do for now.
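
For reference, a calibrated faux benchmark along these lines could look roughly as follows. A sketch only, assuming tinybench and a made-up timings file name rather than the actual oxc setup:

import { readFileSync } from 'node:fs'
import { Bench } from 'tinybench'

// Mean times (in ms) recorded by the real, uninstrumented benchmark run.
// 'bench-timings.json' is a placeholder name.
const timings: Record<string, number> = JSON.parse(
  readFileSync('bench-timings.json', 'utf8'),
)

const bench = new Bench()
for (const [name, meanMs] of Object.entries(timings)) {
  bench.add(name, () => {
    // Spin on meaningless arithmetic until the real benchmark's mean duration
    // has elapsed, so the instrumented run takes a proportional amount of time.
    const end = performance.now() + meanMs
    let x = 0
    while (performance.now() < end) x += Math.sqrt(x + 1)
  })
}
await bench.run()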

hyf0 commented Mar 31, 2024

I'm not sure how CodSpeed is implemented internally. I'm using setTimeout to fake the real elapsed time measured by the benchmarks when run without the CodSpeed plugin. However, CodSpeed still does not report them correctly.

CodSpeed results: [screenshot]

Real benchmark data (ms): [screenshot]

import nodeFs from 'node:fs'
import nodePath from 'node:path'
import nodeAssert from 'node:assert'
import { bench } from 'vitest' // assuming the Vitest bench runner with the CodSpeed plugin
// PROJECT_ROOT and suitesForCI come from the project's own config (imports omitted here)

async function sleep(ms: number) {
  await new Promise((resolve) => globalThis.setTimeout(resolve, ms))
}

function main() {
  // Timings recorded by the real (uninstrumented) benchmark run
  const realBenchData = JSON.parse(
    nodeFs.readFileSync(
      nodePath.join(PROJECT_ROOT, 'dist/ci-bench-data.json'),
      'utf8',
    ),
  )

  console.log('realBenchData:')
  console.table(realBenchData)

  // Register a fake benchmark per suite that just sleeps for the real mean time
  for (const suite of suitesForCI) {
    const realData = realBenchData[suite.title]
    const realDataSourceMap = realBenchData[`${suite.title}-sourcemap`]
    nodeAssert(realData != null)
    nodeAssert(realDataSourceMap != null)
    bench(suite.title, async () => {
      await sleep(realData.mean)
    })
    bench(`${suite.title}-sourcemap`, async () => {
      await sleep(realDataSourceMap.mean)
    })
  }
}

main()

Not sure if this helps. This is a reproduction branch.

@overlookmotel
Author

@hyf0 I've replied on rolldown/rolldown#706 (which I think is what prompted your comment above).
