Instrumentation produces false measurements for NodeJS app using NAPI #96

Open
overlookmotel opened this issue Mar 16, 2024 · 4 comments

overlookmotel commented Mar 16, 2024

I'm trying to benchmark a NodeJS app which calls into a NodeJS native module (built from Rust via napi-rs).

CodSpeed's instrumentation appears to slow down the app very considerably, making it hard to accurately measure performance.

To get a fair comparison, I'm running the same benchmark on GitHub Actions with and without CodSpeed.

Without CodSpeed:

Running benchmarks with CodSpeed DISABLED
┌─────────┬────────────────────────────────────────────┬─────────┬────────────────────┬──────────┬─────────┐
│ (index) │ Task Name                                  │ ops/sec │ Average Time (ns)  │ Margin   │ Samples │
├─────────┼────────────────────────────────────────────┼─────────┼────────────────────┼──────────┼─────────┤
│ 0       │ 'parser(napi)[checker.ts]'                 │ '4'     │ 238333877.50000006 │ '±4.73%' │ 10      │
│ 1       │ 'parser(napi)[cal.com.tsx]'                │ '8'     │ 118107410.40000008 │ '±1.21%' │ 10      │
│ 2       │ 'parser(napi)[RadixUIAdoptionSection.jsx]' │ '4,068' │ 245803.32628994508 │ '±0.68%' │ 2035    │
│ 3       │ 'parser(napi)[pdf.mjs]'                    │ '13'    │ 75663396.4999999   │ '±4.10%' │ 10      │
│ 4       │ 'parser(napi)[antd.js]'                    │ '1'     │ 545470710.6000001  │ '±3.60%' │ 10      │
└─────────┴────────────────────────────────────────────┴─────────┴────────────────────┴──────────┴─────────┘

With CodSpeed:

Running benchmarks with CodSpeed ENABLED
┌─────────┬────────────────────────────────────────────┬─────────┬────────────────────┬──────────┬─────────┐
│ (index) │ Task Name                                  │ ops/sec │ Average Time (ns)  │ Margin   │ Samples │
├─────────┼────────────────────────────────────────────┼─────────┼────────────────────┼──────────┼─────────┤
│ 0       │ 'parser(napi)[checker.ts]'                 │ '0'     │ 1676626826.099999  │ '±0.53%' │ 10      │
│ 1       │ 'parser(napi)[cal.com.tsx]'                │ '0'     │ 1028924682.7999946 │ '±0.45%' │ 10      │
│ 2       │ 'parser(napi)[RadixUIAdoptionSection.jsx]' │ '618'   │ 1615957.9129023065 │ '±0.81%' │ 310     │
│ 3       │ 'parser(napi)[pdf.mjs]'                    │ '1'     │ 568098107.800007   │ '±5.44%' │ 10      │
│ 4       │ 'parser(napi)[antd.js]'                    │ '0'     │ 3961387613.2999954 │ '±0.92%' │ 10      │
└─────────┴────────────────────────────────────────────┴─────────┴────────────────────┴──────────┴─────────┘

GitHub Actions results: https://github.com/oxc-project/oxc/actions/runs/8308255059
Benchmark task: .github/workflows/benchmark.yml
Benchmark code: napi/parser/parse.bench.mjs

NB: The 2nd set of results is not from within the withCodspeed()-wrapped Bench object, because CodSpeed's wrapper disables output of the results table. It appears that CodSpeed's instrumentation slows down everything in the process, not just the code which runs within the withCodspeed() wrapper.

These benchmarks use tinybench. I have also tried Vitest and benchmark.js, but they produce similar results.
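
For context, the benchmark wiring is roughly the standard CodSpeed tinybench setup. A minimal sketch, assuming the @codspeed/tinybench-plugin API rather than the exact oxc benchmark file:

import { Bench } from 'tinybench'
import { withCodSpeed } from '@codspeed/tinybench-plugin'

const bench = withCodSpeed(new Bench())
bench.add('parser(napi)[checker.ts]', () => {
  // parse the fixture via the napi binding here
})
await bench.run()
// As noted above, this table output is disabled when running under the wrapper
console.table(bench.table())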

Worse, CodSpeed's instrumentation appears to be preventing accurate assessment of the relative speed-up/slow-down of changes. When run locally (MacBook Pro M1) or in CI without CodSpeed, this benchmark shows "napi raw" to be about 3x faster than "napi". But under CodSpeed, the results show only a marginal improvement, or in some cases show what is in reality a 3x speed-up as a slight slow-down.

I suspect the overhead CodSpeed introduces (and the variance within it) is masking the actual performance gain.

Running benchmarks with CodSpeed DISABLED
┌─────────┬────────────────────────────────────────────────┬─────────┬────────────────────┬──────────┬─────────┐
│ (index) │ Task Name                                      │ ops/sec │ Average Time (ns)  │ Margin   │ Samples │
├─────────┼────────────────────────────────────────────────┼─────────┼────────────────────┼──────────┼─────────┤
│ 0       │ 'parser(napi)[checker.ts]'                     │ '4'     │ 236687380.7000001  │ '±4.51%' │ 10      │
│ 1       │ 'parser(napi raw)[checker.ts]'                 │ '9'     │ 105366553.09999992 │ '±8.60%' │ 10      │
│ 2       │ 'parser(napi)[cal.com.tsx]'                    │ '8'     │ 117207075.80000016 │ '±3.82%' │ 10      │
│ 3       │ 'parser(napi raw)[cal.com.tsx]'                │ '19'    │ 52539795.70000011  │ '±6.14%' │ 10      │
│ 4       │ 'parser(napi)[RadixUIAdoptionSection.jsx]'     │ '4,062' │ 246165.42568895963 │ '±2.25%' │ 2032    │
│ 5       │ 'parser(napi raw)[RadixUIAdoptionSection.jsx]' │ '8,890' │ 112482.59626630507 │ '±7.12%' │ 4446    │
│ 6       │ 'parser(napi)[pdf.mjs]'                        │ '13'    │ 71447427.5999998   │ '±5.49%' │ 10      │
│ 7       │ 'parser(napi raw)[pdf.mjs]'                    │ '30'    │ 32338048.750000097 │ '±5.52%' │ 16      │
│ 8       │ 'parser(napi)[antd.js]'                        │ '1'     │ 522781748.29999983 │ '±2.70%' │ 10      │
│ 9       │ 'parser(napi raw)[antd.js]'                    │ '4'     │ 203665995.90000004 │ '±3.86%' │ 10      │
└─────────┴────────────────────────────────────────────────┴─────────┴────────────────────┴──────────┴─────────┘
Running benchmarks with CodSpeed ENABLED
┌─────────┬────────────────────────────────────────────────┬─────────┬────────────────────┬──────────┬─────────┐
│ (index) │ Task Name                                      │ ops/sec │ Average Time (ns)  │ Margin   │ Samples │
├─────────┼────────────────────────────────────────────────┼─────────┼────────────────────┼──────────┼─────────┤
│ 0       │ 'parser(napi)[checker.ts]'                     │ '0'     │ 1712106154.2999873 │ '±0.56%' │ 10      │
│ 1       │ 'parser(napi raw)[checker.ts]'                 │ '0'     │ 2053894367.2000022 │ '±2.32%' │ 10      │
│ 2       │ 'parser(napi)[cal.com.tsx]'                    │ '0'     │ 1035820193.2000054 │ '±0.40%' │ 10      │
│ 3       │ 'parser(napi raw)[cal.com.tsx]'                │ '1'     │ 980151598.3000107  │ '±3.11%' │ 10      │
│ 4       │ 'parser(napi)[RadixUIAdoptionSection.jsx]'     │ '473'   │ 2112853.6624468486 │ '±9.21%' │ 237     │
│ 5       │ 'parser(napi raw)[RadixUIAdoptionSection.jsx]' │ '583'   │ 1713314.8047957036 │ '±2.26%' │ 292     │
│ 6       │ 'parser(napi)[pdf.mjs]'                        │ '1'     │ 570087846.4000125  │ '±5.78%' │ 10      │
│ 7       │ 'parser(napi raw)[pdf.mjs]'                    │ '1'     │ 709747156.7000029  │ '±3.52%' │ 10      │
│ 8       │ 'parser(napi)[antd.js]'                        │ '0'     │ 3989519725.7999973 │ '±0.86%' │ 10      │
│ 9       │ 'parser(napi raw)[antd.js]'                    │ '0'     │ 4119213271.400001  │ '±0.18%' │ 10      │
└─────────┴────────────────────────────────────────────────┴─────────┴────────────────────┴──────────┴─────────┘

https://github.com/oxc-project/oxc/actions/runs/8308560738

I don't know whether this problem is related to the use of native addons within the NodeJS code under test, or whether it's a more general problem.

I would be more than happy to help in any way I can to diagnose/fix this. CodSpeed is absolutely brilliant for Rust code, and we would very much like to expand our usage to NodeJS code too. Please just let me know what (if anything) I can do.

Boshen pushed a commit to oxc-project/oxc that referenced this issue Mar 16, 2024
#2724 added CodSpeed benchmarks for NodeJS `oxc-parser`.

Unfortunately it turns out CodSpeed's results are wildly inaccurate.
It's unclear why, but I have raised an issue with CodSpeed
(CodSpeedHQ/action#96). In the meantime it seems
best to remove the benchmarks as they're not useful at present.
@adriencaccia
Member

Hey @overlookmotel,

The slowdown of the whole code execution in the action is to be expected since the whole command is run under valgrind.

Regarding the problem of the relative performance gains/regressions, this might be linked with file system accesses and their handling in our measurement. I will take a look at the execution profiles to try and sort that out.

I will keep you posted here.

@overlookmotel
Author

Thanks loads for coming back. I did expect there'd be some overhead, but an 8x slowdown was more than I expected. I guess it doesn't matter much, as long as it affects everything equally, so comparisons can still be made.

The relative performance measure is more problematic.

The benchmark where I saw weird results does not involve any file system access.

It's basically comparing 2 methods of transferring data from Rust to JS.

  • In parseSync ("napi"), Rust outputs a JSON string and JS just calls JSON.parse() on it.
  • In parseSyncRaw ("napi raw"), Rust outputs a buffer and there's a deserializer written in JS which converts it to JS objects.

So the main differences between the two are:

  1. Passing string vs buffer via napi-rs.
  2. Just a single JSON.parse call vs a large amount of JS execution.
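
In code terms, the two paths look roughly like this. A sketch only, assuming placeholder names (nativeParseToJson, nativeParseToBuffer, deserialize) rather than the actual oxc-parser bindings:

// Placeholder declarations for the native bindings and the JS deserializer.
declare function nativeParseToJson(source: string): string
declare function nativeParseToBuffer(source: string): Uint8Array
declare function deserialize(buffer: Uint8Array): unknown

// parseSync ("napi"): one string transfer via napi-rs, then a single JSON.parse call.
function parseViaJson(source: string): unknown {
  return JSON.parse(nativeParseToJson(source))
}

// parseSyncRaw ("napi raw"): one buffer transfer via napi-rs, then a large amount
// of JS execution in the deserializer to build the JS objects.
function parseViaBuffer(source: string): unknown {
  return deserialize(nativeParseToBuffer(source))
}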

overlookmotel added a commit to oxc-project/oxc that referenced this issue Mar 20, 2024
Add NodeJS parser to benchmarks.

Previous attempt #2724 did not work due to CodSpeed producing very
inaccurate results (CodSpeedHQ/action#96).

This version runs the actual benchmarks without CodSpeed's
instrumentation. Then another faux-benchmark runs within CodSpeed's
instrumented action and just performs meaningless calculations in a loop
for as long as is required to take the same amount of time as the original
uninstrumented benchmarks took.

It's unfortunate that we therefore don't get flame graphs on CodSpeed,
but this seems to be the best we can do for now.
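
For reference, a calibrated faux benchmark along these lines could look roughly as follows. A sketch only, assuming tinybench and a made-up timings file name rather than the actual oxc setup:

import { readFileSync } from 'node:fs'
import { Bench } from 'tinybench'

// Mean times (in ms) recorded by the real, uninstrumented benchmark run.
// 'bench-timings.json' is a placeholder name.
const timings: Record<string, number> = JSON.parse(
  readFileSync('bench-timings.json', 'utf8'),
)

const bench = new Bench()
for (const [name, meanMs] of Object.entries(timings)) {
  bench.add(name, () => {
    // Spin on meaningless arithmetic until the real benchmark's mean duration
    // has elapsed, so the instrumented run takes a proportional amount of time.
    const end = performance.now() + meanMs
    let x = 0
    while (performance.now() < end) x += Math.sqrt(x + 1)
  })
}
await bench.run()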

hyf0 commented Mar 31, 2024

I'm not sure how CodSpeed is implemented internally. I'm using setTimeout to fake the real elapsed time measured by the benchmarks when run without the CodSpeed plugin. However, CodSpeed still does not report them correctly.

CodSpeed results: [screenshot]

Real benchmark data (ms): [screenshot]

import nodeFs from 'node:fs'
import nodePath from 'node:path'
import nodeAssert from 'node:assert'
import { bench } from 'vitest' // assuming the Vitest bench runner with the CodSpeed plugin
// PROJECT_ROOT and suitesForCI come from the project's own config (imports omitted here)

async function sleep(ms: number) {
  await new Promise((resolve) => globalThis.setTimeout(resolve, ms))
}

function main() {
  // Timings recorded by the real (uninstrumented) benchmark run
  const realBenchData = JSON.parse(
    nodeFs.readFileSync(
      nodePath.join(PROJECT_ROOT, 'dist/ci-bench-data.json'),
      'utf8',
    ),
  )

  console.log('realBenchData:')
  console.table(realBenchData)

  // Register a fake benchmark per suite that just sleeps for the real mean time
  for (const suite of suitesForCI) {
    const realData = realBenchData[suite.title]
    const realDataSourceMap = realBenchData[`${suite.title}-sourcemap`]
    nodeAssert(realData != null)
    nodeAssert(realDataSourceMap != null)
    bench(suite.title, async () => {
      await sleep(realData.mean)
    })
    bench(`${suite.title}-sourcemap`, async () => {
      await sleep(realDataSourceMap.mean)
    })
  }
}

main()

Not sure if this helps. This is a reproduction branch.

@overlookmotel
Author

@hyf0 I've replied on rolldown/rolldown#706 (which I think is what prompted your comment above).
