Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dns: implement light dns caching #49560

Closed
wants to merge 15 commits into from

Conversation

kshitjj
Copy link

@kshitjj kshitjj commented Sep 8, 2023

Cache DNS results very lightly in node. Strawman: Cache for 1 second, refresh in the background, and delete cache after 1 second.

Fixes: #49394

@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/net

@nodejs-github-bot nodejs-github-bot added dns Issues and PRs related to the dns subsystem. needs-ci PRs that need a full CI run. labels Sep 8, 2023
Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need a better algorithm for this that avoids concurrent cache misses to trigger multiple dns requests.

lib/dns.js Outdated Show resolved Hide resolved
a lookup for, say, "constructor" or "toString" results in a cache hit

Co-authored-by: Ben Noordhuis <[email protected]>
@kshitjj
Copy link
Author

kshitjj commented Sep 9, 2023

@mcollina

I think I can implement a locking mechanism that would solve the concurrent cache missing problem.

@mcollina
Copy link
Member

mcollina commented Sep 9, 2023

I think it should cache the lookup operation.

@mcollina
Copy link
Member

mcollina commented Sep 9, 2023

cc @jasnell

lib/dns.js Outdated Show resolved Hide resolved
@ShogunPanda
Copy link
Contributor

ShogunPanda commented Sep 11, 2023

I think we need a better algorithm for this that avoids concurrent cache misses to trigger multiple dns requests.

I would go for a separate object that tracks all pending requests by hostname (and all other options, to be accurate). This way every time a lookup finishes, all pending request for that host can benefit from the single resolve.

I was thinking to something like:

pending[`${hostname}:${family}:${options]`].push(callback);

// ...

for(const cb of pending[`${hostname}:${family}:${options]`]) {
  process.nextTick(cb, null, result);
}

My only concern here is to have an option to disable such feature. If some hosts are configured with DNS Round Robin, caching results (or return a single lookup response for all concurrent lookups request) might interfere with it.

@kshitjj
Copy link
Author

kshitjj commented Sep 11, 2023

@mcollina @ShogunPanda I tested the new algorithm against multiple concurrent DNS requests with different websites, and it works just fine.

I have written the test script now I am tweaking it so it's by the node guideline.

Result for concurrent dns requests:
https://pastebin.com/3AxwzG5g
there were no misses.

@ShogunPanda
Copy link
Contributor

Can you also upload the test script?
Also, I think now in the callbacks you're not handling the all case.

@kshitjj
Copy link
Author

kshitjj commented Sep 11, 2023

@ShogunPanda I don't think there's an all condition in req.callback, as we are not passing all in callback in any way.

Working on the test script right now

@ShogunPanda
Copy link
Contributor

No, I'm talking about having all: true in the options, which changes the callback signature.

See: https://nodejs.org/dist/latest-v20.x/docs/api/dns.html#dnslookuphostname-options-callback

@kshitjj kshitjj force-pushed the async-dns-resolution branch from 5b28d66 to c32745b Compare September 11, 2023 17:05
lib/dns.js Outdated
function lookup(hostname, options, callback) {
let hints = 0;
let family = 0;
let all = false;
let verbatim = getDefaultVerbatim();

const cacheKey = `${hostname}_${family || 'default'}`;
Copy link
Member

@Linkgoron Linkgoron Sep 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this (and the code added before the if(!hostname) should move to before const req= new GetAddrInfoReqWrap(). The hostname variable should be used only after it passes validation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cache key should probably include if options.all is true or not

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought if there's a dns key for that domain that means it passed the validation earlier so there was no need for validating the hostname again. I could be wrong though

Also having the all in dns key makes a lot of sense I will add that.

lib/dns.js Outdated Show resolved Hide resolved
@mcollina
Copy link
Member

Good work! I think this would need some additional tests.

@kshitjj
Copy link
Author

kshitjj commented Sep 12, 2023

No, I'm talking about having all: true in the options, which changes the callback signature.

See: https://nodejs.org/dist/latest-v20.x/docs/api/dns.html#dnslookuphostname-options-callback

Oh I am sorry I didn't get you earlier, I will make callbacks for all condition.

lib/dns.js Outdated Show resolved Hide resolved
@@ -212,8 +245,22 @@ function lookup(hostname, options, callback) {
return {};
}

const cacheKey = `${hostname}_${family || 'default'}`;
const cachedResult = dnsCache[cacheKey];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A read-through cache approach would be a bit simpler here. essentially something like

cache.get(cacheKey, callback, () => {
  // If the cache lookup is a miss, this function is called to do the lookup and cache the result of the lookup.
});

This commit replaces the timer based approach for purging the expired
dns cache with lazy cleanup approach. The DNS cache is now cleaned
before adding a new DNS entry and before performing a lookup.

This change simplifies the code and eliminates the need for a separate
cleanup timer.
@kshitjj kshitjj force-pushed the async-dns-resolution branch from 49c4df3 to 090a1a0 Compare September 16, 2023 17:59
lib/dns.js Outdated
const currentTime = DateNow();
for (const key in dnsCache) {
if (currentTime - dnsCache[key].timestamp >= 1000) {
delete dnsCache[key];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to keep the object rather than a Map, just do a dnsCache[key] = undefined.

@mcollina mcollina added the semver-major PRs that contain breaking changes and should be released in the next major version. label Sep 19, 2023
@andymac4182
Copy link

Any chance of this landing before V21?

We have had to build our own version of DNS caching to solve this. It heavily impacts the scaleability of our system.

@kshitjj
Copy link
Author

kshitjj commented Oct 2, 2023

@andymac4182

I am sorry for the inconvenience, I will try my best to close this issue within this week.

@ShogunPanda
Copy link
Contributor

No worries, if this can't land in Node v21.0.0 as semver-major, we can modify it to be under a flag and therefore include it in v21.x.0 as semver-minor.

@andymac4182
Copy link

andymac4182 commented Oct 4, 2023

That would be great to be under a flag and shipped in V20 👍

@ShogunPanda Not sure if you got your versions miss written but V20.9 would be awesome to land this in 😉

@kshitjj
Copy link
Author

kshitjj commented Oct 6, 2023

test/parallel/test-dns-set-default-order, test-dns-default-verbatim-false, test-dns-default-verbatim-true is failing because getaddrinfo is being called twice instead of three as DNS cache is being used, so it isn't being looked up through getaddrinfo

The only way to pass the test is by changing the test

Rest of the tests are passing.

@kshitjj kshitjj closed this Jul 11, 2024
@kshitjj kshitjj deleted the async-dns-resolution branch January 8, 2025 00:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dns Issues and PRs related to the dns subsystem. needs-ci PRs that need a full CI run. semver-major PRs that contain breaking changes and should be released in the next major version.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add global flag for enabling asynchronous DNS resolution
9 participants