Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report GCP profiles from zoekt-git-index #816

Merged
merged 3 commits into from
Sep 9, 2024
Merged

Report GCP profiles from zoekt-git-index #816

merged 3 commits into from
Sep 9, 2024

Conversation

jtibshirani
Copy link
Member

@jtibshirani jtibshirani commented Sep 6, 2024

This PR initializes the GCP profiler in the zoekt-git-index process so we can
examine CPU and memory usage for the indexing process itself.

@cla-bot cla-bot bot added the cla-signed label Sep 6, 2024
@jtibshirani jtibshirani changed the title Draft: report profiles from zoekt-git-index @jtibshirani Report GCP profiles from zoekt-git-index Sep 6, 2024
@jtibshirani jtibshirani changed the title @jtibshirani Report GCP profiles from zoekt-git-index Report GCP profiles from zoekt-git-index Sep 6, 2024
@jtibshirani jtibshirani marked this pull request as ready for review September 6, 2024 23:01
@jtibshirani
Copy link
Member Author

Looking for feedback on this one :) It's a bit weird to be initializing the GCP profiler in a subprocess, especially since GCP collects data every 1 minute for 10 seconds. However, the data looks quite good from local testing:

Screenshot 2024-09-06 at 3 44 43 PM

And in many of the cases we want to improve, indexing can take several minutes, and we can probably learn from this data. So it feels fine to have this as long as we understand the slow sampling rate compared to the process duration. What do you think?

@jtibshirani jtibshirani requested a review from a team September 6, 2024 23:07
@@ -107,6 +110,7 @@ func run() int {
opts.LanguageMap[m[0]] = ctags.StringToParser(m[1])
}

profiler.InitLightweight("zoekt-git-index", zoekt.Version)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reasoning for having a separate profiler config? Is mutex profiling polluting zoekt-git-index and that's not the case for webserver or indexserver?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, interesting in why we need a lighterweight indexing config here. Given this will only trigger for slower runs of zoekt-git-index, more info seems useful?

Copy link
Member Author

@jtibshirani jtibshirani Sep 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thing I found concerning was AllocForceGC: true. If I'm understanding correctly, it means that enabling profiling could significantly impact memory usage (lower peak usage due to more frequent GCs). So you could have weird things like "we switched off the GCP profiler and now memory usage is much higher."

However, it looks like we do this all over the place, including sg/sg, and I haven't heard of a problem. I'm probably just scarred from my Java days where WAY more stuff is on the heap and GC behavior is very core to memory usage.

I'll remove InitLightweight!

Copy link
Member

@keegancsmith keegancsmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh man I thought up so much crazy overengineering to accomplish this. This is so nice and simple, awesome!

And in many of the cases we want to improve, indexing can take several minutes, and we can probably learn from this data.

yeah in fact that is perfect to only have our slow index jobs. I suspect the vast majority of zoekt-git-index calls finish in seconds, so only seeing the slow ones is great for having a better understanding of what we need to improve.

internal/profiler/profiler.go Outdated Show resolved Hide resolved
internal/profiler/profiler.go Outdated Show resolved Hide resolved
@@ -21,3 +21,17 @@ func Init(svcName, version string, blockProfileRate int) {
}
}
}

// InitLightweight starts the supported profilers IFF the environment variable is set.
// Compared to Init, it disables mutex profiling and forced GC to reduce its overhead.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually don't know why we do AllocForceGC given the default is off. I wonder if it has a measurable impact on our perf. I suspect not. Then it makes sense. Why don't we use it for zoekt-git-index then?

@@ -107,6 +110,7 @@ func run() int {
opts.LanguageMap[m[0]] = ctags.StringToParser(m[1])
}

profiler.InitLightweight("zoekt-git-index", zoekt.Version)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, interesting in why we need a lighterweight indexing config here. Given this will only trigger for slower runs of zoekt-git-index, more info seems useful?

@jtibshirani jtibshirani merged commit 1ceac75 into main Sep 9, 2024
9 checks passed
@jtibshirani jtibshirani deleted the jtibs/profile branch September 9, 2024 17:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants