-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cli: push tsdump upload failed metrics to datadog logs #137250
cli: push tsdump upload failed metrics to datadog logs #137250
Conversation
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR? 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 2 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aa-joshi and @angles-n-daemons)
-- commits
line 15 at r1:
Last 3 lines got in from the commit message template
pkg/cli/tsdump_upload.go
line 447 at r1 (raw file):
fmt.Println("\nPushing failed metrics to datadog logs...") for metric := range metricsUploadState.uploadFailedMetrics { wg.Add(1)
I have not seen a waitGroup being reused before (i.e calling Add()
again after a Wait()
call). Have you verified that this works?
pkg/cli/tsdump_upload.go
line 449 at r1 (raw file):
wg.Add(1) go func() { logMessage := fmt.Sprintf(logMessageFormat, metric)
The metric
should be passed as an arg to the anonymous function. Or this will have flaky behaviour. Because the it's being read concurrently and the underlying value can be mutated by the for loop. I am surprised this did not get flagged by the linter 🤔
for metric := range metricsUploadState.uploadFailedMetrics {
go func(m type) {
...
}(metric)
}
OR I think you can also do this
for metric := range metricsUploadState.uploadFailedMetrics {
metric := metric // create a locally scoped copy of metric
go func() {
...
}()
}
pkg/cli/tsdump_upload.go
line 474 at r1 (raw file):
fmt.Println("Failed to pushed some metrics to datadog logs. Please refer CLI output for all failed metrics.") } else { fmt.Println("Pushing failed metrics to datadog logs...done")
NIT
Something along this line might be more clear
Suggestion:
fmt.Println("Pushing logs of metric upload failures to datadog...done")
52db9f4
to
82eeb24
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @angles-n-daemons and @arjunmahishi)
Previously, arjunmahishi (Arjun Mahishi) wrote…
Last 3 lines got in from the commit message template
Thanks! I missed to remove them.
pkg/cli/tsdump_upload.go
line 447 at r1 (raw file):
Previously, arjunmahishi (Arjun Mahishi) wrote…
I have not seen a waitGroup being reused before (i.e calling
Add()
again after aWait()
call). Have you verified that this works?
Yes, I have verified the behaviour in local testing.
pkg/cli/tsdump_upload.go
line 449 at r1 (raw file):
Previously, arjunmahishi (Arjun Mahishi) wrote…
The
metric
should be passed as an arg to the anonymous function. Or this will have flaky behaviour. Because the it's being read concurrently and the underlying value can be mutated by the for loop. I am surprised this did not get flagged by the linter 🤔for metric := range metricsUploadState.uploadFailedMetrics { go func(m type) { ... }(metric) }OR I think you can also do this
for metric := range metricsUploadState.uploadFailedMetrics { metric := metric // create a locally scoped copy of metric go func() { ... }() }
Interesting! I didn't notice concurrency issue during my testing. The CLI & datadog line count was matching. However, it makes more sense to pass them as parameter. Thanks!
pkg/cli/tsdump_upload.go
line 474 at r1 (raw file):
Previously, arjunmahishi (Arjun Mahishi) wrote…
NIT
Something along this line might be more clear
DONE
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @angles-n-daemons)
82eeb24
to
add416c
Compare
Previously, we were displaying failed tsdump upload metrics on CLI output. This was inadequate because CLI output might get truncated in case of high number of metric failure. To address this, this patch ships failed metrics to datadog as part of logs so that we can see all failed uploaded metrics as part of logs. Part of: CRDB-44835 Epic: None Release note: None
add416c
to
160db6f
Compare
TFTR! bors r+ |
Previously, we were displaying failed tsdump upload metrics on CLI output. This was inadequate because CLI output might get truncated in case of high number of metric failure. To address this, this patch ships failed metrics to datadog as part of logs so that we can see all failed uploaded metrics as part of logs.
Part of: CRDB-44835
Epic: None
Release note: None
Log snippet: