
contrib/segmentio/kafka.go.v0: add DSM support #2625

Merged
merged 1 commit into DataDog:main on Apr 5, 2024

Conversation

adrien-f (Contributor)

What does this PR do?

Add Data Streams Monitoring support to segmentio/kafka integration

Motivation

We currently use segmentio/kafka for our Go projects, and it was easier to update the integration than to switch libraries.

Reviewer's Checklist

  • Changed code has unit tests for its functionality at or near 100% coverage.
  • System-Tests covering this feature have been added and enabled with the va.b.c-dev version tag.
  • There is a benchmark for any new code, or changes to existing code.
  • If this interacts with the agent in a new way, a system test has been added.
  • Add an appropriate team label so this PR gets put in the right place for the release notes.
  • Non-trivial go.mod changes, e.g. adding new modules, are reviewed by @DataDog/dd-trace-go-guild.

For Datadog employees:

  • If this PR touches code that handles credentials of any kind, such as Datadog API keys, I've requested a review from @DataDog/security-design-and-guidance.
  • This PR doesn't touch any of that.

Unsure? Have a question? Request a review!

@adrien-f adrien-f requested a review from a team as a code owner March 21, 2024 14:22
@darccio darccio added apm:ecosystem contrib/* related feature requests or bugs needs-triage New issues that have not yet been triaged labels Mar 21, 2024
datastreams.InjectToBase64Carrier(ctx, carrier)
}

func getProducerMsgSize(msg *kafka.Message) (size int64) {
Contributor Author

There is an alternative private implementation here:

https://github.com/segmentio/kafka-go/blob/ebca72eaee918d303c532feb7ff29afdcd8c2efa/message.go#L48-L50

I've kept the approach used in the other integrations.
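
For reference, the convention in the other kafka contribs is to report the message size as the sum of the key, value, and header bytes. A self-contained sketch of that convention (the `Header` and `Message` types here are minimal stand-ins for segmentio/kafka-go's real types, included only so the example compiles on its own):

```go
package main

import "fmt"

// Header and Message are minimal stand-ins for segmentio/kafka-go's
// types, included only so this sketch is self-contained.
type Header struct {
	Key   string
	Value []byte
}

type Message struct {
	Key     []byte
	Value   []byte
	Headers []Header
}

// getProducerMsgSize follows the convention used in the other kafka
// contribs: the reported payload size is the sum of the key, value,
// and header bytes.
func getProducerMsgSize(msg *Message) (size int64) {
	for _, header := range msg.Headers {
		size += int64(len(header.Key) + len(header.Value))
	}
	size += int64(len(msg.Value))
	size += int64(len(msg.Key))
	return size
}

func main() {
	msg := &Message{
		Key:     []byte("k"),
		Value:   []byte("hello"),
		Headers: []Header{{Key: "traceparent", Value: []byte("00")}},
	}
	fmt.Println(getProducerMsgSize(msg)) // prints 19 (1 + 5 + 11 + 2)
}
```

The alternative private implementation in kafka-go linked above computes a protocol-level size; the contrib convention above only counts application-visible bytes, which keeps the metric consistent across integrations.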

@darccio (Member) commented Mar 21, 2024

Hi @adrien-f, thanks for contributing this feature. I approved running the CI pipeline and there are some things to fix:

  • It seems to fail in kafka_go.go:210 with this error.
  • A little nitpick here: contrib/segmentio/kafka.go.v0/kafka.go:61:2: var-naming: struct field groupId should be groupID (revive)

@adrien-f (Contributor Author)

Many thanks @darccio !

I've pushed the groupID change 👍

Regarding the issue reported in CI, it appears to be caused by the -race flag; I'll investigate what's going on there.

@adrien-f (Contributor Author)

I believe I corrected the issue 👍

Happy weekend,

@darccio darccio removed the needs-triage New issues that have not yet been triaged label Mar 22, 2024
@@ -46,13 +48,17 @@ func WrapReader(c *kafka.Reader, opts ...Option) *Reader {
if c.Config().Brokers != nil {
	wrapped.bootstrapServers = strings.Join(c.Config().Brokers, ",")
}

wrapped.groupID = c.Config().GroupID
Member

@adrien-f Comparing with the Shopify/sarama contrib, groupID comes from WithGroupID there instead. I think it's better to stay consistent with the other contrib and avoid changing the default behaviour (working without a GroupID).

Do you mind adding the option and removing this line?

Contributor Author

For sure. I'm also looking at the confluent-kafka-go integration, which has a WithConfig option:

if groupID, err := cg.Get("group.id", ""); err == nil {
	cfg.groupID = groupID.(string)
}

Which is called through this:

opts = append(opts, WithConfig(conf))
return WrapConsumer(c, opts...), nil

I like this approach of enhancing the constructor transparently: users adopting DSM only need to update the library and turn on the environment variable, rather than passing the groupID explicitly.

This might be too broad a change for this PR, though. What do you think?

Member

@adrien-f If you want, I can contribute the change. It won't take me too much time. I'd rather avoid introducing a subtle breaking change.

Contributor Author

No worries, I just wanted to be sure :) I've pushed the change!

Member

@adrien-f I don't see any new commit.

Contributor

@darccio @adrien-f the reason Sarama has this group ID option is that the consumer group is not available otherwise (consumer groups there are used through a different API that we currently don't support; see #2133).

In general, the implementation we should look to for reference is the confluent-kafka one: https://github.com/DataDog/dd-trace-go/blob/main/contrib/confluentinc/confluent-kafka-go/kafka.v2/option.go#L114-L116, where the groupID is fetched from the kafka config actually used by the consumer.

@darccio (Member) left a comment

LGTM

@darccio darccio requested a review from rarguelloF March 27, 2024 08:26
@adrien-f (Contributor Author)

> LGTM

Thanks for the review :shipit: !

@darccio (Member) commented Mar 27, 2024

> Thanks for the review :shipit: !

You're welcome. I'll get a second review from Ecosystems (@rarguelloF), but it won't be done until next week (Holy Week is close, so some Datadogs are already OOO).

Comment on lines 90 to 95
// WithGroupID tags the produced data streams metrics with the given groupID (aka consumer group)
func WithGroupID(groupID string) Option {
	return func(cfg *config) {
		cfg.groupID = groupID
	}
}
Contributor

I think this should be fetched from the actual consumer config, right?

You can access it in the WrapReader function using kafka.Reader.Config().GroupID.
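
A self-contained sketch of that suggestion (the `kafkaReader`, `ReaderConfig`, and `config` types here are minimal stand-ins for segmentio/kafka-go's `Reader` and the contrib's option struct, for illustration only):

```go
package main

import (
	"fmt"
	"strings"
)

// ReaderConfig and kafkaReader stand in for segmentio/kafka-go's
// Reader and its Config() accessor.
type ReaderConfig struct {
	Brokers []string
	GroupID string
}

type kafkaReader struct{ c ReaderConfig }

func (r *kafkaReader) Config() ReaderConfig { return r.c }

// config stands in for the contrib's option struct.
type config struct{ groupID string }

// Reader wraps a kafka reader together with the tracing config.
type Reader struct {
	*kafkaReader
	cfg              *config
	bootstrapServers string
}

// WrapReader reads the consumer group straight from the reader's own
// config, so callers adopting DSM don't need to pass it explicitly.
func WrapReader(r *kafkaReader) *Reader {
	wrapped := &Reader{kafkaReader: r, cfg: &config{}}
	if r.Config().Brokers != nil {
		wrapped.bootstrapServers = strings.Join(r.Config().Brokers, ",")
	}
	wrapped.cfg.groupID = r.Config().GroupID
	return wrapped
}

func main() {
	r := &kafkaReader{c: ReaderConfig{Brokers: []string{"localhost:9092"}, GroupID: "orders"}}
	w := WrapReader(r)
	fmt.Println(w.cfg.groupID, w.bootstrapServers) // prints "orders localhost:9092"
}
```

This mirrors the confluent-kafka contrib's behaviour of deriving the group ID from the configuration the consumer actually uses, which is what the PR ultimately settled on.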

@adrien-f (Contributor Author) commented Apr 2, 2024

Hey :) This is what I had at first before changing it following #2625 (comment)

How would you see it?

Contributor

I replied here: #2625 (comment)

Sorry for the confusion, @adrien-f! TL;DR: the reason we have that option in Sarama is that we have no way to access the groupID there, but in this contrib and the confluent-kafka one we can read it from the config, which makes this WithGroupID option unnecessary.

Contributor Author

Thanks for the explanation! I've restored this behavior 👍

@darccio darccio enabled auto-merge (squash) April 4, 2024 08:40
@darccio (Member) commented Apr 4, 2024

@adrien-f Please merge main into your branch. That will trigger CI and the auto-merge I just activated. Thanks!

auto-merge was automatically disabled April 4, 2024 17:33

Head branch was pushed to by a user without write access

@adrien-f (Contributor Author) commented Apr 4, 2024

@darccio I used the rebase button, but it looks like that disabled the auto-merge, sorry about that 🙏

@darccio (Member) commented Apr 5, 2024

/merge

@dd-devflow (bot) commented Apr 5, 2024

🚂 MergeQueue

This merge request is not mergeable yet, because of pending checks/missing approvals. It will be added to the queue as soon as checks pass and/or get approvals.
Note: if you pushed new commits since the last approval, you may need additional approval.
You can remove it from the waiting list with /remove command.

Use /merge -c to cancel this operation!

@dd-devflow (bot) commented Apr 5, 2024

🚂 MergeQueue

Pull request added to the queue.

This build is going to start soon! (estimated merge in less than 0s)

Use /merge -c to cancel this operation!

@darccio (Member) commented Apr 5, 2024

/remove

@dd-devflow (bot) commented Apr 5, 2024

🚂 Devflow: /remove

@dd-devflow (bot) commented Apr 5, 2024

⚠️ MergeQueue

This merge request build was cancelled

If you need support, contact us on Slack #devflow!

@darccio darccio merged commit 531cae3 into DataDog:main Apr 5, 2024
98 of 101 checks passed