-
Notifications
You must be signed in to change notification settings - Fork 141
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable the consumer to commit an offset with metadata #1067
Conversation
@flavienbert I have in my plans to merge commits into a single commit. Is that still possible when there is metadata attached to the commit? |
@erikvanoosten yes I think. The apache kafka consumer client take an |
Ah I see. When we merge commits, per partition we keep the I'll study this PR a bit more (especially for breaking changes) but so far it looks good to me. |
One point not handle by this PR: the metadata from the
This PR only able to publish metadata from consumer stream, not retrieve it when you consume a stream. Is it something you want to add too @erikvanoosten? I don't really know how to handle it because the
I believe this feature is not possible. It seems we can only get metadata with the Admin client |
We prefer smaller PRs. So another issue/pr can be created for this. |
As I mentioned in the previous comment, I think it's not possible to retrieve the metadata from a ZStream with the Consumer because the API |
Makes sense, the existence of records and commits are unrelated. |
Maybe we should put this method private: |
Are you referring to a I don't like that because it will mean that you can set the metadata but you can't read it back. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not done yet, but I wanted to submit these comments already.
@@ -349,6 +349,30 @@ object ConsumerSpec extends ZIOSpecDefaultSlf4j with KafkaRandom { | |||
.provideSomeLayer[Kafka](consumer(client, Some(group))) | |||
} yield assert(offsets.values.map(_.map(_.offset)))(forall(isSome(equalTo(nrMessages.toLong / nrPartitions)))) | |||
}, | |||
test("commits an offset with metadata") { | |||
for { | |||
// Produce messages on several partitions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is not correct. Please remove it.
@@ -125,7 +125,9 @@ private[consumer] final class Runloop private ( | |||
} yield () | |||
|
|||
private def doCommit(cmd: RunloopCommand.Commit): UIO[Unit] = { | |||
val offsets = cmd.offsets.map { case (tp, offset) => tp -> new OffsetAndMetadata(offset + 1) } | |||
val offsets = cmd.offsets.map { case (tp, offset) => | |||
tp -> new OffsetAndMetadata(offset.offset + 1, offset.metadata) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be on the safe side, we should retain the leader epoch as well (even though we don't support it at the moment).
Also, I keep thinking that we could hide this increaseOffsetByOne
operation to a class extension of OffsetAndMetadata
. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or perhaps we can move the +1
operation to where the OffsetAndMetadata is created... (I guess inside asJavaOffsetAndMetadata
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kind of disagree with you @erikvanoosten here. See #1067 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@guizmaii I have no strong opinion on this yet, I was just exploring possibilities. Sorry for that @flavienbert, this aspect is totally not relevant for this PR. If we want this change (probably not) we can do it in another PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@flavienbert Please don't forget what this thread started with:
To be on the safe side, we should retain the leader epoch as well (even though we don't support it at the moment).
yes I am referring to it:
It's true you can't read it back but you shouldn't need to read it. The purpose is to publish an offset with metadata.
In this code for sample a user can believe he can retrieve the metadata offset of the |
Okay, I am changing my mind 🙂 The What about adding methods |
Agree with you But in my case I need to set the metadata for each offset, and then the most recent offset is commit with the metadata set earlier for sample:
And what about having this model:
With this model, we could use |
Okay, I understand your use case, @guizmaii WDYT? |
I published a version with However, there is one breaking change: you have to explicitly convert @erikvanoosten let me know your point of view. |
@erikvanoosten any updates? Could be great to have the metadata commitment feature merged. |
I have asked @svroonland and @guizmaii to also take a look at this PR. Since we want to change this API in near future, and changing this many times is not nice for our users, for me source compatibility would be essential. Perhaps the other committers see this differently... |
newOffsets ++= offsets | ||
otherOffsets.offsets.foreach { case (tp, offset) => | ||
val existing = offsets.getOrElse(tp, -1L) | ||
if (existing < offset) | ||
val existing = offsets.getOrElse(tp, new OffsetAndMetadata(-1L)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably don't want to instantiate more than one OffsetAndMetadata(-1L)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure this is important. Using a constant means less memory locality (making the L2 useless) while creating a new instance prevents inter-cpu coordination.
Instead I would propose this logic is rewritten such that creating a dummy OffsetAndMetadata is not even necessary.
zio-kafka/src/main/scala/zio/kafka/consumer/internal/Runloop.scala
Outdated
Show resolved
Hide resolved
zio-kafka/src/main/scala/zio/kafka/producer/TransactionalProducer.scala
Outdated
Show resolved
Hide resolved
zio-kafka-test/src/test/scala/zio/kafka/consumer/SubscriptionsSpec.scala
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comments
I think there is still some refinement work needed to make this PR mergeable
I rebased my branch, But the CI fail for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some small suggestions. Otherwise LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one still need to be changed: https://github.com/zio/zio-kafka/pull/1067/files#r1354306081
Co-authored-by: Erik van Oosten <[email protected]>
@erikvanoosten You merge once the CI pass :) |
@flavienbert Thanks for your work :) |
@flavienbert Could you run |
Thanks for you patience @flavienbert ! This PR is now available in https://github.com/zio/zio-kafka/releases/tag/v2.6.0 🥳 |
Fixes #1066