Add proposal for single step multi version downgrade #136

Open
wants to merge 4 commits into
base: main

MichaelMorrisEst
Contributor

# Support single-step multi-version downgrade for ZooKeeper-based clusters

This proposal seeks to introduce support for multi-version downgrade of Strimzi in a single step where possible.
The proposal only relates to ZooKeeper-based clusters.
Member

As I commented on the strimzi/strimzi-kafka-operator#10801 issue, this does not seem to make sense anymore:

  • At best, this would land in Strimzi 0.45 with support for Kafka 3.8 and 3.9.
  • These are the last versions with ZooKeeper support. ZooKeeper will be dropped in Strimzi 0.46 / Kafka 4.0
  • So there will never be a downgrade of a ZooKeeper-based cluster that would utilize this feature.

Contributor Author

I agree, this doesn't make sense for ZooKeeper at this stage. I've updated the proposal to address the same issue for KRaft-based clusters instead.

Member

@scholzj scholzj left a comment

Thanks for updating this proposal to KRaft. I think this would be a good feature to have and in fact I had it somewhere deep on my list. But I think the proposal should be extended to cover:

  • More details of the implementation -> I assume you will need to somehow carry the information that we are downgrading from an unknown version in the code etc. This should be explained in more detail.
  • System test strategy -> Describe how it will be tested etc. Saying that we do not expect any system tests is probably an option ... but even that should be covered.
  • The risks that are part of this proposal and how they will be documented
    • While most Kafka versions do not expect any special treatment when upgrading / downgrading, it is possible that something might be needed for some versions. For an upgrade, this can simply be implemented in the new operator version, but for a downgrade like this it will not be supported by the old operator. As such, it is unclear how such a downgrade will be supported, which versions will support it, how deep it will be possible to go etc.
    • Similarly, there are many issues with the operator versions as well. For example, in one of the future Strimzi versions where the dynamic KRaft quorum configuration will be supported, the new operator will know how to switch between the static quorum configuration and the dynamic quorum configuration, and possibly the other way around. But the old operator version will have no knowledge of this and will not be able to do it, so the downgrade would simply break.
    • I do not think ^^^ these are blockers. But these risks need to be covered by the proposal, and it needs to describe how we will protect the users from them but also how the expectations will be set for the users. E.g. by explaining that this will be something to be used at your own risk and that you are expected to test it for your combination of versions / configuration etc. This should likely also explain what exactly we test and what is just some code that does not block the downgrade, but offers no guarantees that it would work.

PS: If you write the proposal with a sentence-per-line, it will make it much easier to comment on it during the reviews than with paragraph-per-line. No need to change the existing proposal ... just something that might be helpful for the next one 😉.
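
One way to picture the expectation-setting asked for in this review is a check that separates downgrade paths covered by tests from paths that are merely not blocked. The following is a minimal sketch under stated assumptions, not part of the proposal or of the Strimzi code base: the names `TESTED_PAIRS`, `classify_downgrade`, and `downgrade_warning` are hypothetical, and the version pair is only an illustrative value.

```python
from enum import Enum
from typing import Optional


class DowngradeSupport(Enum):
    TESTED = "tested"      # pair is covered by tests
    UNTESTED = "untested"  # not blocked, but no guarantee that it works


# Hypothetical set of (from, to) Kafka version pairs covered by tests.
# The single entry is illustrative only.
TESTED_PAIRS = {("3.9.0", "3.8.0")}


def classify_downgrade(from_version: str, to_version: str) -> DowngradeSupport:
    """Report whether a downgrade path is tested or merely not blocked."""
    if (from_version, to_version) in TESTED_PAIRS:
        return DowngradeSupport.TESTED
    return DowngradeSupport.UNTESTED


def downgrade_warning(from_version: str, to_version: str) -> Optional[str]:
    """Return a warning for untested paths, or None for tested ones."""
    if classify_downgrade(from_version, to_version) is DowngradeSupport.TESTED:
        return None
    return (
        f"Downgrade from {from_version} to {to_version} is not tested; "
        "use at your own risk and verify it for your configuration."
    )
```

A warning like this only sets expectations. It does not address the case where an older operator lacks the logic needed to complete the downgrade.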


During downgrade, the 'to' version of the operator attempts to read the Kafka information for the 'from' Kafka version. When the 'from' Kafka version is not supported by the 'to' Strimzi version, an error is thrown because the version is unknown. Information such as the metadata version, inter-broker protocol version, and log message format version is read from kafka-versions.yaml when a KafkaVersion object is created to represent the 'from' Kafka version, and the error is raised because no information can be found for the unknown version.

However, while this information is important to know for the 'to' version in upgrade and downgrade, and for the 'from' version in upgrade, it is not important for the 'from' version in the downgrade as it is not subsequently used.
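
The behaviour described in this excerpt can be illustrated with a small sketch. It is written in Python for brevity and is not the operator's actual Java code: `SUPPORTED`, `lookup`, `lookup_from_version`, and `is_downgrade` are hypothetical names, the table stands in for the data loaded from kafka-versions.yaml, and its entries are illustrative values. The strict lookup fails for an unknown version, while the tolerant variant shows one possible way to let an unknown 'from' version through.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


class UnknownKafkaVersionError(Exception):
    """Raised when a version is missing from the supported-versions table."""


@dataclass(frozen=True)
class KafkaVersionInfo:
    version: str
    metadata_version: str


# Hypothetical stand-in for the contents of kafka-versions.yaml (illustrative values).
SUPPORTED = {
    "3.8.0": KafkaVersionInfo("3.8.0", "3.8-IV0"),
    "3.9.0": KafkaVersionInfo("3.9.0", "3.9-IV0"),
}


def lookup(version: str) -> KafkaVersionInfo:
    """Strict lookup: fails for versions this operator does not know."""
    try:
        return SUPPORTED[version]
    except KeyError:
        raise UnknownKafkaVersionError(version) from None


def lookup_from_version(version: str) -> Optional[KafkaVersionInfo]:
    """Tolerant lookup for the 'from' side: unknown versions yield None."""
    return SUPPORTED.get(version)


def parse(version: str) -> Tuple[int, ...]:
    """Turn '3.9.0' into (3, 9, 0); assumes purely numeric components."""
    return tuple(int(part) for part in version.split("."))


def is_downgrade(from_version: str, to_version: str) -> bool:
    """Compare version numbers only, so no table entry is needed for 'from'."""
    return parse(from_version) > parse(to_version)
```

In this sketch the direction of the change is decided from the version strings alone, so the 'to' operator never needs table data for the 'from' version. Whether that data is truly never needed is the point disputed in the review comment that follows.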
Member

I don't agree that it is not important. While there are currently no explicit steps that need to be performed during downgrade, there might be in the future. So I think the statement that it is never needed is not true. (In the distant past, I think there were, for example, some breaking ZooKeeper changes that might have required special steps at downgrade.)

Contributor

@PaulRMellor PaulRMellor left a comment

Thanks. I left a few comments on wording for clarity. I can return when the proposal covers the implementation detail and the handling of differences in functionality mentioned by Jakub.

087-single-step-multi-version-downgrade.md (3 review threads, outdated and resolved)