Add progress status for partition rebalances #140

kyguy · 2024-11-26T07:09:17Z

This proposal introduces a new feature to monitor the progress of an ongoing partition rebalance executed by a Strimzi-managed Cruise Control instance via a `KafkaRebalance` custom resource. Implementing this proposal should help to address strimzi/strimzi-kafka-operator#10278.

Signed-off-by: Kyle Liberti <[email protected]>

088-rebalance-progress-status

tomncooper

Had a first pass. A lot of my comments are optional style/grammar/formatting suggestions, so feel free to ignore them.

My main comments are:

- @scholzj makes a very good point about avoiding infinite reconciliation after a status update. You will need to solve that.
- I think we should include a minimum estimated time for optimization proposals. Even if it is only a ballpark figure, it is a very useful guide. But let's see what others think.

fvaleri

Thanks @kyguy, this seems to be useful.

I left a few comments for your consideration. Please also fix the formatting.

katheris

Generally the proposal looks good to me. I agree with the comments from others and just had one comment about the field name `percentageComplete`, plus a suggestion for an additional field we could include.

kyguy · 2025-01-29T00:28:24Z

Thanks again to everyone for the other rounds of review. We are getting pretty close to a proposal that everyone feels comfortable with. I have gone through the threads and marked those addressed as "resolved" so we can focus on the open threads and the most recent discussions. If you feel that any threads marked as "resolved" have not been addressed thoroughly, feel free to mark them as "unresolved" and I will take another look.

Right now, what's left:

- A review of the updated text in the motivation section.
- A review of the updated notation of the formulas.
- A review of the updated progress fields we plan to display in each `KafkaRebalance` state.

ppatierno

LGTM. There are a couple of nits but I am fine with the proposal overall, don't need another pass from my side. Thanks Kyle, nice work!

tinaselenge

Thanks @kyguy. The proposal looks good to me. I left just one question to clarify.

tomncooper

Just a couple of small nits and some clarifying comments that need adding IMHO. Then happy to approve.

tomncooper · 2025-02-14T12:01:46Z

088-rebalance-progress-status.md

+The rebalance is complete so we hardcode the value to `0`
+This emphasizes that the rebalance is complete and helps clear up ambiguity surrounding what the `Ready` state means in the `KafkaRebalance` resource.
+
+### Field: `completedByteMovementPercentage`


The "byte" part of this smells wrong, why not "data"?

kyguy · 2025-02-14

It was raised in an earlier comment that "data" movement could be misinterpreted as "partition movement" instead of "byte movement". Naming this field with "byte" removes ambiguity surrounding what is being measured.

tomncooper · 2025-02-18

Not sure it would be easy to mix up data and partitions personally, but ok. I still think it would be better to have `completedDataMovementPercentage` and `completedPartitionMovementPercentage`; then you can't mix them up.

kyguy · 2025-02-18

> I still think it would be better to have completedDataMovementPercentage and completedPartitionMovementPercentage then you can't mix them up.

I was worried that the information from these two fields would be too similar, so I was hoping to supply only one of them. However, I want the field name(s) to be as clear as possible to everyone. I am open to including both; it was something that @katheris raised in an earlier review too.

I would be interested in what @scholzj thinks of this.

tomncooper · 2025-02-14T12:02:23Z

088-rebalance-progress-status.md

+$$
+
+**Notes:**
+- $DMP$: The percentage of byte data that has been moved as a rounded down integer in the range [0-100], the value of the `completedByteMovementPercentage` field.


"byte data" sounds weird? You can probably just drop it and use "data" instead?

kyguy · 2025-02-14

It does sound a little weird. The addition of "byte" here is related to the comment above, to clear up any ambiguity surrounding what kind of data is being moved.

What if we dropped "data" and just used "bytes" here instead?
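
As a rough illustration of the note quoted in this thread, the `completedByteMovementPercentage` value ($DMP$) can be sketched as a floor of the moved-to-total byte ratio. This is a minimal sketch, not the operator's implementation: the function name and the inputs `finished_data_movement` and `total_data_to_move` are hypothetical, and treating an empty movement as 100% is an assumption of this sketch rather than something the proposal states.

```python
def completed_byte_movement_percentage(finished_data_movement: int,
                                       total_data_to_move: int) -> int:
    """Return the share of bytes moved as a rounded-down integer in [0, 100]."""
    if total_data_to_move <= 0:
        # Assumption of this sketch: nothing to move counts as fully complete.
        return 100
    # Integer floor division rounds down, matching the "rounded down integer" wording.
    percentage = (100 * finished_data_movement) // total_data_to_move
    # Clamp defensively so the result always stays within [0, 100].
    return max(0, min(100, percentage))


# Hypothetical example: 999 of 1000 units moved.
print(completed_byte_movement_percentage(999, 1000))
```

Rounding down means the field only reports 100 once every byte has actually been moved, which fits the intent of hardcoding the final value in the `Ready` state.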

fvaleri

Thanks @kyguy for addressing my comments and refining the proposal.

I left a few more to consider, but the overall approach looks good.

fvaleri · 2025-02-14T14:04:48Z

088-rebalance-progress-status.md

+Since the progress information is constant, we can safely add it to the existing `ConfigMap` maintained for and tied to the `KafkaRebalance` resource.
+This keeps `KafkaRebalance` information organized in one place, simplifies the proposal implementation, and has insignificant impact on the storage of the `ConfigMap`.


Are we sure that mixing unrelated information in the same CM is actually a good idea? What's the complication of having a dedicated progress CM?

kyguy · 2025-02-14

> Are we sure that mixing unrelated information in the same CM is actually a good idea?

Is the load and progress information really unrelated? Couldn't we just as easily think of the information as being related to a specific rebalance?

> What's the complication of having a dedicated progress CM?

We were debating whether or not to have a dedicated progress CM. One of the arguments against a dedicated CM was that it would require extra API calls and code wrangling in the `KafkaRebalanceAssemblyOperator` class, all while we already maintain a `ConfigMap` for a `KafkaRebalance` resource with plenty of space for the progress information. We would also need to change the name of the existing `ConfigMap` to differentiate it from a new one, since the existing name matches the name of the `KafkaRebalance` resource.

ppatierno · 2025-02-15

Yeah, having just one simplifies the implementation as well. At the same time, it gives the user one place to look. Taking into account the amount of information we are adding, there did not seem to be any advantage in having two CMs.

fvaleri · 2025-02-16

Thanks for the clarifications. I can live with the single CM, but then I would prefer a single status field with a meaningful name, deprecating the old one. Have you considered this alternative design?

kyguy · 2025-02-17

> I can live with the single CM, but then I would prefer a single status field with a meaningful name, deprecating the old one. Have you considered this alternative design?

As in deprecating `.status.optimizationResult.afterBeforeLoadConfigMap` in the `KafkaRebalance` resource in favor of some new field like `.status.rebalanceConfigMap`, right?

This is a fair point.

API changes like this are definitely something to keep in mind before we move to our first major version of Strimzi. This comment, along with the comment below, is making me think more about the field name `progress.rebalanceProgressConfigMap`. Maybe it would be more prudent and future-proof to name the field something more generic like `status.rebalanceConfigMap`, especially if we plan on keeping the information from a rebalance (broker load and progress information) consolidated in a single `ConfigMap`. I guess it comes down to whether or not we plan on adding any rebalance information in the future that would require more space than a single `ConfigMap` could handle.

If users ever wanted the additional verbose output from the executor state, we would definitely need the space of another `ConfigMap` and a separate field to point to that `ConfigMap` in the status. For this reason I am still leaning towards keeping separate, distinct fields, but I am open to having a single field.

What do you think @fvaleri? Do you still think it would be better to have a single field?

Interested in what @ppatierno and @tomncooper think about this too.

fvaleri · 2025-02-17

Thanks for considering this change. As always, naming things is hard.

The following schema should give more flexibility:

    # this proposal
    status:
      loadAndProgressConfigMaps:
        - my-rebalance

    # glimpse into a possible future
    status:
      loadAndProgressConfigMaps:
        - my-rebalance-load
        - my-rebalance-progress
        - my-rebalance-progress-verbose

tomncooper · 2025-02-18

With this proposal we have 2 fields (load and progress) referencing the same CM. If that CM ever becomes too large, then we could have each reference its own CM. If either of them becomes too big, we probably need a whole different way of communicating that information to the user.

So I think it is better to stick with the current plan than to deprecate an existing field and add 2 more.

ppatierno · 2025-02-18

Yeah, I agree with Tom and subscribe to what he said. The proposal from Fede looks to be more "complicated" imho.

kyguy · 2025-02-18

Unless there are any objections, let's stick with the current plan. We should have enough flexibility for now and the future, and we can revisit if that changes.

fvaleri · 2025-02-14T14:36:09Z

088-rebalance-progress-status.md

+  progress: [1]
+    rebalanceProgressConfigMap: my-rebalance [2]


What's the value of having this nested structure compared to just .status.rebalanceProgressConfigMap?

kyguy · 2025-02-14

To further organize and distinguish the progress information from the other status fields.
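
To make the trade-off concrete, here is a small sketch of how a client might resolve the `ConfigMap` name from the nested layout shown in the diff above. The status is modelled as a plain Python dict, the helper name is hypothetical, and `my-rebalance` is the example value from the diff; none of this is taken from the operator code.

```python
from typing import Any, Dict, Optional


def progress_config_map_name(status: Dict[str, Any]) -> Optional[str]:
    """Look up .status.progress.rebalanceProgressConfigMap, or None if absent."""
    # The nested "progress" section may be missing or empty before a rebalance starts.
    progress = status.get("progress") or {}
    return progress.get("rebalanceProgressConfigMap")


# Example status mirroring the nested layout from the proposal diff.
status = {"progress": {"rebalanceProgressConfigMap": "my-rebalance"}}
print(progress_config_map_name(status))
```

The cost of the nesting is one extra lookup level for clients; the benefit is that any further progress-related fields can be grouped under the same `progress` section.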

fvaleri

LGTM. Left a suggestion for improving the status part.

tomncooper

LGTM @kyguy. My only comment is that completedByteMovement sounds weird and using data would be better, but that is not a blocker for me.

PaulRMellor

Sounds like a very useful addition. I left a few minor comments as I read through.

PaulRMellor · 2025-02-19T11:54:30Z

088-rebalance-progress-status.md

+
+[3] The “non-verbose” JSON payload from [/kafkacruisecontrol/state?substates=executor](https://github.com/linkedin/cruise-control/wiki/REST-APIs#query-the-state-of-cruise-control) endpoint.
+
+[4] The broker load from the optimization proposal as a JSON string that already maintained in the `ConfigMap`.


Suggested change

-[4] The broker load from the optimization proposal as a JSON string that already maintained in the `ConfigMap`.
+[4] The broker load from the optimization proposal as a JSON string that is already maintained in the `ConfigMap`.

PaulRMellor · 2025-02-19

I guess this is available for all states, but should we describe this in the table?

kyguy · 2025-02-19

The broker load information is available for all these states, but I don't think we should include it in the table, since it isn't progress information that we are adding as part of this proposal. That being said, I think we should add a table like this, including the broker load information, to the documentation as part of implementing this proposal.
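
For readers following along, the sketch below shows one way a client could read such a combined `ConfigMap`. It is a hedged illustration only: the entry names `executorState.json` and `brokerLoad.json` and the two progress field names appear in this PR, but storing the progress fields as top-level `ConfigMap` keys, the helper name, and every value in the example are assumptions.

```python
import json
from typing import Any, Dict


def summarize_rebalance_config_map(data: Dict[str, str]) -> Dict[str, Any]:
    """Collect progress fields and decode JSON entries from ConfigMap data."""
    summary: Dict[str, Any] = {}
    # ConfigMap values are always strings, so numeric fields need converting.
    for key in ("estimatedTimeToCompletionInMinutes", "completedByteMovementPercentage"):
        if key in data:
            summary[key] = int(data[key])
    # JSON payloads are stored as strings and decoded here.
    for key in ("executorState.json", "brokerLoad.json"):
        if key in data:
            summary[key] = json.loads(data[key])
    return summary


# Hypothetical ConfigMap data; the JSON bodies are placeholders, not real payloads.
example = {
    "estimatedTimeToCompletionInMinutes": "5",
    "completedByteMovementPercentage": "80",
    "executorState.json": '{"state": "placeholder"}',
    "brokerLoad.json": '{"brokers": []}',
}
print(summarize_rebalance_config_map(example))
```

Entries that are absent are simply skipped, which matches the idea that different `KafkaRebalance` states expose different progress fields.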

tinaselenge

Thanks @kyguy for the proposal.

Add progress status for partition rebalances

169723b

Signed-off-by: Kyle Liberti <[email protected]>

kyguy requested review from tomncooper, scholzj, ppatierno and fvaleri November 26, 2024 07:10

scholzj reviewed Nov 26, 2024

tomncooper suggested changes Nov 26, 2024

fvaleri reviewed Dec 2, 2024

katheris reviewed Dec 2, 2024

ppatierno reviewed Dec 9, 2024

kyguy force-pushed the kr-exec-progress branch 7 times, most recently from d294906 to ff9df7e Compare December 11, 2024 00:07

Addressing feedback related to formatting/grammer

0f58cbb

Signed-off-by: Kyle Liberti <[email protected]>

kyguy force-pushed the kr-exec-progress branch from ff9df7e to 0f58cbb Compare December 11, 2024 00:58

Addressing feedback - js, ks, pp

d1433d8

Signed-off-by: Kyle Liberti <[email protected]>

kyguy force-pushed the kr-exec-progress branch 8 times, most recently from a310c4d to b55824e Compare December 18, 2024 02:10

Update wording and formatting

bc7f1ed

Signed-off-by: Kyle Liberti <[email protected]>

kyguy force-pushed the kr-exec-progress branch from b55824e to bc7f1ed Compare December 18, 2024 02:15

kyguy added 2 commits January 27, 2025 14:54

Addressing feedback - js

a23fea6

Signed-off-by: Kyle Liberti <[email protected]>

Addressing feedback - pp, js

abccfa7

Signed-off-by: Kyle Liberti <[email protected]>

ppatierno reviewed Jan 29, 2025

Addressing feedback - pp

2880fe7

Signed-off-by: Kyle Liberti <[email protected]>

ppatierno reviewed Jan 29, 2025

Addressing feedback - pp

f4ff7a9

Signed-off-by: Kyle Liberti <[email protected]>

kyguy requested review from tomncooper and ppatierno February 11, 2025 19:50

ppatierno mentioned this pull request Feb 12, 2025

Replacing brokerLoad.json with brokerLoad within the rebalancing ConfigMap strimzi/strimzi-kafka-operator#11133

Closed

ppatierno approved these changes Feb 12, 2025

kyguy added 2 commits February 12, 2025 09:01

Addressing feedback - pp

7b6d20a

Signed-off-by: Kyle Liberti <[email protected]>

Fix time units in estimatedTimeToCompletionInMinutes formula

e713ac5

Signed-off-by: Kyle Liberti <[email protected]>

ppatierno reviewed Feb 13, 2025

Update executorState -> executorState.json

32857a3

Signed-off-by: Kyle Liberti <[email protected]>

tinaselenge reviewed Feb 14, 2025

tomncooper reviewed Feb 14, 2025

fvaleri reviewed Feb 14, 2025

kyguy requested review from tinaselenge, tomncooper and fvaleri February 14, 2025 19:37

Address comments - ts, tc

a3ea194

Signed-off-by: Kyle Liberti <[email protected]>

kyguy force-pushed the kr-exec-progress branch from f971608 to a3ea194 Compare February 14, 2025 21:56

im-konge approved these changes Feb 17, 2025

fvaleri approved these changes Feb 17, 2025

Switch rate time unit to seconds

6909ab5

Signed-off-by: Kyle Liberti <[email protected]>

tomncooper approved these changes Feb 18, 2025

PaulRMellor approved these changes Feb 19, 2025

tinaselenge approved these changes Feb 19, 2025

Addressing feedback - pm

28e5b34

Signed-off-by: Kyle Liberti <[email protected]>
