-
Notifications
You must be signed in to change notification settings - Fork 216
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
where resources already exist but requests are generated by CREATE
#2290
Comments
Hi @10000-ki there were improvements regarding to this in newer versions, pls upgrade to the newest version |
@csviri what version?? |
pls try the newest (4.8.1), if there will be still issues, we will take a look. Note that k8s 1.23 is also a bit old too, there were some issues with event sending in that version (what should be fixed now also on fabric8 client (restarting watches) ). But if the problem persist we can take a look, and try to make a reproducer. Probably the issue will be rather in the fabric8 client, since we are reading the last resource from the informer cache. |
I'm using in-house k8s cluster in my company but |
First, I'll increase the |
AFAIK, it should not be, there were known issues, that k8s stopped sending events at some point (if remember correctly still present in 1.23) therefore the watch had to be restarted in informers periodically. But that was done in fabric8, so now it should be good. |
Is there anything else I need to check? and Create issues are still occurring |
ok, could you pls create a minimal open source reproducert project? Where we can run it on top of minikube and take a look on it, based on logs you provided it is impossible to figure out. |
Also what you can you can check if this is a problem on the newer versions of k8s. |
okay i will try |
@csviri By the way, can you tell me exactly which version had the issue related to the |
Pls take a look on these: but there might be some newer ones. Maybe @shawkins knows more. |
I'm not aware of newer definitive underlying kube issues. There were several later client changes to limit any potential issues with the api server not sending events - we'll restart the watches at most every 10 minutes. |
I don't know if it's related to the problem I raised. The problems I defined are as follows.
Is it because the operator did not receive the resource creation event from k8s? |
yes, that means it is not in the cache of the informer, so no event received regarding that resource. |
i see
From the above, it seems to have been resolved in k8s v1.21, fabric v6.5.0, let me check a little more. |
And when i raise the josdk version to 4.4.4 -> 4.5.0, This seems to have been fixed in k8s v1.25 May I know why the problem occurs in v4.5.0 after no problem in v4.4.4? Is this modification relevant? |
In what situation it exactly goes up? my bet it is rather related to this change: f8137f2 |
This change seems to have an impact. |
yes, but this should not be a problem, since the resource is ideally updated only when desired changed, thus when there needs to be an update (patch with SSA in this case) anyways. So there would be anyway a resourceVersion change. Do you have an other controller making changes on the same resources? see also: #2249 |
A change occurs in The method |
nope |
Yes, but in the next reconciliation it update should be skipped. The matcher would detect that the resources are the same but also there as other mechanism to detect, that. Can you create a reproducer pls? This is not happening in our test scenarios. |
@csviri thank you for sharing
in my case It falls into an infinite loop. can you see this code? The actual code is on the github enterprise in my company. I just brought the operator part of the actual project |
@10000-ki this seems to be a relative complex project, could you pls create a simplistic reproducer - ideally with 1 dependent resource that causes the issue (and in java + maven If possible) that would speed up this very much, thx |
yes i will try |
@csviri |
thank you @10000-ki !! |
sorry :( i found it |
anyway |
This log shows 5 updates of 5 different resources by the operator sdk. The operator then skips propogating the update event for all of them, so this appears to be expected behavior. Can you find a situation in your logs where "Propagating event for ..." appears - the surrounding logs should help determine if that resourceVersion of that resource was the result of the operator in which case there would be an event cycle. |
@shawkins |
@shawkins |
New Object
Old Object
I don't know why resourceVersion is going up. I changed it to |
closes: operator-framework#2290 Signed-off-by: Steven Hawkins <[email protected]>
resourceVersions change for any modification, not just the spec. It's the generation that typically increments with spec changes. The primary problem is the one captured here: #2249 - if the matcher fails to recognize that the resource is the same, then it adds the new annotation value, which due to a kube bug will increment the Deployment generation, and the cycle repeats. The SSA logic is not trying to be generation aware in this case, but sees the generation change as something you are requesting in the desired state - which is not correct. #2309 will address this specific case by ignoring all server managed fields in the comparison - this is similar to what the kubernetes client does in the PatchUtils class when generating a patch. @csviri not sure if that is sufficient to fully close #2249 as the doc changes would be good too.
What do you mean? |
@shawkins oh i see..
https://javaoperatorsdk.io/docs/dependent-resources#comparing-desired-and-actual-state-matching
i found this guide, and turned |
After �i upgrade to v4.8.2
Is the k8s version bug the reason What's interesting is that this issue does not occur in josdk version v4.4.4 (under v4.5.0) |
@10000-ki can you create a simple reproducer pls? ( Ideally with 1 deployment dependent ) |
@10000-ki I should have realized with the previous incarnation of this issue that you were creating the desired state on top of the existing state - are you doing something like new StatefulSetBuilder(existing)... ? That is why the desired state had the generation set. Here with the state from the other field managers populated you see a similar issue. The SSA matching logic expects you to only provide your managed desired state when creating the entry - and nothing else. @csviri is it worth guarding against this more? |
The problem is the same even if we remove the n3r.navercorp.com annotations When i change the |
Please don't just remove the annotations, construct your desired state starting with a fresh object and don't copy anything from one obtained by the server - the point of SSA is that the server will figure out which manager is managing which fields. If the problem still persists after that it would be good to determine where the matching is failing. At the debug level, you should see logs like:
That will show what the matcher thinks your desired state is vs what is on the api server. Any descrepency between the two will cause another reconciliation. We can determine from there if there is SSA / Kube issue in play (that requires a workaround in the operator sdk) or if it's a problem with the matching logic.
That is expected. This optimization is based upon the matching logic working as intended. As mentioned on #2249 it is probably fine to just default this to false and allow users to opt in if they want greater performance. |
Bug Report
at io.javaoperatorsdk.operator.processing.dependent.AbstractDependentResource.handleCreate(AbstractDependentResource.java:114)
What did you do?
Creating PDB with DependentResource
However, even though resources are already created
When running additional 'reconcile' loop
The request was created as
Create
rather thanUpdate
If there is a secondary resource, shouldn't the request go to
update
right?Why is it created as
create
?409 errors continue to occur, which is a problem
What did you expect to see?
What did you see instead? Under which circumstances?
no
Environment
Kubernetes cluster type:
Kubernetes
$ Mention java-operator-sdk version from pom.xml file
v4.4.4
$ java -version
jdk21
$ kubectl version
v1.23.15
Possible Solution
no
Additional context
no
The text was updated successfully, but these errors were encountered: