Adds first draft at a cluster debugging page #2736
base: main
- Could do with some more examples and screenshots
- Check that the path the cluster definition was merged to is being reconciled by flux
- Check that there were no errors in the kustomization resource that applied the cluster definition
- There may be a k8s validation error like a bad namespace in the cluster definition
- The CAPI provider may not be installed
- The CAPI provider may not be installed
- The CAPI provider may not be installed resulting in a missing CRD error
How would they find this?
- Some providers like CAPD can be quite sensitive to your docker state
- make sure you don't have a lot of other old clusters running.
- Try a different cluster name, some old resources may not have been cleaned up
- No CNI may have been installed on your clusters. Make sure a ClusterResourceSet is configured to do this.
We can link to the documentation for this?
- Check the logs of the capi controllers, it may be failing to create the cluster
- Some providers like CAPD can be quite sensitive to your docker state
- make sure you don't have a lot of other old clusters running.
How about we provide instructions?
- Check the logs of the capi controllers, it may be failing to create the cluster
- Some providers like CAPD can be quite sensitive to your docker state
- make sure you don't have a lot of other old clusters running.
- Try a different cluster name, some old resources may not have been cleaned up
How can they check this?
- Check that the path the cluster definition was merged to is being reconciled by flux
- Check that there were no errors in the kustomization resource that applied the cluster definition
- There may be a k8s validation error like a bad namespace in the cluster definition
- The CAPI provider may not be installed
How about we link to the documentation that explains how to install CAPI providers?
## Cluster does not appear in the UI after merging the PR

- Check that the path the cluster definition was merged to is being reconciled by flux
Can we provide instructions here?
## Cluster does not appear in the UI after merging the PR

- Check that the path the cluster definition was merged to is being reconciled by flux
- Check that there were no errors in the kustomization resource that applied the cluster definition
Can we provide an appropriate `kubectl` command for users to run?
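A possible starting point for this thread, offered as a sketch rather than the documented procedure: Flux Kustomizations are ordinary custom resources, so plain `kubectl` can list them and show their conditions. The helper names and the `flux-system` default namespace below are assumptions for illustration; substitute whatever namespace your Flux installation uses.

```shell
# Sketch only: helper names are hypothetical, and `flux-system` is assumed
# to be the namespace Flux was installed into.

# List every Flux Kustomization in the cluster.
list_kustomizations() {
  kubectl get kustomizations.kustomize.toolkit.fluxcd.io --all-namespaces
}

# Show the conditions and events recorded for one Kustomization.
# $1: Kustomization name, $2: namespace (defaults to flux-system).
describe_kustomization() {
  kubectl describe kustomizations.kustomize.toolkit.fluxcd.io "$1" \
    --namespace "${2:-flux-system}"
}
```

Running `list_kustomizations` first and then `describe_kustomization <name>` on any entry that is not ready should surface apply errors, including Kubernetes validation failures such as a bad namespace.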
- Check that the path the cluster definition was merged to is being reconciled by flux
- Check that there were no errors in the kustomization resource that applied the cluster definition
- There may be a k8s validation error like a bad namespace in the cluster definition
- The CAPI provider may not be installed
How would they find this?
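One way a user might find this, as a hedged sketch: Cluster API core and its providers register CRDs under API groups ending in `cluster.x-k8s.io`, so listing CRDs and filtering on that suffix shows whether anything from CAPI is installed. The function name is hypothetical.

```shell
# Sketch only: the helper name is hypothetical.
# Print the CRDs whose API group ends in cluster.x-k8s.io
# (core CAPI plus infrastructure, bootstrap and control plane providers).
list_capi_crds() {
  kubectl get crds --output name | grep -F 'cluster.x-k8s.io'
}
```

An empty result, together with the non-zero exit status from `grep`, would suggest that no CAPI provider is installed on the management cluster.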
## Cluster does not transition to ready

- Check the logs of the capi controllers, it may be failing to create the cluster
Let's include an appropriate `kubectl` command to get these logs?
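A sketch of what such a command could look like, assuming the providers were installed with the `clusterctl init` defaults, where the core controller runs as the `capi-controller-manager` deployment in the `capi-system` namespace. The helper name is hypothetical, and other installation methods may use different names.

```shell
# Sketch only: defaults assume a `clusterctl init` style installation.
# $1: namespace (default capi-system)
# $2: deployment name (default capi-controller-manager)
capi_controller_logs() {
  kubectl logs --namespace "${1:-capi-system}" \
    "deployment/${2:-capi-controller-manager}" --tail=100
}
```

For the Docker provider, the equivalent call under the same assumption would be `capi_controller_logs capd-system capd-controller-manager`.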
## x509: certificate signed by unknown authority error on Applications/Sources page

- You may have an old load balancer from a previous cluster, delete it and recreate the cluster
"recreate the cluster" is a pretty harsh fix?
- Bootstrapping may have failed
- No ClusterBootstrapConfiguration may be loaded into the cluster
- Check the github repo to see if flux has made a commit to bootstrap the new cluster
- Check the logs of the pods of the bootstrap job. They are named `default/run-gitops-${cluster-name}`, flux may have failed to clone the repo.
The bootstrap jobs are created in the same namespace as the CAPI cluster.
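Taking that into account, a sketch of how the bootstrap pod logs could be fetched with the namespace as a parameter. It assumes only what the draft states, that pod names contain `run-gitops-` followed by the cluster name; the helper name and the `default` fallback are illustrative.

```shell
# Sketch only: the helper name is hypothetical.
# $1: CAPI cluster name
# $2: namespace of the CAPI cluster (falls back to default)
bootstrap_job_logs() {
  cluster="$1"
  ns="${2:-default}"
  # Find pods whose name contains run-gitops-<cluster>, then print each pod's logs.
  kubectl get pods --namespace "$ns" --output name |
    grep -F "run-gitops-$cluster" |
    while read -r pod; do
      kubectl logs --namespace "$ns" "$pod" </dev/null
    done
}
```

For example, `bootstrap_job_logs my-cluster my-namespace` would print the logs of every matching pod, which is where a failed repository clone would be expected to show up.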
Let's work through some common steps when a cluster is not behaving as expected

## Pull request did not appear in github/gitlab
Could GitHub/GitLab permissions also affect this?
@foot Do you still want to go forward with this?
Why was this change made?
To help users figure out cluster issues when someone is not around to chat to