Adds first draft at a cluster debugging page #2736

Draft. Wants to merge 1 commit into base: main.

Conversation

Contributor

@foot foot commented Sep 14, 2022

  • Could do with some more examples and screenshots and flowcharts

Why was this change made?

To help users figure out cluster issues when someone is not around to chat to

- Could do with some more examples and screenshots
@foot foot changed the title Adds first draft a cluster debugging page Adds first draft at a cluster debugging page Sep 14, 2022
- Check that the path the cluster definition was merged to is being reconciled by flux
- Check that there were no errors in the kustomization resource that applied the cluster definition
- There may be a k8s validation error like a bad namespace in the cluster definition
- The CAPI provider may not be installed
Contributor Author

Suggested change
- The CAPI provider may not be installed
- The CAPI provider may not be installed resulting in a missing CRD error

Contributor

How would they find this?

- Some providers like CAPD can be quite sensitive to your docker state
- make sure you don't have a lot of other old clusters running.
- Try a different cluster name; some old resources may not have been cleaned up
- No CNI may have been installed on your clusters. Make sure a ClusterResourceSet is configured to do this.
Contributor

We can link to the documentation for this?


- Check the logs of the CAPI controllers; they may be failing to create the cluster
- Some providers like CAPD can be quite sensitive to your docker state
- make sure you don't have a lot of other old clusters running.
Contributor

How about we provide instructions?

- Check the logs of the CAPI controllers; they may be failing to create the cluster
- Some providers like CAPD can be quite sensitive to your docker state
- make sure you don't have a lot of other old clusters running.
- Try a different cluster name; some old resources may not have been cleaned up
Contributor

How can they check this?
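
One possible way to check for leftovers with CAPD is to list Docker containers whose names contain the cluster name. This is only a sketch: the helper name `capd_containers` is hypothetical, and it assumes CAPD names its containers after the cluster.

```shell
# List all containers, running or stopped, whose name contains the given string.
# $1: cluster name. Assumption: CAPD containers include the cluster name.
capd_containers() {
  docker ps --all --filter "name=$1" --format '{{.Names}}: {{.Status}}'
}
```

Running `capd_containers my-cluster` before recreating a cluster shows whether containers from an earlier attempt are still around.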

- Check that the path the cluster definition was merged to is being reconciled by flux
- Check that there were no errors in the kustomization resource that applied the cluster definition
- There may be a k8s validation error like a bad namespace in the cluster definition
- The CAPI provider may not be installed
Contributor

How about we link to the documentation that explains how to install CAPI providers?


## Cluster does not appear in the UI after merging the PR

- Check that the path the cluster definition was merged to is being reconciled by flux
Contributor

Can we provide instructions here?

## Cluster does not appear in the UI after merging the PR

- Check that the path the cluster definition was merged to is being reconciled by flux
- Check that there were no errors in the kustomization resource that applied the cluster definition
Contributor

Can we provide an appropriate kubectl command for users to run?
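
As a starting point, something like the following could cover both checks: whether a Kustomization is reconciling the path, and whether it reported errors. The function names are hypothetical, and `flux-system` is an assumed default namespace; adjust it to wherever the Kustomization lives.

```shell
# List every Flux Kustomization in all namespaces with its status.
# Assumes the Flux CRDs are installed on the management cluster.
list_kustomizations() {
  kubectl get kustomizations.kustomize.toolkit.fluxcd.io --all-namespaces
}

# Show conditions and events for one Kustomization.
# $1: Kustomization name; $2: namespace (flux-system is an assumed default).
describe_kustomization() {
  kubectl describe kustomizations.kustomize.toolkit.fluxcd.io "$1" \
    --namespace "${2:-flux-system}"
}
```

If the `flux` CLI is available, `flux get kustomizations --all-namespaces` reports similar information.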

- Check that the path the cluster definition was merged to is being reconciled by flux
- Check that there were no errors in the kustomization resource that applied the cluster definition
- There may be a k8s validation error like a bad namespace in the cluster definition
- The CAPI provider may not be installed
Contributor

How would they find this?
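
One way users might find this is to look for the Cluster API CRDs directly. A minimal sketch, with a hypothetical helper name; it relies on CAPI and its providers registering CRDs under the `cluster.x-k8s.io` API group.

```shell
# Print the names of Cluster API CRDs registered on the cluster.
# Prints nothing and returns non-zero if none are found.
list_capi_crds() {
  kubectl get crds --output name | grep 'cluster\.x-k8s\.io'
}
```

An empty result suggests that no CAPI provider is installed, which would explain a missing CRD error on the Kustomization.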


## Cluster does not transition to ready

- Check the logs of the CAPI controllers; they may be failing to create the cluster
Contributor

Let's include an appropriate kubectl command to get these logs?
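
A command along these lines could work. The helper name is hypothetical, and the defaults assume the core controller runs as `capi-controller-manager` in the `capi-system` namespace, as it does after a standard `clusterctl init`; other installs may differ.

```shell
# Tail recent logs from a Cluster API controller.
# $1: namespace; $2: deployment name. Both defaults are assumptions.
capi_controller_logs() {
  kubectl logs --namespace "${1:-capi-system}" \
    "deployment/${2:-capi-controller-manager}" --tail=100
}
```

For an infrastructure provider, pass its namespace and deployment instead, for example `capi_controller_logs capd-system capd-controller-manager` for CAPD.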


## x509: certificate signed by unknown authority error on Applications/Sources page

- You may have an old load balancer from a previous cluster; delete it and recreate the cluster
Contributor

"recreate the cluster" is a pretty harsh fix?

- Bootstrapping may have failed
- No ClusterBootstrapConfiguration may be loaded into the cluster
- Check the GitHub repo to see if flux has made a commit to bootstrap the new cluster
- Check the logs of the pods of the bootstrap job. They are named `default/run-gitops-${cluster-name}`; flux may have failed to clone the repo.

The bootstrap jobs are created in the same namespace as the CAPI cluster.
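
Taking the namespace as a parameter would cover this. A rough sketch, with a hypothetical helper name; it assumes the job is named `run-gitops-${cluster-name}` as in the draft, and falls back to `default` only when no namespace is given.

```shell
# Print logs from the bootstrap job of a cluster.
# $1: cluster name; $2: namespace of the CAPI cluster (default is assumed).
bootstrap_job_logs() {
  kubectl logs --namespace "${2:-default}" "job/run-gitops-$1"
}
```

For a cluster `dev` created in namespace `clusters`, that would be `bootstrap_job_logs dev clusters`.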


Let's work through some common steps when a cluster is not behaving as expected.

## Pull request did not appear in GitHub/GitLab
Contributor

Could GitHub/GitLab permissions also affect this?

@lasomethingsomething
Contributor

@foot Do you still want to go forward with this?

5 participants