Provide automated deployment of Azure resources used in end-to-end tests #77

Closed
tomkerkhove opened this issue Oct 3, 2022 · 30 comments

@tomkerkhove
Member

Provide automated deployment of Azure resources used in end-to-end tests with Bicep, so that I'm no longer the bottleneck (or at least less of one).

This is needed because our Azure subscription is not accessible to everyone; adding a resource should be just a PR away.

@tomkerkhove
Member Author

For Promitor I have an Azure infrastructure repo where contributors can open a PR for new resources required for automated testing; it is deployed automatically with GitHub Actions.

@JorTurFer
Member

Is this for all infra, including AKS, or only for resources like Event Hubs, Service Bus, etc.?

@tomkerkhove
Member Author

tomkerkhove commented Oct 3, 2022

That's up to us to decide, we can just start with the Azure resources without the cluster if you prefer

@JorTurFer
Member

My only concern is the time it takes to create/delete an AKS cluster; if we need to do that, the tests will take even longer.

@JorTurFer
Member

I'd start with upstreams

@JorTurFer
Member

JorTurFer commented Oct 3, 2022

And maybe we should take a look at Crossplane; we could use it for more cloud providers if it works with the infra we use.
Another point in favor of Crossplane is that we can spawn the infra as test code :)

@JorTurFer
Member

sadly, they don't support queues and other resources we need for the moment 😢

@tomkerkhove
Member Author

This would not run on every test run, only when there are changes to the infrastructure definition.

@JorTurFer
Member

Aaaah, your idea is to have the infra there all the time and update it on the fly only when needed. I thought you meant deploying/destroying it during the tests.

@tomkerkhove
Member Author

Yes, correct.

Doing the latter is more intensive and harder to get right. I think we can avoid that as we don't have the capacity for it.

@JorTurFer
Member

For this scenario you were right; we can use Terraform and manage all the infra from the same place.
It requires storing the tfstate in some storage, but since these are stable environments this won't be a problem.

I can start on this during the week if we agree to use Terraform for everything (I don't know Bicep, sorry xD).

I'd create a repo to manage the infra, something like 'keda-infrastructure' or just 'infrastructure'.

Wdyt?

@tomkerkhove
Member Author

tomkerkhove commented Oct 3, 2022

Bicep works fine but if you want to use this cross cloud then Terraform is OK.
If it's just Azure, just use Bicep IMO.

I'd introduce kedacore/testing-infrastructure for this. I'm happy to help if it's Bicep but haven't used Terraform before so would have to wait until the initial file is there unfortunately.

@JorTurFer
Member

I have expertise with terraform, so I can create the scaffolding and the initial infrastructure, that's not a problem.

I'm thinking about what infra we have, and IDK if we need to cover AWS now, because we create that infra during the e2e tests and delete it afterwards, so maybe we can go with Bicep. But GCP has infra that I need to review to check whether we should cover it.

I suggested Terraform because it's a single language to manage all the infra, so it's easier for people who don't know a cloud-provider-specific language. There is also a bot for Terraform that we could use to improve the experience by posting the plan outputs and other information: https://github.com/runatlantis/atlantis

@tomkerkhove
Member Author

Let's use Terraform in that case, we don't want to do a migration later on

@JorTurFer
Member

I have checked and we can update secrets through the Secrets API, so we can take the Terraform outputs and update the secrets directly in the org. That way they are managed automatically on every Terraform execution.
OFC this is a draft; we need to go deeper, but it's promising and could improve new infra creation.
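
As a rough illustration of the idea, the sketch below turns the JSON printed by `terraform output -json` into name/value pairs that a later step could push to the org secrets. The function name, the sample output names, and the upper-casing rule are assumptions made for illustration. The upload itself through the Secrets API (which expects each value to be encrypted with the org's public key) is left out.

```python
import json


def outputs_to_secrets(terraform_output_json: str) -> dict[str, str]:
    """Map `terraform output -json` text to {SECRET_NAME: value}.

    Each top-level key in that JSON is an output name whose object
    carries a "value" field. Only string values are kept here.
    """
    outputs = json.loads(terraform_output_json)
    secrets = {}
    for name, body in outputs.items():
        value = body.get("value")
        if isinstance(value, str):
            # Hypothetical naming rule: upper-case the output name.
            secrets[name.upper()] = value
    return secrets


if __name__ == "__main__":
    # Made-up sample in the shape printed by `terraform output -json`.
    sample = json.dumps({
        "storage_key": {"sensitive": True, "type": "string", "value": "abc"},
        "resource_ids": {"sensitive": False, "type": ["list", "string"], "value": ["x"]},
    })
    print(sorted(outputs_to_secrets(sample)))
```

Non-string outputs (lists, maps) are skipped in this sketch because a secret holds a single string; a real pipeline would have to decide how to serialize them.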

@tomkerkhove
Member Author

In theory, it's just going to spin up new resources and a manual action for secrets is fine IMO; at least for starters.

I don't want that process to mess up our GH secrets :)

@JorTurFer
Member

JorTurFer commented Oct 3, 2022

The problem is that the values have to be taken from somewhere before they can be stored as secrets. If we fetch them from the cloud provider, we still need access to the Azure subscription, so the blocker would still be there.
I won't publish secrets as output in GitHub, so the options are to push them somewhere all of us can access, like a vault, or to push them directly to GH (or any vault) and pull them from there in the workflow.

I have checked and there is an Azure Key Vault integration for GH Actions, so we could put all the secrets from Terraform directly in the vault and read them in the workflow, but in that case I prefer to use GH Secrets.

@JorTurFer
Member

BTW, we can name them TF_CURRENT_ENV_NAME so we know which secrets are self-generated and which were created manually. Once all of them are working, we can just change which secret the workflows use, without touching the current secrets.
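
A minimal sketch of that naming convention, assuming the generated secret is simply the existing name with a `TF_` prefix. The helper names and the example secret `AZURE_SP_KEY` are hypothetical.

```python
GENERATED_PREFIX = "TF_"  # prefix proposed in the comment above


def generated_name(current_name: str) -> str:
    """Name of the Terraform-managed twin of an existing secret."""
    return GENERATED_PREFIX + current_name


def split_by_origin(secret_names):
    """Partition secret names into (self-generated, manually created)."""
    generated = sorted(n for n in secret_names if n.startswith(GENERATED_PREFIX))
    manual = sorted(n for n in secret_names if not n.startswith(GENERATED_PREFIX))
    return generated, manual


if __name__ == "__main__":
    names = ["AZURE_SP_KEY", generated_name("AZURE_SP_KEY")]
    print(split_by_origin(names))
```

The prefix check only works if no manually created secret already starts with `TF_`, which would need to be verified against the existing secrets first.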

@jeffhollan
Member

FYI - opened a ticket with CNCF for access (owner) to an Azure Subscription so we could run these kinds of automated workloads where we want. My thinking is we could start small (just spinning up Azure Event Hubs / E2E tests) and start to move more of the workloads over time as we want: https://cncfservicedesk.atlassian.net/servicedesk/customer/portal/1/CNCFSD-1422

@JorTurFer
Member

You are right. For the moment, I'll start creating the scaffolding with a simple resource but with all the elements ready (Terraform code/modules with a backend, secret management, docs, etc.), and then we can move the services one by one.

To start I'll use my MVP subscription, and once the scaffolding is ready we can change the SP and use another account for this (MSFT or CNCF account, not to worry).

@tomkerkhove
Member Author

@jeffhollan I can already tell you that they will not be able to help you :) I already looked into this.

Please don't introduce yet-another subscription @JorTurFer and just use the existing one :)

@JorTurFer
Member

Okay,
I meant only during the scaffolding; once things are working I wanted to swap from my subscription to the current one (because I have access to the UI to check how it's going, and in case something needs to be deleted).

@jeffhollan
Member

I'm naively going through the motions to see where this ends cncf/credits#23

@JorTurFer
Member

I have one question here: are we going to make the infra repo public, or will it be internal only?
I ask because I'm working on it and, depending on this, we need to think about the CI checks for PRs (Terraform checks require secrets, and PRs from forks can't access secrets directly).
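
One way to picture that constraint: a CI entry point could pick which Terraform commands to run based on whether credentials are present. This is only a sketch, assuming the credential reaches the job as an environment variable (the name `ARM_CLIENT_SECRET` is used purely as an example) and that it is empty or unset for PRs from forks. The function name is hypothetical.

```python
import os


def select_checks(env=None):
    """Return the Terraform commands a PR job could run.

    Without credentials (e.g. a PR from a fork) only checks that need
    no backend or cloud access are selected.
    """
    env = os.environ if env is None else env
    fmt = ["terraform", "fmt", "-check", "-recursive"]
    validate = ["terraform", "validate"]
    if env.get("ARM_CLIENT_SECRET"):
        # Credentials available: initialise the real backend and plan.
        return [
            fmt,
            ["terraform", "init", "-input=false"],
            validate,
            ["terraform", "plan", "-input=false"],
        ]
    # No credentials: skip the backend so no state access is needed.
    return [fmt, ["terraform", "init", "-backend=false"], validate]


if __name__ == "__main__":
    for command in select_checks():
        print(" ".join(command))
```

With this split, fork PRs still get formatting and validation feedback, while the plan only runs where secrets are available.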

@tomkerkhove
Member Author

Yes, it should be public so that every contributor can open a PR imo

@JorTurFer JorTurFer self-assigned this Oct 12, 2022
@JorTurFer
Member

I think this is already done, as we have moved the infrastructure management to https://github.com/kedacore/testing-infrastructure and it's already public, so any contributor can just open a PR there to create the resources they need on Azure, and also on AWS and GCP (GCP is still in progress).

@JorTurFer JorTurFer moved this to Done in Governance Nov 15, 2022
@tomkerkhove
Member Author

Job well done, thanks! 🎉

Can we add this new addition to the contribution guide please?

@tomkerkhove tomkerkhove reopened this Nov 15, 2022
@JorTurFer
Member

JorTurFer commented Nov 15, 2022

The e2e README in keda has a section about e2e infrastructure, and that repo has a README with a brief description.
Do you think the contribution guide is a better place for it? I can move/duplicate it there.

@JorTurFer
Member

I have created an issue in test-tools repo to add documentation there because we don't have any guide or help

@tomkerkhove
Member Author

Thanks a ton! I've noticed that the contribution guide has a link to the test folder as well, so we're good to go; thanks!
