diff --git a/Autoscaler101/what-are-autoscalers.md b/Autoscaler101/what-are-autoscalers.md index cf4c5e91..baa1e1fd 100644 --- a/Autoscaler101/what-are-autoscalers.md +++ b/Autoscaler101/what-are-autoscalers.md @@ -16,6 +16,18 @@ A horizontal pod autoscaler works in the same way as a VPA for the most part. It Scaling down is handled in roughly the same way. When scaling down, HPA reduces the number of pod replicas. It terminates existing pods to bring the number of replicas in line with the configured target metric. The scaling decision is based on the comparison of the observed metric with the target value. HPA does not modify the resource specifications (CPU and memory requests/limits) of individual pods. Instead, it adjusts the number of replicas to match the desired metric target. -Now that we have thoroughly explored both types of autoscalers, let's go on to a lab where we will look at the scalers in more detail. +Before we go into the lab, since we are talking about metrics, let's take a brief look at the Quality of Service classes that Kubernetes assigns to pods: + +## Quality of Service classes + +In Kubernetes, Guaranteed, Burstable, and BestEffort are Quality of Service (QoS) classes that define how pods are treated in terms of resource allocation and management. These classes help Kubernetes prioritize and manage workload resources effectively. Here's what each term means: + +**Guaranteed**: A pod is assigned the Guaranteed class when every container in it has CPU and memory requests equal to its limits. These pods are allocated exactly the resources they request, and Kubernetes treats them as the highest priority under resource pressure, making them the last candidates for eviction when a node runs short of memory. Because the requested resources are set aside whether or not they are fully used, size the requests carefully to avoid wasting capacity. + +**Burstable**: A pod falls into the Burstable class when at least one of its containers has a CPU or memory request or limit set, but it does not meet the Guaranteed criteria. These pods may use more resources than they request, up to their limits, if spare capacity is available on the node, but there is no guarantee that the extra capacity will always be there. Burstable pods may be throttled or evicted if they exceed their requests and there is contention for resources with other pods. + +**BestEffort**: Pods with BestEffort QoS have no CPU or memory requests or limits set on any container, giving them the lowest priority for resource allocation. These pods are not guaranteed any specific amount of CPU or memory, and they are the first to be evicted if the node runs out of resources. BestEffort pods are typically used for non-critical workloads or background tasks that can tolerate resource contention or occasional interruptions. + +Now that we have thoroughly explored both types of autoscalers and taken a brief look at how QoS classes work, let's go on to a lab where we will look at the scalers in more detail. [Next: Autoscaler lab](../Autoscaler101/autoscaler-lab.md) \ No newline at end of file diff --git a/Helm101/helm-charts.md b/Helm101/helm-charts.md index b34b2c6e..ad809f61 100644 --- a/Helm101/helm-charts.md +++ b/Helm101/helm-charts.md @@ -99,6 +99,6 @@ This template can then be used within all helm charts: .dockerconfigjson: {{ template "imagePullSecret" . }} ``` -This covers the basics of Helm charts, should you need to create one. However, only narrowly covers the full breadth of what Helm has to offer. For more tips and tricks, visit Helm [official docs](https://helm.sh/docs/howto/charts_tips_and_tricks/). Now, let's move on to Chart hooks. 
+This covers the basics of Helm charts, should you need to create one. However, it only covers a narrow slice of what Helm has to offer. For more tips and tricks, visit the Helm [official docs](https://helm.sh/docs/howto/charts_tips_and_tricks/). In this section, we briefly touched on Helm templates as a means to start off the creation of your new Helm chart. However, templates can be a really powerful tool if you want to reduce repetition in your deployment manifests. Therefore, in the next section, we will be diving deep into creating your own Helm templates. -[Next: Chart Hooks](chart-hooks.md) \ No newline at end of file +[Next: Helm templates](./helm-templates.md) \ No newline at end of file diff --git a/Helm101/helm-templates.md b/Helm101/helm-templates.md new file mode 100644 index 00000000..b2357c98 --- /dev/null +++ b/Helm101/helm-templates.md @@ -0,0 +1,171 @@ +# Helm templates + +In even a small-scale organization, you would have at least a couple of applications that work together inside a Kubernetes cluster, which can easily mean a minimum of 5-6 microservices. As your organization grows, you could go on to have 10, then 20, even 50 microservices, at which point a problem arises: the deployment manifests. Handling just one or two is fairly simple, but when it comes to several dozen, updating and adding new manifests can be a real problem. If you have a separate git repository for each microservice, you will likely want to keep each deployment yaml within the repo. If this is a regular organization that follows best practices, you will be required to create pull requests and have them reviewed before you merge to master. This means if you want to do something as simple as change the image pull policy for several microservices, you will have to make the change in each repo, create a pull request, have it reviewed by someone else, and then merge the changes. This is a pretty large number of steps that a Helm template can reduce to just 1. + +To start, we will need a sample application. We could use the same charts that we used in the previous section, but instead let's go with a new application altogether: nginx. + +This will be our starting point: + +``` +# nginx-deployment.yaml + +apiVersion: apps/v1 +kind: Deployment +metadata: + name: nginx-deployment + labels: + app: nginx +spec: + replicas: 3 + selector: + matchLabels: + app: nginx + template: + metadata: + labels: + app: nginx + spec: + containers: + - name: nginx + image: nginx:latest + ports: + - containerPort: 80 +``` + +``` +# nginx-service.yaml + +apiVersion: v1 +kind: Service +metadata: + name: nginx-service +spec: + selector: + app: nginx + ports: + - protocol: TCP + port: 80 + targetPort: 80 + type: ClusterIP +``` + +The above is a rather basic implementation of an nginx server with 3 replicas that allows connections on port 80. Let's start by creating a Helm chart from this nginx application. + +Go into the folder you plan to run this from and type: + +``` +helm create nginx-chart +``` + +This will create a chart with the basic needed files. The directory structure should look like this: + +``` +nginx-chart/ +├── Chart.yaml +├── templates +│ ├── deployment.yaml +│ └── service.yaml +└── values.yaml +``` + +By looking at the above structure, you should be able to see where the deployment and service yamls fit in. You will see that there are sample yamls created here. 
However, you will also notice that these yamls are Go templates, which have placeholders instead of hardcoded values. We will be converting our existing yamls into this format. But first, update the Chart.yaml file to include relevant metadata for nginx if you need to. Generally, the default Chart.yaml is fine. You can also optionally modify values.yaml. Things such as the number of replicas can be managed here. + +Next, we get to the templating part. We will have to convert our existing deployment yaml into a Helm template file. This is what the yaml would look like after it is converted: + +``` +# templates/deployment.yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: {{ .Release.Name }}-nginx-deployment + labels: + app: nginx +spec: + replicas: {{ .Values.nginx.replicaCount }} + selector: + matchLabels: + app: nginx + template: + metadata: + labels: + app: nginx + spec: + containers: + - name: nginx + image: "{{ .Values.nginx.image.repository }}:{{ .Values.nginx.image.tag }}" + ports: + - containerPort: {{ .Values.nginx.containerPort }} +``` + +The first thing to change is the naming convention: in the metadata.name field, {{ .Release.Name }}- has been added to prefix the deployment name. This ensures that each deployment has a unique name when installed via Helm, with .Release.Name representing the release name generated by Helm. The replica count has been replaced with {{ .Values.nginx.replicaCount }}. This allows the user to set the number of replicas in the values.yaml file of the Helm chart. When it comes to the image tag and repository, the hardcoded image name nginx:latest has been replaced with {{ .Values.nginx.image.repository }}:{{ .Values.nginx.image.tag }}. This allows the user to specify the image repository and tag in the values.yaml file. Finally, the container port's hardcoded port 80 has been replaced with {{ .Values.nginx.containerPort }}, allowing the user to specify the container port in the values.yaml file. + +These changes make the Helm template more flexible and configurable, allowing you to customize the deployment according to your requirements using the values.yaml file. Now let's take a look at the service yaml and how it would look after it is converted: + +``` +# templates/service.yaml +apiVersion: v1 +kind: Service +metadata: + name: {{ .Release.Name }}-nginx-service +spec: + selector: + app: nginx + ports: + - protocol: TCP + port: {{ .Values.nginx.servicePort }} + targetPort: {{ .Values.nginx.containerPort }} + type: {{ .Values.nginx.serviceType }} + +``` + +Similar to the deployment template, the service name has been prefixed with {{ .Release.Name }}- to ensure uniqueness when installed via Helm. For the service port, the hardcoded service port 80 has been changed to {{ .Values.nginx.servicePort }}. This allows you to specify the service port in the values.yaml file. We also replaced the hardcoded target port 80 with {{ .Values.nginx.containerPort }}, allowing you to specify the target port in the values.yaml file. This should match the container port defined in the deployment template. For the service type, we replaced the hardcoded service type ClusterIP with {{ .Values.nginx.serviceType }}, allowing users to specify the service type in the values.yaml file. This provides flexibility in choosing the appropriate service type based on the environment or requirements. 
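As an aside, you may have noticed that both templates still repeat the same hardcoded `app: nginx` labels. Helm lets you factor that kind of repetition into a named template, the same mechanism used for `imagePullSecret` in the previous section, conventionally kept in a `templates/_helpers.tpl` file. The snippet below is an optional, minimal sketch; the template name `nginx-chart.labels` is an illustrative choice and not something the generated chart requires:

```
{{/* templates/_helpers.tpl (optional, illustrative): a named template for the shared labels */}}
{{- define "nginx-chart.labels" -}}
app: nginx
{{- end }}
```

You could then replace each hardcoded label block with `{{- include "nginx-chart.labels" . | nindent 4 }}`, adjusting the indent value to match how deeply the labels are nested in each manifest.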
+ +Now that we have defined both the deployment and the service in a template format, let's take a look at what the overriding values file would look like: + +``` +nginx: + replicaCount: 3 + image: + repository: nginx + tag: latest + containerPort: 80 + servicePort: 80 + serviceType: ClusterIP +``` + +In this values.yaml file, the replicaCount specifies the number of replicas for the nginx deployment, while image.repository and image.tag specify the Docker image repository and tag for the nginx container. The containerPort specifies the port on which the nginx container listens, and servicePort specifies the port exposed by the nginx service. Finally, the serviceType specifies the type of Kubernetes service to create for nginx. You might want to change this to NodePort or LoadBalancer if you plan to provide external access (or use kubectl port forwarding). + +With this structure, users can now install your Helm chart, and they'll be able to customize the number of replicas and the Nginx image tag through the values.yaml file. Let's go ahead and do the install using the below command: + +``` +helm install my-nginx-release ./nginx-chart --values values.yaml +``` + +Make sure you run the above command in the same directory as the values.yaml. This will create a release called "my-nginx-release" in your Kubernetes cluster, based on the chart with values.yaml overriding the chart's defaults. You should be able to run and test the Nginx server that comes up as a result. However, you will notice that we have gone out of our way to define templates and overriding files for something that a simple yaml file could have accomplished. There is more code now than before. So what is the advantage? + +For starters, you get all the perks that come with using Helm charts. But now you also have a template you can use to generate additional Helm releases. For example, if you want to run another Nginx server with different arguments (a different number of replicas, a different image version, a different port, etc.), you can use this template. This is especially true if you are working in an organization that has multiple services that require different Nginx setups. You could even consider a situation where your organization has 10+ microservices where the pods you spin up for each microservice are largely boilerplate. The only things that would likely change are the name of the microservice and the image that would spin up in the container. In a situation like this, you could easily create a values file with a handful of lines that override the Helm template. + +Let's try this. Create a new values-new.yaml and set the below values: + +``` +nginx: + replicaCount: 2 + image: + repository: nginx + tag: alpine3.18-perl + containerPort: 80 + servicePort: 8080 + serviceType: ClusterIP +``` + +The new yaml has a changed replica count, gets a different image tag, and serves on port 8080 instead of 80. In order to deploy this, you can use the same install command as before, with a new release name and values file: + +``` +helm install my-nginx-release-alpine ./nginx-chart --values values-new.yaml +``` + +Only the release name and the values file that gets picked up change here. In this same way, you could create different values files with different overriding properties and end up with any number of nginx servers, each with different values.
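Before wrapping up, two commands are worth knowing when working with charts like this. `helm template` renders the chart locally without installing anything, which is a quick way to check what your placeholders expand to, and `helm upgrade` rolls an already-installed release forward after you change a value. The release and chart names below match the ones used in this section, while the `nginx.image.tag` override is only an illustrative example:

```
# Render the manifests locally to inspect the output (nothing is installed)
helm template my-nginx-release ./nginx-chart --values values.yaml

# Roll a single value change out to an existing release, e.g. a new image tag
helm upgrade my-nginx-release ./nginx-chart --values values.yaml --set nginx.image.tag=1.25
```

This is where the single-step update mentioned at the start of the section comes from: one value change plus one `helm upgrade`, instead of a pull request per repository.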
+ +This brings us to the end of the section on the powerful use of Helm templates. Now, let's move on to Chart hooks. + +[Next: Chart Hooks](chart-hooks.md) \ No newline at end of file diff --git a/MultiTenant101/what-is-tenancy.md b/MultiTenant101/what-is-tenancy.md new file mode 100644 index 00000000..40b68989 --- /dev/null +++ b/MultiTenant101/what-is-tenancy.md @@ -0,0 +1,143 @@ +# Multi-tenant architecture + +A multi-tenant architecture is when you segregate your infrastructure based on your clients. A single instance of the software application is served to multiple customers, known as tenants. Each tenant's data and configuration are logically isolated from each other, but they share the same underlying infrastructure and codebase. + +This can have many benefits, but whether you should go for an infrastructure like this depends on several things. Let's start off by looking at the immediate advantages of this design. + +First and foremost is resource isolation. Not only does this improve security and compliance for the organization, but it also enables you to get a better idea of the resource usage and related costs that each client incurs for your company. You can use this information to improve billing. If a single client starts growing, you can scale out that particular client's infrastructure without having to scale out everything, while charging them to match. If you have a large number of small clients (individual customers), you could even provision resources on a per-customer basis. This is not in the scope of this lesson but is covered in the [Kubezoo](../Kubezoo/what-is-kubezoo.md) section. + +The next is customization. It is normal for different customers to have different requirements, and a regular architecture would not be able to cater to that. With a multi-tenant system, since each customer's applications are isolated from one another, providing customization on a per-client basis becomes pretty easy. + +Of course, none of this means you need extra computing power. Despite having a separate infrastructure per client, you will still use the same resources for all of them. In this section, we will be looking at how we can use a single cluster with one control plane and multiple worker nodes to set up a multi-tenant system that is broken down based on namespaces. But before we do that, let's consider when multi-tenancy might not be the best option. + +First off, if your project is specifically designed for a single customer, then there is no point in spending time coming up with a new multi-tenant architecture since you will only have a single tenant. This does not apply if you plan to re-use your application and provide it to a different client. Additionally, if you have a large number of small clients already configured, switching to a multi-tenant architecture might not make a lot of sense considering the large amount of work it will require. In fact, it's likely that your application code isn't designed to handle workloads in a multi-tenant fashion, which means doing a fair bit of additional work to make your code fit the architecture. Finally, if you don't already have a microservice architecture, then you will have to largely scrap the idea of running different groups of clients in different infrastructures since it will get fairly expensive. + +## Design + +Now that we've got the introduction out of the way, let's take a look at the design we will be following for this application. + +### 1. Namespace Isolation: +Each tenant will have its dedicated namespace in Kubernetes. This allows for resource isolation and management at the namespace level. + +### 2. 
RBAC (Role-Based Access Control): +Implement RBAC to control access to resources within each namespace. Define roles and role bindings to restrict what actions users and services can perform within their respective namespaces. + +### 3. Network Policies: +Use network policies to control network traffic between namespaces. Define policies to allow communication between services within the same tenant namespace while restricting traffic from other tenants. + +### 4. Resource Quotas and Limits: +Set resource quotas and limits for each namespace to prevent one tenant from monopolizing resources and affecting others. This ensures fair resource allocation and prevents noisy neighbors. + +### 5. Custom Resource Definitions (CRDs): +This is something that we will not be touching on in this section, but if your tenants require custom resources, CRDs are the way to go. This allows tenants to define their own resource types and controllers within their namespaces. + +### 6. Monitoring and Logging: +There are a large number of tools that can help with monitoring and logging, and most of them, such as Prometheus, Grafana, and the Elasticsearch, Fluentd, and Kibana (EFK) stack, have been covered in other sections. Since we will be separating tenants by namespace, a tool we haven't looked at so far that could be useful here is [Loki](https://grafana.com/docs/loki/latest/) for namespace-level logging. + +### 7. Tenant Onboarding and Offboarding Automation: +This step is something that is generally overlooked and is important even if you aren't developing a multi-tenant system. You have to consider how you will handle things when you onboard a new customer. What type of scaling will you use? How much will you scale? If you don't have this, you might end up either over-provisioning or under-provisioning, thereby affecting your clients. So you have to develop automation scripts or tools for efficient tenant onboarding and offboarding. This includes provisioning/de-provisioning namespaces, setting up RBAC rules, configuring network policies, and applying resource quotas. + +### 8. Tenant Customization: +We spoke earlier about how easy it is to have customized versions of applications per customer when you are running a multi-tenant application. However, you can take this a step further and allow the tenant to customize their namespaces within defined limits. Provide options for configuring ingress/egress rules, setting up persistent storage, deploying custom services, etc. This allows your tenant to control not only the application but also their infrastructure to a certain level. + +### 9. High Availability and Disaster Recovery: +A disaster recovery solution is pretty important when you have multiple customers using a single infrastructure. If you had each tenant using a different Kubernetes cluster, for example, one cluster going down would only affect that customer. However, if you make all the tenants use the same cluster with namespace separation, the cluster going down could mean all your tenants are affected. So as part of the architecture, you always have to think about redundancy and failover mechanisms at both the cluster and application levels to ensure high availability. You also have to regularly back up tenant data and configuration to facilitate disaster recovery. + +### 10. Scalability: +This is something that you really need to focus on when running your application on Kubernetes. 
With other infrastructure like static instances, you would find it pretty difficult to finely scale your infrastructure to match the needs of your workloads, but this is easily doable with Kubernetes. Make sure that you are running node types whose resources most efficiently match what your tenants and workloads need. Design the application to be horizontally scalable to accommodate varying tenant loads. Utilize Kubernetes features like Horizontal Pod Autoscaler (HPA) and Cluster Autoscaler to automatically scale resources based on demand. You could also use tools such as [KEDA](../Keda101/what-is-keda.md) to scale based on all sorts of metrics, or tools like Karpenter to scale your infrastructure itself so that the nodes come in sizes that match your workloads. + +Now that we have looked at everything we need to consider before we implement our multi-tenant architecture, let's look at something we need to consider after the architecture has been set up: an example workflow. Getting some idea about the workflow before you start implementing your system is crucial so you don't end up going back repeatedly because you missed something. A rough example workflow would be like this: + +### Example Workflow: +1. Tenant requests a new environment. +2. Automation scripts provision a new namespace for the tenant. +3. RBAC rules, network policies, and resource quotas are applied. +4. Tenant deploys their application within the designated namespace. +5. Monitoring and logging capture relevant metrics and events. +6. Regular audits ensure compliance and security. +7. Tenant scales resources as needed using Kubernetes APIs. + +In addition to the above seven steps, it's also good to consider what you need to do when offboarding a tenant. Upon tenant offboarding, automation scripts will have to handle namespace cleanup and resource deallocation. + +This design ensures efficient resource utilization, strong isolation between tenants, and streamlined management of a multi-tenant Kubernetes environment. + +## Implementation + +Now that we have gone through the design of our multi-tenant application, let's create a simple Kubernetes application using NGINX as the sample application. We'll deploy NGINX within a Kubernetes cluster, ensuring that each tenant gets its own isolated namespace. + +### Requirements: +You will need a Kubernetes cluster. As always, we recommend [minikube](https://minikube.sigs.k8s.io/docs/start/). You also need to have kubectl installed. + +### Steps: +Since we are going to be dividing tenants based on namespaces, let's begin by creating the namespaces: + +```bash +kubectl create namespace tenant1 +kubectl create namespace tenant2 +``` + +Next, let's deploy our application. In this case, we will use a basic nginx image and assume that we are setting up two separate nginx services for the two tenants. We could use a deployment file, but for the sake of simplicity, we will use a single-line command. 
We will have to run the command for both namespaces: + +```bash +# Deployment for tenant1 +kubectl create deployment nginx --image=nginx -n tenant1 + +# Deployment for tenant2 +kubectl create deployment nginx --image=nginx -n tenant2 +``` + +We will now expose the services on port 80 for both tenants using the [kubectl expose](https://kubernetes.io/docs/reference/kubectl/generated/kubectl_expose/) command: + +```bash +# Expose NGINX service for tenant1 +kubectl expose deployment nginx --port=80 --target-port=80 -n tenant1 + +# Expose NGINX service for tenant2 +kubectl expose deployment nginx --port=80 --target-port=80 -n tenant2 +``` + +Next, let's get the deployments and pods to make sure that everything was deployed correctly in both namespaces: + +```bash +# Check tenant1 deployment +kubectl get deployment,pods -n tenant1 + +# Check tenant2 deployment +kubectl get deployment,pods -n tenant2 +``` + +Now get the services (svc) from both namespaces. In a production environment, you would attach a load balancer to each of the services and then point a DNS entry at each load balancer. This way, tenant 1 would access their part of the system using the tenant 1 DNS while tenant 2 would do the same with the tenant 2 DNS. Any supporting microservices would then be deployed into their relevant namespaces and would not interact with the namespaces or modules of other tenants. However, tenants will be sharing resources. If you have a very important tenant, you could specify a nodegroup just for them and have all their microservices exclusively get scheduled on that nodegroup. If there is no such case, you could just use a single nodegroup where all your tenants use the same infrastructure. This will be the most cost-efficient method since you not only use the same cluster but also the same underlying resources. However, this brings up the problem of a noisy neighbor. So let's discuss that. + +## Noisy neighbor + +The concept of a "noisy neighbor" refers to a situation where one tenant's workload consumes an excessive amount of shared cluster resources, adversely impacting the performance and stability of other tenants' workloads. This can happen for various reasons, such as poorly optimized applications, resource-intensive tasks, or misconfigurations. + +Since, in this case, we are segregating tenants by namespace, each namespace provides a segregated environment where tenants can deploy their applications and manage resources independently. However, without proper resource management, a noisy neighbor within a namespace can still affect the overall performance of the cluster. + +Now that we've looked at what the problem is, let's consider some solutions: + +1. **Resource Quotas**: Since the tenants are split by namespaces, the easiest method to distribute processing power is to define resource quotas at the namespace level, limiting the amount of CPU, memory, and other resources that can be consumed by the workloads within that namespace. By setting appropriate quotas, you can prevent any single tenant from monopolizing the cluster resources. However, if a tenant that is not normally noisy has a spike in traffic, these limits might cause that tenant's application to slow down. Additionally, you will likely not have all the tenants coming in with the same amount of traffic, which means you will have to get an idea of what type of resource quotas each tenant needs. + +2. 
**Resource Limits**: In addition to quotas, you can set resource limits at the pod or container level. This ensures that individual workloads cannot exceed a certain threshold of resource usage, preventing them from becoming noisy neighbors. This has the same implications and drawbacks as the previously mentioned point. + +3. **Horizontal Pod Autoscaling (HPA)**: Implementing HPA allows Kubernetes to automatically scale the number of pod replicas based on resource usage metrics such as CPU or memory consumption. This helps in distributing the workload more evenly across the cluster, reducing the impact of noisy neighbors. This will mean that you spread out your workloads across multiple nodes in a nodegroup, or even across several nodegroups. + +4. **Quality of Service (QoS)**: Kubernetes offers three QoS classes for pods: Guaranteed, Burstable, and BestEffort. By categorizing pods based on their resource requirements and behavior, you can prioritize critical workloads over less important ones, mitigating the effects of noisy neighbors. To read more about QoS, check the [Autoscaler101 section](../Autoscaler101/what-are-autoscalers.md). + +5. **Isolation**: This is the option we discussed during the implementation section. You can isolate your tenants onto their own nodegroups and have their resources largely isolated from each other. + +6. **Monitoring and Alerting**: Sometimes a sudden spike in a client's load may not be intentional, and the client might have no idea that the spike is happening at all. For cases like this, and as a practice in general, it is best to have monitoring enabled across the entire system. Tools like Prometheus and Grafana are the go-to solutions for things like this since they are open-source and well-documented. If you have introduced a service mesh such as Linkerd into your system, you could use the mesh dashboard to get an idea of the amount of traffic going into each of the pods. Depending on your organization's budget and the importance of monitoring, you could go for more advanced tools such as New Relic, which can give you an hour-by-hour overview of your entire system. + +By implementing these strategies and continuously optimizing resource allocation and utilization, you can effectively mitigate the impact of noisy neighbors in a Kubernetes multi-tenant environment, ensuring fair resource sharing and optimal cluster performance for all tenants. + +## Summary + +- **Step 1**: We create separate namespaces for each tenant (`tenant1` and `tenant2`). +- **Step 2**: NGINX is deployed within each tenant's namespace using a Kubernetes Deployment. +- **Step 3**: We expose the NGINX service within each namespace. +- **Step 4**: We verify the deployments, services, and pods within each namespace. +- **Step 5**: We access the NGINX services via their respective service URLs. + +This example demonstrates how to deploy a simple application (NGINX) within a multi-tenant Kubernetes environment, ensuring isolation between tenants using namespaces. Each tenant has its own NGINX instance running independently within its namespace. + +We have now covered the theory as well as a very basic example of how a multi-tenant architecture works in a Kubernetes environment. Naturally, when this extends to large organizations dealing with hundreds of clients, the system becomes much more complicated, but this is just a peek. \ No newline at end of file
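To make the namespace-level guardrails discussed above a little more concrete, here is a minimal sketch of what a per-tenant ResourceQuota and a same-namespace-only NetworkPolicy for `tenant1` could look like. The object names and quota numbers are illustrative assumptions rather than values this lab depends on, and network policies are only enforced if your cluster's network plugin supports them:

```
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant1-quota
  namespace: tenant1
spec:
  hard:
    requests.cpu: "2"      # total CPU that all pods in tenant1 may request
    requests.memory: 4Gi
    limits.cpu: "4"
    limits.memory: 8Gi
    pods: "20"
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace-only
  namespace: tenant1
spec:
  podSelector: {}          # applies to every pod in the tenant1 namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}  # only pods from the same namespace may connect
```

Applying equivalent manifests to each tenant namespace (for example with `kubectl apply -f`) gives you the quotas and traffic isolation described in the design section without changing anything about the nginx deployments themselves.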