This repository has been archived by the owner on Jun 8, 2022. It is now read-only.
Merge pull request #23 from ryanzhang-oss/add-design
add trait workload interaction design
Showing 1 changed file with 204 additions and 0 deletions.
design/one-pager-trait-workload-interaction-mechanism.md
# Traits and workloads interaction mechanism in OAM

* Owner: Ryan Zhang (@ryanzhang-oss)
* Reviewers: Crossplane Maintainers
* Status: Draft

## Terminology

* **CRD (Custom Resource Definition)**: A standard Kubernetes Custom Resource Definition
* **CR (Custom Resource)**: An instance of a Kubernetes type that was defined using a CRD
* **GVK (Group Version Kind)**: The API Group, Version, and Kind for a type of Kubernetes
  resource (including CRDs)
* **Workload child resources**: The Kubernetes resources generated by a workload controller. They
  should all have a controller reference pointing to the parent workload instance.

## Background
Traits and workloads are two major types of resources in OAM. Traits usually affect how a
Kubernetes resource operates, either directly (through a spec change) or indirectly (by adding an
ingress or a sidecar). However, the current OAM implementation does not contain a generic
mechanism for traits to locate the corresponding resource to modify.

We will use the following hypothetical OAM application as the baseline to illustrate the problem
and our solution.

```yaml
apiVersion: core.oam.dev/v1alpha2
kind: WorkloadDefinition
metadata:
  name: mydbs.standard.oam.dev
spec:
  definitionRef:
    name: mydbs.standard.oam.dev
---
apiVersion: core.oam.dev/v1alpha2
kind: TraitDefinition
metadata:
  name: manualscalertraits.core.oam.dev
spec:
  definitionRef:
    name: manualscalertraits.core.oam.dev
---
apiVersion: core.oam.dev/v1alpha2
kind: Component
metadata:
  name: example-db
spec:
  workload:
    apiVersion: standard.oam.dev/v1alpha2
    kind: Mydb
    metadata:
      name: mydb-example
    spec:
      containers:
        - name: mysql
          image: mysql:latest
---
apiVersion: core.oam.dev/v1alpha2
kind: ApplicationConfiguration
metadata:
  name: example-appconfig
spec:
  components:
    - componentName: example-db
      traits:
        - trait:
            apiVersion: core.oam.dev/v1alpha2
            kind: ManualScalerTrait
            metadata:
              name: example-appconfig-trait
            spec:
              replicaCount: 3
```
The problem is twofold:
1. A trait controller needs a way to find the workload that it is applied to.
    - In the example, the manual scaler trait needs to know that it is supposed to scale the
    `example-db` workload. However, we want to keep the applicationConfiguration controller
    agnostic to the schema of any `trait` or `workload` it generates to make it extensible.
    Thus, the applicationConfiguration controller needs to emit a `ManualScalerTrait` CR that
    contains a reference to the `example-db` workload without knowing the trait's specific schema.

2. A trait controller needs to know the exact resources it should modify. Note that these
resources are most likely not the workload itself.
    - Using the same example, just knowing the `example-db` workload is not enough for the
    `ManualScalerTrait` to work. The trait controller does not work with the `example-db` workload
    directly. It needs to find the actual Kubernetes resources that the `example-db`
    workload generates so that it can modify the `replica` field in their spec.

## Goals
To maximize the extensibility of our OAM implementation, our solution needs to meet the
following two design objectives.
1. **Extensible trait system**: We want to allow a `trait` to apply to any eligible `workload`
instead of just a list of specific ones. This means that we want to empower a trait developer to
write the controller code once and have it work for any new `workload` that this `trait` can
apply to in the future.
    - Using the example again, the `ManualScalerTrait` should work with any workload that
    generates a Kubernetes resource that has a `replica` field in its spec, even if the
    workload does not exist when the `ManualScalerTrait` is implemented.
2. **Adopting existing CRDs**: The mechanism cannot put any limit on the `trait` or `workload`
CRDs. This means that we cannot assume any pre-defined CRD fields in any `trait` or `workload`
beyond Kubernetes conventions (i.e. spec or status).
    - For example, the following `EtcdBackup` operator can be used as a `trait` in an OAM
    application to apply to an `EtcdCluster` workload. Here, the `etcdEndpoints` field in
    the trait signals to which `workload` it applies, and we need to accommodate
    this type of `trait`.
    ```yaml
    apiVersion: "etcd.database.coreos.com/v1beta2"
    kind: "EtcdBackup"
    metadata:
      name: example-etcd-cluster-backup
    spec:
      etcdEndpoints: [<etcd-cluster-endpoints>]
      storageType: S3
      s3:
        path: <full-s3-path>
        awsSecret: <aws-secret>
    ```
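
To make the first goal more concrete, here is a minimal Python sketch in which plain dictionaries stand in for Kubernetes objects. The function name `scale_if_supported` and the field name `replicas` are assumptions made for illustration only; a real trait controller would work on objects fetched from the API server and might detect scalability differently.

```python
from typing import Any, Dict


def scale_if_supported(resource: Dict[str, Any], replica_count: int,
                       field: str = "replicas") -> bool:
    """Set spec.<field> if the resource has that field; report whether it was set."""
    spec = resource.get("spec")
    if not isinstance(spec, dict) or field not in spec:
        return False  # this resource has no such field, so the trait skips it
    spec[field] = replica_count
    return True


# Hypothetical child resources of the `mydb-example` workload.
deployment = {"apiVersion": "apps/v1", "kind": "Deployment",
              "metadata": {"name": "mydb-example"}, "spec": {"replicas": 1}}
service = {"apiVersion": "v1", "kind": "Service",
           "metadata": {"name": "mydb-example"}, "spec": {"ports": []}}

for child in (deployment, service):
    changed = scale_if_supported(child, 3)
    print(child["kind"], "scaled" if changed else "skipped")
```

The function never inspects the workload kind, which is the point of the goal: the same code applies to a workload type that did not exist when the trait was written.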

## Proposal
The overall idea is for the applicationConfiguration controller to fill critical information
in the workload and trait CRs it emits. In addition, we will provide a helper library so that
trait controller developers can locate the resources they need with a simple function call.
Here is the list of changes that we propose.
1. The applicationConfiguration controller no longer assumes that all `trait` CRDs contain a
`spec.workloadRef` field that conforms to the OAM definition. It only fills the workload's
apiVersion, kind, and name into a `trait` CR if its CRD has a `spec.workloadRef` field defined as
below.
    ```yaml
    workloadRef:
      properties:
        apiVersion:
          type: string
        kind:
          type: string
        name:
          type: string
      required:
        - apiVersion
        - kind
        - name
      type: object
    ```
2. Add a `childResourceKinds` field in the WorkloadDefinition.
Currently, a workloadDefinition is nothing but a shim of a real workload CRD. We propose to add
an **optional** field called `childResourceKinds` to the schema of the workloadDefinition. We encourage
workload owners to fill in this field when they register their controllers with the OAM system.
This is the way for them to declare the types of the Kubernetes resources their workload
controller actually generates. In our example, the workload definition can claim to generate
Deployment and Service child resources.
    ```yaml
    apiVersion: core.oam.dev/v1alpha2
    kind: WorkloadDefinition
    metadata:
      name: mydbs.standard.oam.dev
    spec:
      definitionRef:
        name: mydbs.standard.oam.dev
      childResourceKinds:
        - apiVersion: apps/v1
          kind: Deployment
        - apiVersion: v1
          kind: Service
    ```
3. The OAM runtime will provide a helper library. The library uses the following logic to help a
trait developer locate the resources for the trait to modify.
    1. Get the corresponding `workload` instance from the Kubernetes cluster with the information
    inserted by the applicationConfiguration controller in the `trait` CR.
    2. Fetch the corresponding `workloadDefinition` CR following an
    [OAM convention](https://github.com/oam-dev/spec/blob/master/3.workload.md#definitionref).
    The convention requires that the name of the `workloadDefinition` CR is the name of the
    `workload` CRD it refers to. For example, the name of the `workloadDefinition` CR
    that refers to a `containerizedworkloads.core.oam.dev` CRD is exactly
    `containerizedworkloads.core.oam.dev` as well.
    3. Fetch all the `childResourceKinds` values in the corresponding `workloadDefinition` instance.
    4. List each child resource by its GVK and filter by owner reference. Here, we assume that
    all the child resources that the workload controller generates have a controller reference
    field pointing back to the workload instance.
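
The changes above can be sketched in Python with plain dictionaries standing in for Kubernetes objects. This is an illustration under stated assumptions, not the helper library itself: the function names `fill_workload_ref`, `is_controlled_by`, and `find_child_resources` are hypothetical, the `cluster` list replaces the per-GVK list calls against the API server, and steps 1 and 2 (fetching the workload and its `workloadDefinition`) are assumed to have already happened.

```python
from typing import Any, Dict, List

Obj = Dict[str, Any]


def fill_workload_ref(trait: Obj, trait_spec_schema: Obj, workload: Obj) -> bool:
    """Change 1: set spec.workloadRef only when the trait CRD schema declares it."""
    if "workloadRef" not in trait_spec_schema.get("properties", {}):
        return False
    trait.setdefault("spec", {})["workloadRef"] = {
        "apiVersion": workload["apiVersion"],
        "kind": workload["kind"],
        "name": workload["metadata"]["name"],
    }
    return True


def is_controlled_by(child: Obj, workload: Obj) -> bool:
    """True when the child has a controller owner reference with the workload's UID."""
    for ref in child.get("metadata", {}).get("ownerReferences", []):
        if ref.get("controller") and ref.get("uid") == workload["metadata"]["uid"]:
            return True
    return False


def find_child_resources(workload: Obj, definition: Obj, cluster: List[Obj]) -> List[Obj]:
    """Change 3, steps 3 and 4: select declared kinds, then filter by controller reference."""
    kinds = {(k["apiVersion"], k["kind"])
             for k in definition["spec"].get("childResourceKinds", [])}
    return [obj for obj in cluster
            if (obj["apiVersion"], obj["kind"]) in kinds and is_controlled_by(obj, workload)]


def owned(api_version: str, kind: str, name: str, owner_uid: str) -> Obj:
    """Build a hypothetical resource controlled by the owner with the given UID."""
    return {"apiVersion": api_version, "kind": kind,
            "metadata": {"name": name,
                         "ownerReferences": [{"uid": owner_uid, "controller": True}]}}


workload = {"apiVersion": "standard.oam.dev/v1alpha2", "kind": "Mydb",
            "metadata": {"name": "mydb-example", "uid": "uid-1"}}
definition = {"spec": {"childResourceKinds": [
    {"apiVersion": "apps/v1", "kind": "Deployment"},
    {"apiVersion": "v1", "kind": "Service"}]}}
cluster = [
    owned("apps/v1", "Deployment", "mydb-example", "uid-1"),
    owned("v1", "Service", "mydb-example", "uid-1"),
    owned("apps/v1", "Deployment", "unrelated", "uid-2"),  # other owner
    owned("v1", "ConfigMap", "mydb-config", "uid-1"),      # kind not declared
]
trait = {"apiVersion": "core.oam.dev/v1alpha2", "kind": "ManualScalerTrait",
         "metadata": {"name": "example-appconfig-trait"}, "spec": {"replicaCount": 3}}
schema = {"properties": {"replicaCount": {"type": "integer"},
                         "workloadRef": {"type": "object"}}}

fill_workload_ref(trait, schema, workload)
children = find_child_resources(workload, definition, cluster)
print([(c["kind"], c["metadata"]["name"]) for c in children])
```

Filtering on the owner's UID rather than its name mirrors how Kubernetes owner references identify their owner, and it keeps a same-named resource that belongs to another workload out of the result.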

## Impact to the existing system
Here are the impacts of this mechanism on the existing OAM components:
- ApplicationConfiguration: This mechanism requires minimal changes in the
applicationConfiguration controller.
- Workload: This mechanism does not affect workload controller implementation.
- Trait: This mechanism is optional, so all existing trait controllers still work. Any existing
trait that wants to take advantage of the extensibility of OAM needs to be modified to use it.
Any trait that only applies to a certain type of workload, such as the `EtcdBackup` trait,
doesn't need to use this mechanism.
- WorkloadDefinition: Workload owners can modify the existing workloadDefinition if needed.

## Alternative approach
1. One alternative is to make the applicationConfiguration controller watch all
the possible workload child resources and insert the child resources' GVK and name into the
corresponding workload CR. I would not recommend this approach as it increases the complexity
of the applicationConfiguration controller and makes it more of an availability liability.
2. Another approach is to implement a separate type for binding traits to workloads. This would
work, but a label or annotation seems to be a more natural place to record the information.
Otherwise, we need a way for the trait to discover the binding instance first.

## Extra labels
There might be cases where a workload generates more than one resource with the same GVK and only
wants to expose a subset of them to traits. In this case, we can add a pre-defined label such as
`core.oam.dev/expose=true` for the workload owner to indicate which resources to expose. This
section only illustrates that this is a solvable problem; it is beyond the scope of this
proposal for now.
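
As a rough illustration of the label idea, the following Python sketch filters a list of child resources (plain dictionaries again) down to those carrying the label. The function name `exposed_only` and the resource names are hypothetical; only the label key and value come from the text above.

```python
from typing import Any, Dict, List

EXPOSE_LABEL = "core.oam.dev/expose"


def exposed_only(children: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only the child resources labelled core.oam.dev/expose=true."""
    return [c for c in children
            if c.get("metadata", {}).get("labels", {}).get(EXPOSE_LABEL) == "true"]


# Two hypothetical Deployments generated by the same workload.
children = [
    {"kind": "Deployment",
     "metadata": {"name": "mydb-primary", "labels": {EXPOSE_LABEL: "true"}}},
    {"kind": "Deployment", "metadata": {"name": "mydb-backup-agent"}},
]
print([c["metadata"]["name"] for c in exposed_only(children)])
```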