Policy attributes design proposal (#214)

* wip: initial attribute design proposal Signed-off-by: Etai Lev Ran <[email protected]> * More info on retrieving and exchanging attrs Signed-off-by: ZIV NEVO <[email protected]> * Another iteration. - Expand description and text in sections - Add attribute table Signed-off-by: Etai Lev Ran <[email protected]> * clarify workload/service distinction Signed-off-by: Etai Lev Ran <[email protected]> * add k8s:sa attribute, reference common labels Signed-off-by: Etai Lev Ran <[email protected]> * another iteration Signed-off-by: ZIV NEVO <[email protected]> * wip: initial attribute design proposal Signed-off-by: Etai Lev Ran <[email protected]> * More info on retrieving and exchanging attrs Signed-off-by: ZIV NEVO <[email protected]> * Another iteration. - Expand description and text in sections - Add attribute table Signed-off-by: Etai Lev Ran <[email protected]> * clarify workload/service distinction Signed-off-by: Etai Lev Ran <[email protected]> * add k8s:sa attribute, reference common labels Signed-off-by: Etai Lev Ran <[email protected]> * fixed typo - missing word Signed-off-by: Etai Lev Ran <[email protected]> * Services in abstract, multiple images, open questions Signed-off-by: ZIV NEVO <[email protected]> * From Mariusz: additional usecase and workloads contain their own identity Signed-off-by: ZIV NEVO <[email protected]> * more on images Signed-off-by: ZIV NEVO <[email protected]> --------- Signed-off-by: Etai Lev Ran <[email protected]> Signed-off-by: ZIV NEVO <[email protected]> Co-authored-by: ZIV NEVO <[email protected]> Co-authored-by: Ziv Nevo <[email protected]>
clusterlink-net · Jan 28, 2024 · 6f7f51b · 6f7f51b
1 parent d0e5a3e
commit 6f7f51b
Showing 1 changed file with 303 additions and 0 deletions.
diff --git a/design-proposals/policy-attributes.md b/design-proposals/policy-attributes.md
@@ -0,0 +1,303 @@
+# Design Proposal: Policy Attributes
+
+**Authors**: @zivnevo, @elevran
+
+**Begin Design Discussion**: 2023-11-20
+
+**Status:** draft
+
+## Summary/Abstract
+
+ClusterLink policies apply to communications between clients and services.
+ Both types of workloads can be identified by a strong (e.g., cryptographic) identity.
+ The identity links the workload to a set of attributes, and policies are
+ defined on workload attributes. This design proposal defines the initial
+ set of attributes used in policies.
+
+## Background
+
+### Motivation and problem space
+
+ClusterLink exchanges workload attributes when determining policies governing communications.
+ Policies affect different communication aspects including, for example, access control and
+ load balancing. ClusterLink gateways serve as enforcement points for egress and ingress traffic.
+ The set of attributes is ill defined. We would like to define the set of attributes used
+ in exchange and policies.
+
+The set of attributes applicable to a communication flow is either determined by the control
+ plane at runtime or derived from the workload's identity document. We can associate two measures
+ with each attribute:
+
+- **trustworthiness**, relating the level of trust we can place in its derivation (e.g., permission-level,
+ complexity and skill required in affecting the attribute's value). Ideally, policies make
+ judicious use of attributes based on the level of trust and sensitivity of the communicating workloads.
+- **usefulness**, relating to the amount of unique context provided by the attribute.
+ For example, attributes that set the workload's application tier are far more useful than
+ arbitrarily-set attributes such as process id or creation timestamp.
+
+### Impact and desired outcome
+
+The current set of policy attributes is incomplete and not well defined.
+ This leaves the implementation to make decisions that are not fully transparent to users.
+ Defining the (initial) set of attributes used in policies, would allow ClusterLink users to
+ make informed and stable decisions about policy definition as suited for their use case and
+ requirements.
+
+### Prior discussion and links
+
+Not applicable.
+
+## User/User Story
+
+- **Access control based on cluster geography**: As a network administrator I would like to enable
+ Service access only from certain locations (e.g., EU only, to comply with GDPR).
+- **Load balancing based on cluster geography**: As a network administrator I would like to set
+ a policy that, when a Service is provided by several remote locations, only locations with the
+ same geography as the source should be considered.
+- **Access control based on cluster identity**: As a Service owner, I would like to allow
+ access to a specific service only from other clusters I own.
+- **Access control based on workload namespace and labels**: as a Service owner, I would
+ like to enable access to a service based on the source workload namespace and its "role" label
+ value, regardless of cluster where the workload is running (e.g., assumes clusters are used as the
+ infrastructure and teams are allocated the same namespace across all clusters). Consequently, I would like
+ to enforce an egress policy that allows only workloads from *namespaces I own* on remote clusters and
+ not from namespaces assigned to other users. The labels of workloads running in other namespaces are
+ not trusted.
+- **Access control based on workload verified/validated version**:  As a Service owner, I would like to allow access to a specific service only for validated and trusted workloads. This might be an extension or working in conjunction with any or all of the above use-cases. Only specific version of the image (validated by image tag or even SHA) is allowed to access this service.
+
+## Goals
+
+This design document should:
+
+- Define the (initial) set of attributes available for policy definition and enforcement. The set
+ may be extended in the future.
+- Define the source of each attribute (i.e., where retrieved), along with some assessment of its
+ trustworthiness and usefulness.
+- Define how attributes are encoded in policy definition and exchanged in gateway communications
+ to enable policy enforcement.
+
+## Non-Goals
+
+The following aspects are explicitly excluded and are out of scope in the current design:
+
+- Defining policy attributes and their facets in environments other than Kubernetes.
+- Define the life-cycle management of the attribute set (i.e., how attributes are added, deprecated
+ and modified in a backward compatible manner).
+- Define the process of formal and provable attestation of attributes and their values. This topic
+ is partially addressed by assigning different trustworthiness measures to different attributes.
+
+## Proposal
+
+Every connection has a source and a destination. While the source is a specific workload instance
+ (e.g., a Pod), the destination is a Service (i.e., collection of instances). Kubernetes does not
+ have an equivalent grouping concept for "clients" as it does for "servers", thus, we assign
+ and process attributes at the workload (specific client instance) and the Service (a collection
+ of potential destinations - actual instance selection is left to Kubernetes mechanism and out of
+ scope). Note that, in the future, we may leverage Kubernetes constructs such as `ReplicaSet` or
+ `Deployment` as a convenience grouping mechanism, though this will not replace the attribute
+ set defined herein.
+
+We propose to have attributes defined at different scope/layer, with each object implicitly assigned
+ attributes of containing layer:
+
+- Site level attributes (either a fixed set defined by ClusterLink or extended by user according to
+ the Fabric configuration [1]) that are pertinent to all workloads and Services in the site. Examples
+ may include `geography`, `cloud-provider`, `cloud-region`, `cluster-name`, etc.
+- Service level attributes (either a fixed set or augmented by user in the fabric configuration).
+ These may includes such attributes as `service-name`, `namespace`, `labels`, etc. Other attributes may
+ be derived from the Kubernetes Service definition, if relevant. Services are assigned the Site
+ attributes as well.
+- Workload attributes are associated with a specific workload instance, and may include, for example,
+ `service-account`, `namespace`, `image-name`, etc. Workloads are assigned the Site attributes as well.
+
+[1] Fabric level configuration could be used to define the set of attributes that can be defined per Site.
+ The concept of a fabric defines a "container" for sites that can potentially communicate with each other.
+ The fabric defines the root of trust as well as any global configuration.
+
+### General Properties of Attributes
+
+- All attributes are key-value pairs. Keys are unique within a set (i.e., can't appear more than once).
+- Attributes are scoped. Scope is set in the key prefix (e.g., "cl-site:geo", "k8s:ns:, not "geo","ns").
+ This potentially enables future extension to other environments without having to overload concepts.
+- Attributes are not typed - the value in the key-value pair is always a string. This enables the use
+ of match expressions (e.g., *is*, *is not*, *is one of*, etc.).
+- Attribute trustworthiness varies. The user / policy writer is ultimately responsible for deciding
+ what attributes are relevant in a policy.
+
+### Workload Attributes
+
+If we assume the following are true:
+
+- Replies from Kubernetes API server can be trusted;
+- authentication/authorization is correctly configured on the Kubernetes API server; and
+- users are isolated in their own namespaces
+
+then the following attributes can be used to identify a workload (K8s Pod) within a Site:
+
+- K8s namespace
+- Other metadata fields, including
+  - Pod labels
+  - Pod name
+  - Owner reference
+- Pod Spec fields, including
+  - Service Account
+  - Image name and SHA/tag (if multiple: concatenate, sort and base64 encode)
+  - Init image name and SHA/tag (if multiple: concatenate, sort and base64 encode)
+
+As users are isolated in their own namespaces, it is not possible for an attacker to provision
+ resources in arbitrary namespaces and impersonate another workload. Labels, then, can be used to
+ differentiate between the different workloads within the namespace. Assuming they are configured
+ correctly by the workload owner, this should be sufficient to uniquely specify workloads safely.
+ However, functional attributes, such as image name or its Service Account, might be handy as well.
+
+### Service Attributes
+
+Service attributes are set (or retrieved) when a Service is exported. Remote gateways become aware
+ of the Service attributes when a service is first imported. If multiple bindings exist for an Import,
+ all bound Services must have fully matching attribute set. A binding is declined when there is a
+ mismatch between a first and later binding. Ideally, the management layer will ensure all gateways
+ importing the same service, will see an identical set of attributes. This also favors that Services
+ and Service attributes are set by the user in a central place and get distributed via management layer.
+ The exact definition is out of scope of this design.
+
+### Gateway Attributes
+
+Gateways learn the attributes associated with other gateways when Peers are added.
+
+### Attribute Table
+
+| Attribute name | Scope | Source | Description | Comments |
+| ---- | ----- | ------ | ----------- | -------- |
+| `cl:fabric` |  Site/Fabric | configuration | fabric the site belongs to | Implicit via CA, might be useful in future for cross fabric communication |
+| `site:name` | Site | configuration | site name | Configured when site is created |
+| `site:location` | Site | configuration | site location | hierarchical (e.g., `aws/us-east/vpc17`) or split to flat attributes (e.g., `site:provider`, `site:region` - similar to `site:name`) |
+| `site:environment` | Site | configuration | site environment (e.g., production, staging) | mandated or recommended? |
+| `cl:site:<attr>` | Site | configuration | user defined site attributes | do we want to support these initially? |
+| `cl:service:<attr>` | Service | configuration | user defined Service attributes | do we want to support these initially? |
+| `service:name` | Service | k8s API (or Export/Import?) | Service name | is there a corresponding workload name? For workloads name are randomized, but the name of the "owner object" might be useful? Is there a service namespace?|
+| `k8s:ns` | Workload, Service | k8s API | Kubernetes namespace | |
+| `k8s:sa` | Workload | k8s API | Kubernetes service account name | |
+| `k8s:label:<name>` | Workload, Service | k8s API | Kubernetes label(s) | the use of [common k8s labels](https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels/) is recommended. Labels describing the application structure (e.g., `app`, `role`, `tier`) could be expressive and flexible |
+| `k8s:container-image` | Workload | k8s API | Image name | includes repo, name, SHA/Tag. Multiple images are concatenated. |
+| `k8s:init-container-image` | Workload | k8s API | Image name | includes repo, name, SHA/Tag. Multiple images are concatenated. |
+
+### Exchanging Attributes Between Gateways
+
+For simplicity, let's assume that all gateways keep the attributes of all other gateways.
+Moreover, all gateways keep the attributes of all imported/exported services.
+While this assumption allows optimizing bandwidth and simplifies the below description, it is not
+strictly required. For this assumption to hold, management layer will probably need a mechanism
+to allow updating the attributes of gateways/services across the fabric.
+
+**Client Side:**
+
+1. The local gateway data plane gets a request from a local workload (client) to connect to a
+ remote service. The client handle (currently its IP address) and destination service are passed
+ to the control plane.
+1. The control plane extracts workload attributes from the cluster's API server. The client's IP address
+ is used as a handle to identify the workload.
+1. The control plane merges these attributes with its own (gateway attributes) to form the set of
+ source attributes.
+1. The control plane forms a collection of destination attribute sets, one set per remote-service binding.
+ Each set of destination attributes contains both the attributes of the remote service and the attributes
+ of the remote gateway exposing this service.
+1. The control plane can now call the policy engine component with the set of source attributes
+ and with the collection of sets of destination attributes.
+1. The access-policy engine will filter down to the set of remote gateways that are allowed to provide the
+ service (if any) based on access control policies set. The load-balancing-policy engine will choose one
+ remote gateway out of this set based on the load balancing policies defined.
+1. The selected destination will be returned to the data plane (potentially along with other configuration
+ if needed), which can then initiate a connection request to the remote gateway.
+
+**Server Side:**
+
+1. The gateway on the cluster of the exported service gets a connection request from the client-side
+ gateway. The connection request includes the attributes of the requesting workload.
+1. The server-side gateway merges these attributes with the attributes of the client-side gateway
+ to form the set of source attributes (note that the source site attributes are not sent to conserve
+ resources - see note [here](#exchanging-attributes-between-gateways)).
+1. The server-side gateway then merges the attributes of the requested service with its own set of gateway
+ attributes to form the set of destination attributes.
+1. It can now call the policy engine with the two sets of attributes and get an allow/deny answer.
+
+## Impacts / Key Questions
+
+<!-- List crucial impacts and key questions, some of which may still be open. They likely
+ require discussion and are required to understand the trade-offs of the design. During
+ the lifecycle of a design proposal, discussion on design aspects can be moved into this
+ section. After reading through this section, it should be possible to understand any
+ potentially negative or controversial impact of the design. It should also be possible
+ to derive the key design questions: X vs Y. -->
+
+<!-- This will also help people understand the caveats to the proposal, other important
+ details that didn't come across above, and alternatives that could be considered. It can
+ also be a good place to talk about core concepts and how they relate. It can be helpful
+ to explicitly list the pros and cons of each decision. Later, this information can be
+ reused to update project documentation, guides, and Frequently Asked Questions (FAQs).
+-->
+
+- How safe is relying on the requesting workload's IP to obtain its attributes from the K8s API server?
+  Which attacks does this expose us to? Can workloads contain their own identity instead (possibly via a sidecar) and present these tokens/certs to the gateway in order to get access to specific resource?
+- What is the process of establishing the Gateway attributes? Are these attributes encoded in the Gateway certificate?
+
+## Future Milestones
+
+The design will enable the following which are out of scope for now:
+
+- Support for additional attribute sources in the future
+- Additional for additional attributes
+- Adding and enforcing the setting of user defined attributes for services and sites
+
+## Non Functional
+
+### Testing Plan
+
+TODO
+
+### Update/Rollback Compatibility
+
+We don't support backward compatibility. All policies and implementations must be updated to the
+ adhere to the specification defined by this design.
+
+### Scalability
+
+TODO: not applicable.
+
+### Security Considerations
+
+The introduction of ClusterLink gateways to a cluster, increases the 'surface area' exposed
+ for attack, by allowing remote access to Services.
+
+The following security considerations are impacted (though not necessarily directly by this design
+ change which is more concerned with formalizing existing implementation):
+
+- ClusterLink gateways are configured to establish mutually authenticated connections only with
+ other gateways in the same Fabric (trust domain, certificate authority). This should limit
+ some of the exposure.
+- ClusterLink requires elevated permissions to read Pod and Service specification and status
+ across multiple namespaces.
+- The "trustworthiness" of attributes is paramount for effective policy enforcement, in articular for
+ access control. The trust depends on (1) secure access to the API masters; and (2) effective
+ segregation and confinement of users to specific namespaces. Both of these assumptions are
+ reasonable and expected under normal cluster operation and management.
+- Similarly, the correctness of the policy engine impacts the operation and cross site
+ communication.
+- Users may opt-out of ClusterLink access by (1) not importing/exporting Services; (2) ensuring
+ strict, default deny, policies are defined; and (3) potentially further locking down access by
+ setting appropriate k8s NetworkPolicies on their sensitive Pods, disallowing access from the
+ clusterLink namespace.
+- The use of client IP address as the client handle used in retrieving attributes can be subject
+ to impersonation/spoofing in certain cases.
+
+### Implementation Phases/History
+
+<!-- Describe the development and implementation phases planned to break up the work and/or
+ record them here as they occur. Provide enough detail so readers may track the major
+ milestones in the lifecycle of the design proposal and correlate them with issues, PRs,
+ and releases occurring within the project. -->
+
+TODO
+
+- gateway attr (encoded in cert?)
+- workload attr, collected by control plane
+- service attr, defined by user, carried over on import