diff --git a/docusaurus/docs/reference/30-configuration/_category_.yml b/docusaurus/docs/reference/30-configuration/_category_.yml
index 8407d807..09ae9f0e 100644
--- a/docusaurus/docs/reference/30-configuration/_category_.yml
+++ b/docusaurus/docs/reference/30-configuration/_category_.yml
@@ -1,5 +1,5 @@
 label: Configuration
-position: 40
+position: 15
 link:
   type: doc
   id: reference/configuration/conventions
diff --git a/docusaurus/docs/reference/30-configuration/controller.md b/docusaurus/docs/reference/30-configuration/controller.md
index 92f194ca..beb31909 100644
--- a/docusaurus/docs/reference/30-configuration/controller.md
+++ b/docusaurus/docs/reference/30-configuration/controller.md
@@ -477,7 +477,7 @@ profile:
 
 The raft section enables running multiple controllers in a cluster.
 
-- `bootstrapMembers` - (optional) Only used when bootstrapping the cluster. List of initial clusters
+- `initialMembers` - (optional) Only used when bootstrapping the cluster. List of initial cluster
   members. Should only be set on one of the controllers in the cluster.
 - `commandHandler` - (optional)
   - `maxQueueSize` - (optional, 1000) max size of the queue for processing incoming raft log
@@ -510,10 +510,12 @@ The raft section enables running multiple controllers in a cluster.
   be used to bring other nodes up to date that are only slightly behind, without having to send
   the full snapshot. This is a cluster wide value and should be consistent across nodes in the
   cluster. Otherwise the value from the most recently started controller will win.
+- `warnWhenLeaderlessFor` - (optional, 1m) - Emits a warning log message if a controller is part of
+  a cluster with no leader for a duration which exceeds this threshold.
 
 ```text
 raft:
-  bootstrapMembers:
+  initialMembers:
     - tls:127.0.0.1:6262
     - tls:127.0.0.1:6363
     - tls:127.0.0.1:6464
diff --git a/docusaurus/docs/reference/30-configuration/router.md b/docusaurus/docs/reference/30-configuration/router.md
index ffb39f35..cb2b40d2 100644
--- a/docusaurus/docs/reference/30-configuration/router.md
+++ b/docusaurus/docs/reference/30-configuration/router.md
@@ -122,6 +122,8 @@ The `ctrl` section configures how the router will connect to the controller.
   See [heartbeats](./conventions.md#heartbeats).
 - `options` - a set of option which includes the below options and those defined in
   [channel options](conventions.md#channel)
+- `endpointsFile` - (optional, 'config file dir'/endpoints) - File location to save the current
+  known set of controller endpoints when an endpoints update has been received from a controller.
 
 Example:
 
@@ -164,6 +166,9 @@ Each dialer currently supports a number of [shared options](conventions.md#xgres
 The `edge` section contains configuration that pertain to edge functionality. This section must be
 present to enable edge functionality (e.g. listening for edge SDK connections, tunnel binding modes).
 
+- `db` - (optional, `.proto.gzip`) - Configures where the router data model will be snapshotted to
+- `dbSaveIntervalSeconds` - (optional, 30s) - Configures how often the router data model will be snapshotted
+
 Example:
 
 ```text
@@ -210,7 +215,6 @@ routers at least one valid SAN must be provided.
 - `uri` - (optional) - an array of URI SAN entries
 - `email` - (optional) - an array of email SAN entries
-
 
 ### `forwarder`
 
 The `forwarder` section controls options that affect how a router forwards payloads across links to
diff --git a/docusaurus/docs/reference/_category_.yml b/docusaurus/docs/reference/_category_.yml
index 5904cd26..9e7a022f 100644
--- a/docusaurus/docs/reference/_category_.yml
+++ b/docusaurus/docs/reference/_category_.yml
@@ -1,2 +1,2 @@
 label: Reference
-position: 40
+position: 10
diff --git a/docusaurus/docs/reference/config-types/index.md b/docusaurus/docs/reference/config-types/index.md
index 70767b76..1b9c2bea 100644
--- a/docusaurus/docs/reference/config-types/index.md
+++ b/docusaurus/docs/reference/config-types/index.md
@@ -1,6 +1,6 @@
 ---
 title: Builtin Config Types
-sidebar_position: 10
+sidebar_position: 20
 ---
 
 ## Overview
diff --git a/docusaurus/docs/reference/ha/_category_.yml b/docusaurus/docs/reference/ha/_category_.yml
new file mode 100644
index 00000000..01a83d0e
--- /dev/null
+++ b/docusaurus/docs/reference/ha/_category_.yml
@@ -0,0 +1,5 @@
+label: Controller HA
+position: 22
+link:
+  type: doc
+  id: reference/ha/overview
diff --git a/docusaurus/docs/reference/ha/bootstrapping.md b/docusaurus/docs/reference/ha/bootstrapping.md
new file mode 100644
index 00000000..4c163e68
--- /dev/null
+++ b/docusaurus/docs/reference/ha/bootstrapping.md
@@ -0,0 +1,164 @@
+---
+sidebar_label: Bootstrapping
+sidebar_position: 10
+---
+
+# Bootstrapping A Cluster
+
+To bring up a controller cluster, start with a single node.
+
+## Controller Configuration
+
+### Certificates
+
+Each controller requires appropriate certificates. The certificates for clustered controllers
+have more requirements than those for a standalone server. See the [Certificates Reference](./certificates.md)
+for more information.
+
+### Config File
+
+The controller requires a `raft` section.
+
+```yaml
+raft:
+  dataDir: /path/to/data/dir
+```
+
+The `dataDir` will be used to store the following:
+
+* `ctrl-ha.db` - the OpenZiti data model bbolt database
+* `raft.db` - the raft bbolt database
+* `snapshots/` - a directory to store raft snapshots
+
+Controllers use the control channel listener to communicate with each other. Unlike
+routers, they need to know how to reach each other, so an advertise address must
+be configured.
+
+```yaml
+ctrl:
+  listener: tls:0.0.0.0:6262
+  options:
+    advertiseAddress: tls:192.168.3.100:6262
+```
+
+Finally, for sessions to work across controllers, JWTs are used. To enable these,
+an OIDC endpoint should be configured.
+
+```yaml
+web:
+  - name: all-apis-localhost
+    bindPoints:
+      - interface: 127.0.0.1:1280
+        address: 127.0.0.1:1280
+    options:
+      minTLSVersion: TLS1.2
+      maxTLSVersion: TLS1.3
+    apis:
+      - binding: health-checks
+      - binding: fabric
+      - binding: edge-management
+      - binding: edge-client
+      - binding: edge-oidc
+```
+
+## Initializing the Controller
+
+Once properly configured, the controller can be started.
+
+```shell
+ziti controller run ctrl1.yml
+```
+
+Once the controller is up and running, it will see that it is not yet initialized, and will pause
+startup, waiting for initialization. While waiting, it will periodically emit a message:
+
+```
+[ 3.323] WARNING ziti/controller/server.(*Controller).checkEdgeInitialized: the
+Ziti Edge has not been initialized, no default admin exists.
+Add this node to a cluster using 'ziti agent cluster add tls:localhost:6262' against an existing
+cluster member, or if this is the bootstrap node, run 'ziti agent controller init'
+to configure the default admin and bootstrap the cluster
+```
+
+As this is the first node in the cluster, we can't add any nodes to it yet. Instead, run:
+
+```
+ziti agent controller init
+```
+
+This initializes an admin user that can be used to manage the network.
+
+## Managing the Cluster
+
+There are four commands which can be used to manage the cluster.
+
+```bash
+# Adding Members
+ziti agent cluster add
+
+# Listing Members
+ziti agent cluster list
+
+# Removing Members
+ziti agent cluster remove
+
+# Transfer Leadership
+ziti agent cluster transfer-leadership [new leader id]
+```
+
+These are also available via the REST API, and can be invoked through the CLI.
+
+```bash
+$ ziti ops cluster --help
+Controller cluster operations
+
+Usage:
+  ziti ops cluster [flags]
+  ziti ops cluster [command]
+
+Available Commands:
+  add-member          add cluster member
+  list-members        list cluster members and their status
+  remove-member       remove cluster member
+  transfer-leadership transfer cluster leadership to another member
+
+Flags:
+  -h, --help   help for cluster
+
+Use "ziti ops cluster [command] --help" for more information about a command.
+```
+
+## Growing the Cluster
+
+Once a single node is up and running, additional nodes can be added to it. They should be
+configured the same as the initial node, though they will have different addresses.
+
+The first node, as configured above, is running at `192.168.3.100:6262`.
+
+If the second node is running at `192.168.3.101:6262`, then it can be added to the
+cluster in one of two ways.
+
+### From An Existing Node
+
+From a node already in the cluster, in this case our initial node, we can add the
+new node as follows:
+
+```bash
+user@node1$ ziti agent cluster add tls:192.168.3.101
+```
+
+### From A New Node
+
+We can also ask the new node, which is not yet part of the cluster, to reach
+out to an existing cluster node and request to join.
+
+```
+user@node2$ ziti agent cluster add tls:192.168.3.100
+```
+
+## Shrinking the Cluster
+
+From any node in the cluster, nodes can be removed as follows:
+
+```
+user@node1$ ziti agent cluster remove tls:192.168.3.101
+```
diff --git a/docusaurus/docs/reference/ha/certificates.md b/docusaurus/docs/reference/ha/certificates.md
new file mode 100644
index 00000000..96758531
--- /dev/null
+++ b/docusaurus/docs/reference/ha/certificates.md
@@ -0,0 +1,86 @@
+---
+sidebar_label: Certificates
+sidebar_position: 20
+---
+
+# Controller Certificates
+
+For controllers to communicate and trust one another, they need certificates that have
+been generated with the correct attributes and relationships.
+
+## Requirements
+
+1. The certificates must have a shared root of trust
+2. The controller client and server certificates must contain a
+   [SPIFFE ID](https://spiffe.io/docs/latest/spiffe-about/spiffe-concepts/#spiffe-id)
+
+## Steps to Certificate Creation
+There are many ways to set up certificates, so this will just cover a recommended configuration.
+
+The primary thing to ensure is that controllers have a shared root of trust.
+A standard way of generating certs would be as follows:
+
+1. Create a self-signed root CA
+1. Create an intermediate signing cert for each controller
+1. Create a server cert using the signing cert for each controller
+1. Create a client cert using the signing cert for each controller
+1. Make sure that the CA bundle for each server includes both the root CA and the intermediate CA
+   for that server
+
+Note that controller server certs must contain a SPIFFE ID of the form
+
+```
+spiffe://<trust domain>/controller/<controller id>
+```
+
+So if your trust domain is `example.com` and your controller id is `ctrl1`, then your SPIFFE ID
+would be:
+
+```
+spiffe://example.com/controller/ctrl1
+```
+
+**SPIFFE ID Notes:**
+
+* This ID must be set as the only URI in the `X509v3 Subject Alternative Name` field in the
+  certificate.
+* These IDs are used to allow the controllers to identify each other during the mTLS negotiation.
+* The OpenZiti CLI supports creating SPIFFE IDs in your certs
+  * Use the `--trust-domain` flag when creating CAs
+  * Use the `--spiffe-id` flag when creating server or client certificates
+
+## Example
+
+Using the OpenZiti PKI tool, certificates for a three node cluster could be created as follows:
+
+```bash
+# Create the trust root, a self-signed CA
+ziti pki create ca --trust-domain ha.test --pki-root ./pki --ca-file ca --ca-name 'HA Example Trust Root'
+
+# Create the controller 1 intermediate/signing cert
+ziti pki create intermediate --pki-root ./pki --ca-name ca --intermediate-file ctrl1 --intermediate-name 'Controller One Signing Cert'
+
+# Create the controller 1 server cert
+ziti pki create server --pki-root ./pki --ca-name ctrl1 --dns localhost --ip 192.168.3.100 --server-name ctrl1 --spiffe-id 'controller/ctrl1'
+
+# Create the controller 1 client cert
+ziti pki create client --pki-root ./pki --ca-name ctrl1 --client-name ctrl1 --spiffe-id 'controller/ctrl1'
+
+# Create the controller 2 intermediate/signing cert
+ziti pki create intermediate --pki-root ./pki --ca-name ca --intermediate-file ctrl2 --intermediate-name 'Controller Two Signing Cert'
+
+# Create the controller 2 server cert
+ziti pki create server --pki-root ./pki --ca-name ctrl2 --dns localhost --ip 192.168.3.101 --server-name ctrl2 --spiffe-id 'controller/ctrl2'
+
+# Create the controller 2 client cert
+ziti pki create client --pki-root ./pki --ca-name ctrl2 --client-name ctrl2 --spiffe-id 'controller/ctrl2'
+
+# Create the controller 3 intermediate/signing cert
+ziti pki create intermediate --pki-root ./pki --ca-name ca --intermediate-file ctrl3 --intermediate-name 'Controller Three Signing Cert'
+
+# Create the controller 3 server cert
+ziti pki create server --pki-root ./pki --ca-name ctrl3 --dns localhost --ip 192.168.3.102 --server-name ctrl3 --spiffe-id 'controller/ctrl3'
+
+# Create the controller 3 client cert
+ziti pki create client --pki-root ./pki --ca-name ctrl3 --client-name ctrl3 --spiffe-id 'controller/ctrl3'
+```
diff --git a/docusaurus/docs/reference/ha/data-model.md b/docusaurus/docs/reference/ha/data-model.md
new file mode 100644
index 00000000..fa961c60
--- /dev/null
+++ b/docusaurus/docs/reference/ha/data-model.md
@@ -0,0 +1,135 @@
+---
+sidebar_label: Data Model
+sidebar_position: 80
+---
+
+# Controller HA Data Model
+
+:::info
+
+This document is likely most interesting for developers working on OpenZiti,
+those curious about how distributed systems work in general, or those curious
+about how data is distributed in OpenZiti.
+
+:::
+
+## Model Data
+
+### Model Data Characteristics
+
+* All data required on every controller
+* Read characteristics
+  * Reads happen all the time, from every client as well as admins
+  * Speed is very important. Reads affect how every client perceives the system.
+  * Availability is very important.
+    Without the ability to read definitions, clients can't create new connections.
+  * Can be against stale data, if we get consistency within a reasonable timeframe (seconds to
+    minutes)
+* Write characteristics
+  * Writes only happen from administrators
+  * Speed needs to be reasonable, but doesn't need to be blazing fast
+  * Write availability can be interrupted, since it primarily affects management operations
+  * Must be consistent. Write validation can't happen with stale data. Don't want to have to deal
+    with reconciling concurrent, contradictory write operations.
+* Generally involves controller to controller coordination
+
+Of the distribution mechanisms we looked at, RAFT was the best fit.
+
+### Raft Resources
+
+For a more in-depth look at Raft, see
+
+* https://raft.github.io/
+* http://thesecretlivesofdata.com/raft/
+
+### RAFT Characteristics
+
+* Writes
+  * Consistency over availability
+  * Good but not stellar performance
+* Reads
+  * Every node has full state
+  * Local state is always internally consistent, but maybe slightly behind the leader
+  * No coordination required for reads
+  * Fast reads
+  * Reads work even when other nodes are unavailable
+  * If latest data is desired, reads can be forwarded to the current leader
+
+So the OpenZiti controller uses RAFT to distribute the data model. Specifically, it uses the
+[HashiCorp Raft Library](https://github.com/hashicorp/raft/).
+
+### Updates
+
+The basic flow for model updates is as follows:
+
+1. A client requests a model update via the REST API.
+2. The controller checks if it is the raft cluster leader. If it is not, it forwards the request to
+   the leader.
+3. Once the request is on the leader, it applies the model update to the raft log. This involves
+   getting a quorum of the controllers to accept the update.
+4. Once the update has been accepted, it will be executed on each node of the cluster. This will
+   generate one or more changes to the bolt database.
+5. The results of the operation (success or failure) are returned to the controller which received
+   the original REST request.
+6. The controller waits until the operation has been applied locally.
+7. The result is returned to the REST client.
+
+### Reads
+
+Reads are always done against the local bolt database for performance. The assumption is that if
+something like a policy change is delayed, it may temporarily allow a circuit to be created, but as
+soon as the policy update is applied, it will make changes to circuits as necessary.
+
+## Runtime Data
+
+In addition to model data, the controller also manages some amount of runtime data. This data is for
+running OpenZiti's core functions, i.e. managing the flow of data across the mesh, along with
+related authentication data. So this includes things like:
+
+* Links
+* Circuits
+* API Sessions
+* Sessions
+* Posture Data
+
+### Runtime Data Characteristics
+
+Runtime data has different characteristics than the model data does.
+
+* Not necessarily shared across controllers
+* Reads **and** writes must be very fast
+* Generally involves SDK to controller or controller to router coordination
+
+Because writes must also be fast, RAFT is not a good candidate for storing this data. Good
+performance is critical for these components, so they are each evaluated individually.
+
+### Links
+
+Each controller currently needs to know about links so that it can make routing decisions. However,
+links exist on routers. So, routers are the source of record for links.
+When a router connects to a controller, the router will tell the controller about any links that it
+already has. The controller will ask the router to fill in any missing links, and the router will
+ensure that it doesn't create duplicate links if multiple controllers request the same link be
+created. If there are duplicates, the router will inform the controller of the existing link.
+
+This allows the routers to properly handle link dials from multiple routers and keep controllers up
+to date with the current known links.
+
+### Circuits
+
+Circuits were and continue to be stored in memory for both standalone and HA mode controllers.
+Circuits are not distributed. Rather, each controller remains responsible for any circuits that it
+created.
+
+When a router needs to initiate circuit creation, it will pick the controller with the lowest
+response time and send a circuit creation request to that controller. The controller will establish
+a route. Route tables as well as the xgress endpoints now track which controller is responsible for
+the associated circuit. This way, when failures or other notifications need to be sent, the router
+knows which controller to talk to.
+
+This gets routing working with multiple controllers without a major refactor. Future work will
+likely delegate more routing control to the routers, so routing should get more robust and
+distributed over time.
+
+### API Sessions, Sessions, Posture Data
+
+API Sessions and Sessions are moving to bearer tokens. Posture Data is now handled in the routers.
diff --git a/docusaurus/docs/reference/ha/migrating.md b/docusaurus/docs/reference/ha/migrating.md
new file mode 100644
index 00000000..c4190b27
--- /dev/null
+++ b/docusaurus/docs/reference/ha/migrating.md
@@ -0,0 +1,45 @@
+---
+sidebar_label: Migrating
+sidebar_position: 30
+---
+
+# Migrating Controllers
+
+A controller can be moved from standalone mode to HA mode. It can also be returned
+from HA mode back to standalone mode.
+
+## Standalone to HA
+
+### Requirements
+First, ensure that the controller's certificates and configuration meet the requirements
+in [Bootstrapping](./bootstrapping.md).
+
+### Data Model Migration
+The controller's data can be imported in one of two ways:
+
+**Using Config**
+
+Leave the `db:` setting in the controller config. When the controller
+starts up, it will see that it's running in HA mode, but isn't initialized yet. It will
+try to use the database in the configuration to initialize its data model.
+
+**Using the Agent**
+
+The agent can also be used to provide the controller database to the controller.
+
+```
+ziti agent controller init-from-db path/to/source.db
+```
+
+Once the controller is initialized, it should start up as normal and be usable.
+The cluster can now be expanded as explained in [Bootstrapping](./bootstrapping.md).
+
+## HA to Standalone
+
+This assumes that you have a database snapshot from an HA cluster. This could either
+be the ctrl-ha.db from the `dataDir`, or a snapshot created using the snapshot
+CLI command.
+
+To revert to standalone mode, remove the `raft` section from the config file and add the
+`db:` setting back, pointing at the snapshot from the HA cluster. When started, the
+controller should come up in standalone mode. A sketch of a reverted configuration is
+shown below.
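+The exact contents will vary with your deployment; the snapshot path and listener address below
+are illustrative placeholders, not required values. The key change is that the `raft` section is
+removed and a top-level `db:` setting points at the snapshot from the HA cluster.
+
+```yaml
+# The raft section has been removed. The db setting below replaces it.
+db: /path/to/ha-snapshot.db   # illustrative path to the snapshot taken from the HA cluster
+
+ctrl:
+  listener: tls:0.0.0.0:6262
+```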
diff --git a/docusaurus/docs/reference/ha/operations.md b/docusaurus/docs/reference/ha/operations.md
new file mode 100644
index 00000000..86b5c6e8
--- /dev/null
+++ b/docusaurus/docs/reference/ha/operations.md
@@ -0,0 +1,96 @@
+---
+sidebar_label: Operations
+sidebar_position: 50
+---
+
+# Operating a Controller Cluster
+
+## Restoring from Backup
+
+To restore from a database snapshot, use the following CLI command:
+
+```
+ziti agent controller restore-from-db /path/to/backup.db
+```
+
+As this is an agent command, it must be run on the same machine as the controller. The path
+provided will be read by the controller process, not the CLI.
+
+The controller will apply the snapshot and then terminate. All controllers in the cluster will
+terminate and expect to be restarted. This is so that in-memory caches won't be out of sync with
+the database, which has changed.
+
+## Snapshot Application and Restarts
+
+If a controller is out of communication for a while, it may receive a snapshot to apply, rather
+than a stream of events.
+
+If a controller receives a snapshot to apply after starting up, it will apply the snapshot and then
+terminate. This assumes that there is a restart script which will bring the controller back up after
+it terminates.
+
+This should only happen if a controller is connected to the cluster and then gets disconnected for
+long enough that a snapshot is created while it's disconnected. Because applying a snapshot requires
+replacing the underlying controller bolt DB, the easiest way to do that is to restart. That way we
+don't have to worry about replacing the bolt DB underneath a running system.
+
+## Events
+
+All events now contain an `event_src_id` to indicate which controller emitted them.
+
+There are some new events which are specific to clusters. See [Cluster Events](../events#cluster)
+for more detail.
+
+## Metrics
+
+In an HA system, routers will send metrics to all controllers to which they are connected. There is
+a new `doNotPropagate` flag in the metrics message, which will be set to false until the router has
+successfully delivered the metrics message to a controller. The flag will then be set to true. So
+the first controller to get the metrics message is expected to deliver the metrics message to the
+events system for external integrators. The other controllers will have `doNotPropagate` set to true,
+and will only use the metrics message internally, to update routing data.
+
+## Open Ports
+
+Controllers now establish connections with each other, for two purposes.
+
+1. Forwarding model updates to the leader, so they can be applied to the raft cluster
+2. Raft communication
+
+Both kinds of traffic flow over the same connection.
+
+These connections do not require any extra open ports as we are using the control channel listener
+to listen to both router and controller connections. As part of the connection process, the
+connection type is provided and the appropriate authentication and connection setup happens based on
+the connection type. If no connection type is provided, it's assumed to be a router.
+
+## System of Record
+
+In a controller that's not configured for HA, the bolt database is the system of record. In an HA
+setup, the raft journal is the system of record. The raft journal is stored in two places: a
+snapshot directory and a bolt database of raft journal entries.
+
+So a non-HA setup will have:
+
+* ctrl.db
+
+An HA setup will have:
+
+* raft.db - the bolt database containing raft journal entries
+* snapshots/ - a directory containing raft snapshots. Each snapshot is a snapshot of the controller
+  bolt db
+* ctrl.db - the controller bolt db, with the current state of the model
+
+The location of all three is controlled by the raft/dataDir config property.
+
+```yaml
+raft:
+  dataDir: /var/ziti/data/
+```
+
+When an HA controller starts up, it will first apply the newest snapshot, then any newer journal
+entries that aren't yet contained in a snapshot. This means that an HA controller should start with
+a blank DB that can be overwritten by snapshot and/or have journal entries applied to it. So an HA
+controller will delete or rename the existing controller database and start with a fresh bolt db.
diff --git a/docusaurus/docs/reference/ha/overview.md b/docusaurus/docs/reference/ha/overview.md
new file mode 100644
index 00000000..c386d098
--- /dev/null
+++ b/docusaurus/docs/reference/ha/overview.md
@@ -0,0 +1,55 @@
+---
+sidebar_label: Overview
+sidebar_position: 05
+---
+
+# Controller HA
+
+## Overview
+
+OpenZiti controllers can be run in a cluster for high availability and performance scaling.
+
+:::warning
+
+**NOTE: Controller HA is still in Beta**
+
+It's quite functional now, but we are continuing to test and refine before we mark it GA.
+:::
+
+### For SDK Clients/Tunnelers
+
+A controller cluster offers the following advantages:
+
+1. Horizontal scaling of SDK client services such as
+   1. Service lookups
+   1. Session creation
+1. Horizontal scaling of circuit creation
+
+This means that for everything that SDK clients and tunnelers depend on, controllers
+can be scaled up and placed strategically to meet user demand.
+
+The following limitations currently apply:
+
+1. Circuits are owned by a controller. If the controller goes down, the circuit
+   will remain up, but can't be re-routed, whether to improve performance or to
+   route around a failed router.
+2. For a controller to route circuits on a router, that router must be connected
+   to that controller. This means that routers should generally be connected to
+   all controllers.
+
+### For Management Operations
+
+The HA controller cluster uses a distributed journal to keep the data model synchronized across
+controllers. This has the following ramifications:
+
+1. Read operations will work on any controller that is up. If the controller is
+   disconnected from the cluster, the reads may return data that is out of date.
+2. Update operations require that the cluster has a leader and that a quorum of nodes
+   is available. A quorum for a cluster of size N is (N/2)+1. This means that a 3 node
+   cluster can operate with 2 nodes and a 5 node cluster can operate with 3 nodes, and
+   so on.
+3. Updates can be initiated on any controller; they will be forwarded to the leader to
+   be applied.
+4. The cluster may have non-voting members.
+
+See [topology](./topology.md) and [the data model](./data-model.md) for more information.
diff --git a/docusaurus/docs/reference/ha/routers.md b/docusaurus/docs/reference/ha/routers.md
new file mode 100644
index 00000000..6127509f
--- /dev/null
+++ b/docusaurus/docs/reference/ha/routers.md
@@ -0,0 +1,62 @@
+---
+sidebar_label: Routers
+sidebar_position: 40
+---
+
+# Routers in Controller HA
+
+There are only a few differences in how routers work in an HA cluster.
+
+## Configuration
+
+Instead of specifying a single controller, you can specify multiple controllers
+in the router configuration.
+
+```yaml
+ctrl:
+  endpoints:
+    - tls:192.168.3.100:6262
+    - tls:192.168.3.101:6262
+    - tls:192.168.3.102:6262
+```
+
+If the controller cluster changes, the controllers will notify routers of the updated
+controller endpoints.
+
+By default, these will be stored in a file named `endpoints` in the same directory
+as the router config file.
+
+However, the file location can be customized using a config file setting.
+
+```yaml
+ctrl:
+  endpoints:
+    - tls:192.168.3.100:6262
+  endpointsFile: /var/run/ziti/endpoints.yaml
+```
+
+In general, a router should only need one or two controllers to bootstrap itself,
+and thereafter should be able to keep the endpoints list up to date with help
+from the controller.
+
+## Router Data Model
+
+In order to enable HA functionality, the router now receives a stripped down
+version of the controller data model. While required for controller HA, this
+also enables other optimizations, so use of the router data model is also enabled
+by default when running in standalone mode.
+
+The router data model can be disabled on the controller using a config setting,
+but since it is required for HA, that flag will be ignored if the controllers
+are running in a cluster.
+
+The data model on the router is periodically snapshotted, so it doesn't need to
+be fully restored from a controller on every restart.
+
+The location and frequency of snapshotting can be [configured](../configuration/router#edge).
+
+## Controller Selection
+
+When creating circuits, routers will choose the most responsive controller, based on latency.
+When doing model updates, such as managing terminators, they will try to talk directly to
+the current cluster leader, since updates have to go through the leader in any case.
diff --git a/docusaurus/docs/reference/ha/topology.md b/docusaurus/docs/reference/ha/topology.md
new file mode 100644
index 00000000..57934a08
--- /dev/null
+++ b/docusaurus/docs/reference/ha/topology.md
@@ -0,0 +1,81 @@
+---
+sidebar_label: Topology
+sidebar_position: 60
+---
+
+# Controller Topology
+
+This document discusses considerations for how many controllers a network might
+need and how to place them geographically.
+
+## Number of Controllers
+
+### Management
+
+The first consideration is how many controllers the network should be able to lose without losing
+functionality. A cluster of size N needs (N/2) + 1 controllers active and connected to be able
+to take model updates, such as provisioning identities, adding/changing services and updating policies.
+
+Since a two node cluster will lose some functionality if either node becomes unavailable, a minimum
+of 3 nodes is recommended.
+
+### Clients
+
+The functionality that controllers provide to clients doesn't require any specific number of controllers.
+A network manager will want to scale the number of controllers based on client demand and may want to
+place additional controllers geographically close to clusters of clients for better performance.
+
+## Voting vs Non-Voting Members
+
+Because every model update must be approved by a quorum of voting members, adding a large number of voting
+members can add a lot of latency to model changes.
+
+If more controllers are desired to scale out to meet client needs, only as many controllers as are needed
+to meet availability requirements for management needs should be made into voting members.
+
+Additionally, having a quorum of controllers be geographically close will reduce latency without necessarily
+reducing availability.
+
+### Example
+
+**Requirements**
+
+1. The network should be able to withstand the loss of 1 voting member
+1. Controllers should exist in the US, EU and Asia, with 2 in each region.
+
+To be able to lose one voting member, we need 3 voting nodes, with 6 nodes total.
+
+We should place 2 voting members in the same region, but in different availability zones/data centers.
+The third voting member should be in a different region. The rest of the controllers should be non-voting.
+
+**Proposed Layout**
+
+So, using AWS regions, we might have:
+
+* 1 in us-east-1 (voting)
+* 1 in us-west-2 (voting)
+* 1 in eu-west-3 (voting)
+* 1 in eu-south-1 (non-voting)
+* 1 in ap-southeast-4 (non-voting)
+* 1 in ap-south-2 (non-voting)
+
+Assuming the leader is one of us-east-1 or us-west-2, model updates will only need to be acknowledged by
+one relatively close node before being accepted. All other controllers will receive the updates as well,
+but updates won't be gated on communications with all of them.
+
+**Alternate**
+
+For even faster updates at the cost of an extra controller, two controllers could be in us-east, one in us-east-1
+and one in us-east-2. The third voting member could be in the EU. Updates would now only need to be approved by two
+very close controllers. If one of them went down, updates would slow down, since updates would need to be done
+over longer latencies, but they would still work.
+
+* 1 in us-east-1 (voting)
+* 1 in us-east-2 (voting)
+* 1 in us-west-2 (non-voting)
+* 1 in eu-west-3 (voting)
+* 1 in eu-south-1 (non-voting)
+* 1 in ap-southeast-4 (non-voting)
+* 1 in ap-south-2 (non-voting)
diff --git a/docusaurus/docs/reference/tunnelers/_category_.yml b/docusaurus/docs/reference/tunnelers/_category_.yml
index ca40b911..54a44c9e 100644
--- a/docusaurus/docs/reference/tunnelers/_category_.yml
+++ b/docusaurus/docs/reference/tunnelers/_category_.yml
@@ -1,5 +1,5 @@
 label: Tunnelers
-position: 10
+position: 25
 link:
   type: doc
   id: reference/tunnelers/index