Rethinking Service Registry, much refactoring #5

Open
wants to merge 19 commits into
base: master
@ess ess commented Nov 15, 2022

While the primary focus of this work is to deploy a service
registry as several applications (rather than a single app that
gets scaled out), much work was also done towards making the
code as a whole easier to understand.

Notable Changes

  • The Config and Logger concepts are now broken out as package-
    level quasi-singletons.
  • Introduced the Implementation interface that describes the
    surface area of the ServiceBroker interface that we actually
    need to be implemented. Realistically, anything that is a
    ServiceBroker is an Implementation, but the inverse is not
    true.
  • Reified the config server broker and the service registry
    broker out into their own Implementations.
  • config.Config.Services is no longer an array of Service values.
    It is now a map[string] of Service pointers. This does not
    affect the code previously written around that field, but does
    allow us to effectively have a hard requirement that each
    service be indexed by its implementation name (configserver
    and serviceregistry, respectively). Without this, we can't
    actually generalize code pathing the way that we were trying
    to.
  • Updated all configserver and serviceregistry methods that
    tried to alter the spec they return in place. As these specs
    are struct values, in-place alteration is not actually possible.
    In lieu of temporarily creating and dereferencing pointers to
    the running spec, we instead just return a new spec with the
    details added in.
  • Created a "failsafe" default Implementation that simply returns
    the error case for any of the methods it receives.
  • implementation.Register(topic, Implementation) is used to
    configure the Implementation resolver for no-ask broker method
    dispatching via ...
  • implementation.For(topic) is used to resolve the Implementation
    for a given topic. If no registered Implementation exists for the
    topic, the failsafe Implementation is returned.
  • service-registry nodes have returned to environment variable
    configuration, as we no longer have to avoid restarting nodes.

Notable Issues

  • While this puts the blueprint in place for nodes that can talk
    to each other, there's yet another spanner in the works: the
    self-signed certs make it impossible for the nodes to talk to
    each other. Would strongly suggest changing up the routing such
    that TCP routes directly to 8080 are used rather than load
    balanced HTTPS routes.
  • In testing on codex2, we're still being hard limited to 2
    concurrent capi-level jobs, so attempting to boot a service
    registry with more than 2 nodes fails in this environment. It's
    uncertain if this behavior will persist in more realistic
    environments, but it is a fair bet unless we switch to more or
    less a fully async solution.
  • The serviceregistry implementation is not complete, insofar
    as neither serviceregistry.restartService nor
    serviceregistry.scaleUp is currently implemented.

Dennis Walters and others added 19 commits November 3, 2022 18:30
RegistryParams is no longer a raw map[string]interface{}

Instead, it is a struct with raw values that are json-capable.
This was failing silently in previous builds. In the release
prior to this change, it was failing *loudly*.

Now it's no longer failing.
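
One way a raw map can fail silently where a struct does not, as a sketch only. The `Count` field and its `count` JSON key are hypothetical; the real fields of RegistryParams are not listed here, and this is not necessarily the failure we actually hit.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RegistryParams here has a hypothetical shape, for illustration only.
type RegistryParams struct {
	Count int `json:"count"`
}

// countFromMap reads the count out of a raw map[string]interface{}.
func countFromMap(raw []byte) (int, bool) {
	var m map[string]interface{}
	if err := json.Unmarshal(raw, &m); err != nil {
		return 0, false
	}
	// encoding/json decodes JSON numbers into float64 when the target is
	// interface{}, so this int assertion quietly reports ok == false.
	n, ok := m["count"].(int)
	return n, ok
}

// countFromStruct decodes into the typed struct, so a wrong type is an error.
func countFromStruct(raw []byte) (int, error) {
	var p RegistryParams
	if err := json.Unmarshal(raw, &p); err != nil {
		return 0, err
	}
	return p.Count, nil
}

func main() {
	raw := []byte(`{"count": 3}`)
	fmt.Println(countFromMap(raw))    // 0 false
	fmt.Println(countFromStruct(raw)) // 3 <nil>

	_, err := countFromStruct([]byte(`{"count": "three"}`))
	fmt.Println(err != nil) // true
}
```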
For some reason, the somewhat aged ccv3 client that is
in our vendor bundle is unable to `UpdateApplication`
if given an application object retrieved from the API.

The error presented is "Unknown field(s): relationships"
and is being generated on the capi side of the equation.

We've worked around this by creating a temporary copy of
the app object without its relationships collection and
passing THAT temp object to the update call.

It appears that this likely also affects update_config_server
workflow, and we should double-check that.
`utilities.SafeApp(ccv3.Application) ccv3.Application`
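
A rough sketch of the shape of that helper. The `Application` type below is a local stand-in so the example is self-contained; the real helper operates on `ccv3.Application`, whose exact fields are not reproduced here.

```go
package main

import "fmt"

// Application is a simplified, hypothetical stand-in for ccv3.Application.
type Application struct {
	GUID          string
	Name          string
	Relationships map[string]string
}

// SafeApp returns a copy of app without its relationships collection, so the
// result can be passed to an update call that rejects that field.
func SafeApp(app Application) Application {
	// app was passed by value, so clearing the field here does not touch the
	// caller's copy.
	app.Relationships = nil
	return app
}

func main() {
	fetched := Application{
		GUID:          "guid-1",
		Name:          "registry",
		Relationships: map[string]string{"space": "space-guid"},
	}

	safe := SafeApp(fetched)
	fmt.Println(len(safe.Relationships), len(fetched.Relationships)) // 0 1
}
```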
This is to avoid doing partial service-registry creations/updates
in the event that we receive an invalid (<1) desired node count.
We confirmed that the config server update workflow is
affected by the same issue that utterly plagued the
registry server update workflow.

So we put in the same fix.
In an attempt to get registry server peering working correctly,
we're now presenting the internal connection info for each
registry server process instance as a peer.

In a perfect world, we'd be doing per-process-instance peer
configuration, but we've yet to find a way to do this.
This appears to be the only viable way to provide information
to the process instances that can be used to derive a working
service registry peering configuration.
Because it doesn't seem to be possible, let alone feasible,
to make ProcessInstances talk to each other.

This implements both create-service and delete-service, and it
is running stable regardless of the number of nodes I tell it to
make (within reason, of course).
@ess ess requested a review from TheDigitalEagle November 15, 2022 09:41
@ess ess self-assigned this Nov 15, 2022