This repository has been archived by the owner on Oct 17, 2024. It is now read-only.

Scaling the caching system #112

Open
asdine opened this issue May 10, 2019 · 7 comments

@asdine
Contributor

asdine commented May 10, 2019

Regula is designed to be used by any kind of program: mobile apps, microservices, monoliths, frontend applications, etc.
Programs can use Regula in two ways:

Using the Eval API

The client makes a synchronous HTTP call to the Regula server to evaluate a ruleset. It can pass parameters and specify which version of the ruleset to use. Basically, this can be summed up as the following curl command:

curl "http://localhost:5331/rulesets/some/path?eval&param1=a&param2=b"
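For a program that is not shelling out to curl, the same request URL can be assembled in code. This is only a minimal sketch: `buildEvalURL` is a hypothetical helper, not part of the Regula client, and it only mirrors the `eval` flag and the parameters shown in the curl command. The ruleset path is assumed to be URL-safe already.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildEvalURL assembles an Eval API URL shaped like the curl command above.
// The helper name and signature are hypothetical, not Regula's API.
func buildEvalURL(base, path string, params map[string]string) string {
	q := url.Values{}
	for k, v := range params {
		q.Set(k, v)
	}
	u := base + "/rulesets/" + path + "?eval"
	// Values.Encode sorts keys, so the result is deterministic.
	if enc := q.Encode(); enc != "" {
		u += "&" + enc
	}
	return u
}

func main() {
	fmt.Println(buildEvalURL("http://localhost:5331", "some/path",
		map[string]string{"param1": "a", "param2": "b"}))
}
```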

By embedding Regula

The program imports Regula as a library, loads a bunch of ruleset definitions from the server, caches that in memory and evaluates everything locally.

The issue with cache prefilling

The second option is great for avoiding network round trips, but the "loads a bunch of ruleset definitions from the server" part is problematic.

Currently, the client can load these rulesets by selecting a prefix:

Every ruleset whose path begins with this prefix will get downloaded, including all the different versions of these rulesets.

Of course, an empty prefix is valid, meaning that it is possible to tell the client to download every possible ruleset, including all previous versions.

This obviously doesn't scale at all.

This could be mitigated by selecting a longer prefix to narrow the results, but any node of a tree is a tree itself, so it would only help provided that we are sure that node doesn't grow big.
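To illustrate why prefix selection grows with the tree, here is a small sketch of prefix matching over a flat list of paths. The paths and the `selectByPrefix` helper are made up for the example; the real server-side selection may work differently.

```go
package main

import (
	"fmt"
	"strings"
)

// selectByPrefix returns every path that starts with prefix.
// An empty prefix matches everything, which is the problematic case.
func selectByPrefix(paths []string, prefix string) []string {
	var out []string
	for _, p := range paths {
		if strings.HasPrefix(p, prefix) {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	// Hypothetical ruleset paths.
	paths := []string{"billing/fees", "billing/taxes", "rides/pricing"}
	fmt.Println(selectByPrefix(paths, "billing/"))
	fmt.Println(selectByPrefix(paths, ""))
}
```

Every ruleset later added under `billing/` silently joins the first result, and the second call always returns the whole tree.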

Let's review our requirements

Regula was designed with a set of requirements in mind, and I believe that insisting on complying with all of them will lead to a bad design. Here is the list:

  • Only one network round trip (to fill the local cache)
  • Have access to any ruleset regardless of its path in the tree without any additional round trip
  • Have access to any previous version of any ruleset regardless of its path in the tree without any additional round trip
  • Be notified when any new version is created so the program can update its cache

While brainstorming with @tealeg, we realized that we had focused too much on making sure every ruleset is available locally just in case you might need it, whereas in practice people know which rulesets they need within their program.

Here is a revised list of requirements that in our opinion scales much better, while providing a great experience:

  • Only one network round trip (to fill the local cache) for selected rulesets
  • Have access to any ruleset regardless of its path in the tree without any additional round trip
  • Have access to any previous version of any ruleset regardless of its path in the tree without any additional round trip
  • Be notified when any new version is created so the program can update its cache

With these new requirements, we can provide the following solution.

The solution

In order to provide a solution that works in multiple situations, I will first describe the default behavior then talk about customizations that will lead to satisfying the requirements listed above.

Default client

The Regula client, with its default configuration, will act as a simple HTTP client library and will run an HTTP request every time the program wants to evaluate a ruleset.
The client will allow evaluating any version of any ruleset.

Caching option

With the caching option enabled, any ruleset evaluated for the first time will be:

  • downloaded instead of being evaluated remotely
  • cached
  • evaluated locally

Subsequent calls to the same ruleset will use the cached version.
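The download-on-first-use behaviour could be sketched roughly as follows. All names here (`Ruleset`, `FetchFunc`, `CachingClient`) are hypothetical stand-ins, not the actual client types, and the sketch is not safe for concurrent use.

```go
package main

import "fmt"

// Ruleset is a stand-in for a downloaded ruleset definition.
type Ruleset struct {
	Path    string
	Version string
}

// FetchFunc downloads one ruleset from the server (hypothetical).
type FetchFunc func(path string) (Ruleset, error)

// CachingClient downloads a ruleset on first use and reuses it afterwards.
type CachingClient struct {
	fetch FetchFunc
	cache map[string]Ruleset
}

func NewCachingClient(fetch FetchFunc) *CachingClient {
	return &CachingClient{fetch: fetch, cache: make(map[string]Ruleset)}
}

// Get returns the cached ruleset, fetching it only on a cache miss.
// Local evaluation would then run against the returned definition.
func (c *CachingClient) Get(path string) (Ruleset, error) {
	if rs, ok := c.cache[path]; ok {
		return rs, nil
	}
	rs, err := c.fetch(path)
	if err != nil {
		return Ruleset{}, err
	}
	c.cache[path] = rs
	return rs, nil
}

func main() {
	calls := 0
	c := NewCachingClient(func(path string) (Ruleset, error) {
		calls++ // counts simulated network round trips
		return Ruleset{Path: path, Version: "v1"}, nil
	})
	c.Get("some/path")
	c.Get("some/path")
	fmt.Println("round trips:", calls)
}
```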

Caching option + Watch option

Same as the previous one, but any cached version will get updated if a new version is created on the server.
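One possible shape for this, assuming the server can push new versions to the client as a stream of events (the transport is left out and every name below is hypothetical): only rulesets that are already cached get refreshed.

```go
package main

import (
	"fmt"
	"sync"
)

// Ruleset is a stand-in for a ruleset definition at a given version.
type Ruleset struct {
	Path    string
	Version string
}

// Cache is a goroutine-safe map of path to latest known ruleset.
type Cache struct {
	mu sync.RWMutex
	m  map[string]Ruleset
}

func NewCache() *Cache { return &Cache{m: make(map[string]Ruleset)} }

func (c *Cache) Put(rs Ruleset) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.m[rs.Path] = rs
}

func (c *Cache) Get(path string) (Ruleset, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	rs, ok := c.m[path]
	return rs, ok
}

// Watch applies incoming versions until events is closed.
// Paths that were never cached are ignored.
func (c *Cache) Watch(events <-chan Ruleset) {
	for rs := range events {
		c.mu.Lock()
		if _, ok := c.m[rs.Path]; ok {
			c.m[rs.Path] = rs
		}
		c.mu.Unlock()
	}
}

func main() {
	c := NewCache()
	c.Put(Ruleset{Path: "some/path", Version: "v1"})

	events := make(chan Ruleset)
	done := make(chan struct{})
	go func() { c.Watch(events); close(done) }()

	events <- Ruleset{Path: "some/path", Version: "v2"}  // cached: updated
	events <- Ruleset{Path: "other/path", Version: "v1"} // not cached: ignored
	close(events)
	<-done

	rs, _ := c.Get("some/path")
	fmt.Println(rs.Version)
}
```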

Prefill mechanism

The program can ask the client to prefill its cache with a defined set of rulesets.

  • no prefix, complete paths must be provided
  • only one round trip would be performed to fetch all of the selected rulesets using a Batch API
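As a rough sketch of the prefill idea, assuming a Batch API that takes a list of complete paths and returns all of them in one response (`BatchFunc` and `Prefill` are hypothetical names, and the paths are invented):

```go
package main

import (
	"errors"
	"fmt"
)

// Ruleset is a stand-in for a downloaded ruleset definition.
type Ruleset struct {
	Path    string
	Version string
}

// BatchFunc fetches several rulesets in a single request (hypothetical Batch API).
type BatchFunc func(paths []string) ([]Ruleset, error)

// Prefill fills cache with the given complete paths using one batch call.
func Prefill(cache map[string]Ruleset, batch BatchFunc, paths ...string) error {
	for _, p := range paths {
		if p == "" {
			// An empty path would behave like the empty prefix; reject it.
			return errors.New("prefill: empty path, complete paths are required")
		}
	}
	rulesets, err := batch(paths)
	if err != nil {
		return err
	}
	for _, rs := range rulesets {
		cache[rs.Path] = rs
	}
	return nil
}

func main() {
	calls := 0
	batch := func(paths []string) ([]Ruleset, error) {
		calls++ // counts simulated network round trips
		out := make([]Ruleset, 0, len(paths))
		for _, p := range paths {
			out = append(out, Ruleset{Path: p, Version: "v1"})
		}
		return out, nil
	}
	cache := make(map[string]Ruleset)
	if err := Prefill(cache, batch, "rides/pricing", "billing/fees"); err != nil {
		fmt.Println("prefill failed:", err)
		return
	}
	fmt.Println(len(cache), "rulesets cached in", calls, "round trip(s)")
}
```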

Solution analysis

Let's explore various scenarios using the solution described above:

| Scenario | What to use |
| --- | --- |
| I don't care about network round trips | Use the default Regula client or any HTTP client |
| I want no round trips besides the first one | Use the prefill option to declare all the rulesets you use in your program |
| I want no round trips besides the first one and I want my cache to be updated if new versions are created in the admin | Use the prefill and the watch options |
| I'm storing the version of the ruleset used in a database to reuse it again and I want to avoid as many round trips as possible | Use the cache + prefill options. The latest versions of the rulesets used will be evaluated from the cache. Previous versions will be too, as long as the program is not restarted. If the program starts and expects to use an old version, there will be a cache miss and a network round trip will be necessary to fetch that ruleset version |
| I'm storing the version of the ruleset used in a database to reuse it again and I want absolutely no other round trips | No solution provided by Regula |
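The fourth scenario, where a stored version is requested after a restart, can be pictured with a cache keyed by path and version. This is a simplified sketch with invented names; the ruleset payload is reduced to a string.

```go
package main

import "fmt"

// key identifies one version of one ruleset.
type key struct{ path, version string }

// VersionedCache serves known versions locally and fetches unknown ones.
type VersionedCache struct {
	entries    map[key]string // ruleset payload, simplified to a string
	roundTrips int
	fetch      func(path, version string) string // hypothetical remote call
}

// Get returns the requested version, paying one round trip on a cache miss.
func (c *VersionedCache) Get(path, version string) string {
	k := key{path: path, version: version}
	if v, ok := c.entries[k]; ok {
		return v
	}
	c.roundTrips++
	v := c.fetch(path, version)
	c.entries[k] = v
	return v
}

func main() {
	// After a restart, prefill has only loaded the latest version (v2 here).
	c := &VersionedCache{
		entries: map[key]string{
			{path: "some/path", version: "v2"}: "latest definition",
		},
		fetch: func(path, version string) string {
			return "definition of " + path + "@" + version
		},
	}
	c.Get("some/path", "v2") // hit
	c.Get("some/path", "v1") // miss: one round trip
	c.Get("some/path", "v1") // hit from now on
	fmt.Println("round trips:", c.roundTrips)
}
```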

Conclusion

I'm certain that by slightly changing the requirements we can provide a scalable solution that still works for our use case.

@drommk @christophe-dufour @genesor does that still work for you?

@asdine asdine added this to the v0.7.0 milestone May 10, 2019
@yaziine

yaziine commented May 10, 2019

I think that for the "Prefill mechanism" we should provide a way to use prefixes.

Let's say that we want a complete node that contains a lot of rulesets. Instead of listing them all, what prevents us from retrieving the node entirely?

@asdine
Contributor Author

asdine commented May 10, 2019

For various reasons:

  • Why download rulesets you won't use? If you are not using them, why bother downloading them?
  • It's not scalable for the reasons I explained above
  • It would make us code and maintain a complex API for no good reason (that's the point of this issue, to avoid writing that specific API)

@drommk

drommk commented May 13, 2019

I'm good with this approach for a generic lib-level solution.
Probably not an issue, but let's keep in mind that it limits direct discoverability from the client POV though.

@asdine
Contributor Author

asdine commented May 13, 2019

Probably not an issue, but let's keep in mind that it limits direct discoverability from the client POV though.

Indeed, but I think that's what created this issue in the first place: we mixed serving rulesets for caching purposes with discoverability.
Decoupling them will allow both APIs to scale much better.

@qmathe

qmathe commented May 13, 2019

For the mobile side, the prefill mechanism should be ok.

There is one downside though: it's going to require us to type each ruleset path twice (download + evaluation). We could use a constant to avoid hardcoding each path as a string twice. Do you see a way to avoid this, or do you consider it an acceptable trade-off in terms of APIs?
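The constant approach could look something like this in Go. The paths are invented, and the same list would feed both the prefill call and later evaluations.

```go
package main

import "fmt"

// Ruleset paths declared once and reused for prefill and evaluation.
// These paths are hypothetical examples.
const (
	RulesetPricing = "rides/pricing"
	RulesetFees    = "billing/fees"
)

// prefillPaths is the single list handed to the prefill mechanism.
var prefillPaths = []string{RulesetPricing, RulesetFees}

func main() {
	fmt.Println("prefill:", prefillPaths)
	fmt.Println("evaluate:", RulesetPricing)
}
```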

More generally speaking, I'm not yet convinced we should always avoid prefixes or some other tagging mechanism to indicate which rulesets to download. The main issue I see with prefixes as they exist is that they force the ruleset tree structure to encode both:

  • semantic organization
  • download/cache boundaries

imo splitting the ruleset tree into downloadable/cacheable subtrees should not be handled by the existing tree structure as it is. I'm not sure this responsibility should be entirely shifted to the client side though. Did you consider tagging rulesets directly when writing them, or supporting rulesets appearing under multiple paths? For example, we could have tags like mobile or service names, then on the client side, instead of using a prefix, we would use one or more tags to download/cache rulesets. Each ruleset could be required to have at least one tag.
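To make the tag idea a bit more concrete, here is a sketch of resolving tags to ruleset paths, assuming a tag index that maps each tag to the paths carrying it. The index, the tags, and `selectByTags` are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// selectByTags returns the sorted, de-duplicated ruleset paths that carry
// at least one of the requested tags. index maps a tag to its paths.
func selectByTags(index map[string][]string, tags ...string) []string {
	seen := make(map[string]bool)
	var out []string
	for _, t := range tags {
		for _, p := range index[t] {
			if !seen[p] {
				seen[p] = true
				out = append(out, p)
			}
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// Hypothetical tag index; a ruleset may appear under several tags.
	index := map[string][]string{
		"mobile":  {"rides/pricing", "app/onboarding"},
		"billing": {"billing/fees", "rides/pricing"},
	}
	fmt.Println(selectByTags(index, "mobile", "billing"))
}
```

The resulting list of complete paths could then be handed to a batch download, so the tree structure would no longer have to encode cache boundaries.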

As a side note, downloading only the latest ruleset versions on mobile sounds good enough.

@drommk

drommk commented May 14, 2019

Did you consider tagging rulesets directly when writing them or supporting rulesets appearing under multiple paths?

The idea is appealing but I think that would be a slippery slope, because in practice you'd end up including consumer logic into your rulesets (eg having "mobile" tags on rulesets)

Again, I think that any heetch-app-oriented optimization should be handled by the gateway, not Regula itself.

I agree that the double writing is painful (that's what I called the discoverability issue), but it's something we can live with until we need to build heetch-app-oriented optimizations.

@qmathe

qmathe commented May 15, 2019

The idea is appealing but I think that would be a slippery slope, because in practice you'd end up including consumer logic into your rulesets (eg having "mobile" tags on rulesets)

Makes sense.

From an implementation/storage standpoint, I agree that tags should be stored outside the ruleset tree and the tag notion should not exist in the Regula main API. They could be part of the Batch API outlined by Asdine though.

Even if tags are not part of Regula, exposing the ability to tag rulesets in the Regula UI editor is what matters imo (rather than introducing a distinct tool or web app).

@asdine asdine mentioned this issue May 22, 2019
@asdine asdine self-assigned this May 28, 2019
@asdine asdine removed their assignment Apr 20, 2020