Improve data delivery format #206

Closed
thommcgrath opened this issue Jul 1, 2020 · 2 comments
@thommcgrath (Owner)

The current data delivery format cannot scale. It's too much data to load on demand, and recently became too much data for memcached to store. Compression solved that problem, but it'll only last for so long. Beacon needs a more scalable way to deliver data.

My initial idea was to segregate the data by mod, so the client could subscribe to individual mod files. Don’t need Super Structures? Then Beacon never downloads the data for it.

This sounds good on the surface, but it runs into dependency issues. Remember that each DLC is a mod. So if an engram in mod A is crafted from an engram in mod B, the database would error if mod B wasn’t loaded first. This problem would rapidly become nasty, especially as mods reuse things from other mods, such as spawn points.

Pagination might be a way to solve this. In theory it would be infinitely scalable: each data file would contain a link to request the next batch of data. The trouble is packing the initial data together to support offline usage. The engram data must be available on the downloads page, and multiple files won’t play nice with that.
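For illustration only, here is a minimal sketch of what that cursor-style pagination could look like on the client side; the endpoint URL and the "results"/"next" field names are assumptions for the example, not Beacon's actual API.

```python
# Hypothetical paginated fetch: each response embeds a link to the next page,
# and the client keeps following it until nothing is left.
import requests

def fetch_all_engrams(start_url: str) -> list:
    records = []
    next_url = start_url
    while next_url:
        response = requests.get(next_url, timeout=30)
        response.raise_for_status()
        payload = response.json()
        records.extend(payload.get("results", []))
        # The server would include a "next" link only while more data remains.
        next_url = payload.get("next")
    return records

engrams = fetch_all_engrams("https://example.com/api/engrams?page_size=500")
```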

My next idea is to utilize the CDN. Beacon engram updates would run on a delay, processing every fifteen minutes or so. The server would determine if a new dump is needed, produce it, then upload it to the CDN. All the download links would become redirects to this file. The concerns are caching problems - the CDN is set up to cache files for a full year, since the data shouldn’t be changing - and sync issues. This approach doesn’t help sync, which could also produce massive dumps. Even if sync requests became full dumps when the date is too old, that still leaves the potential for massive requests.
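As a rough sketch of that scheduled-dump idea (the injected callables are placeholders, not real Beacon functions): check whether anything changed since the last dump, and only then build, compress, and upload a fresh file for the download links to redirect to.

```python
import gzip
import json
from typing import Callable, Iterable

def run_dump_cycle(
    newest_change_timestamp: Callable[[], float],
    fetch_all_objects: Callable[[], Iterable[dict]],
    upload_to_cdn: Callable[[str, bytes], None],
    last_dump_time: float,
) -> float:
    """Build and upload a fresh dump only when something has changed."""
    newest_change = newest_change_timestamp()
    if newest_change <= last_dump_time:
        return last_dump_time  # nothing new since the last dump; skip this cycle
    objects = list(fetch_all_objects())
    payload = gzip.compress(json.dumps(objects).encode("utf-8"))
    # The site's download links would then redirect to this object.
    upload_to_cdn(f"dumps/{int(newest_change)}.json.gz", payload)
    return newest_change
```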

@thommcgrath thommcgrath self-assigned this Jul 1, 2020
@thommcgrath thommcgrath pinned this issue Aug 10, 2020
@thommcgrath (Owner, Author)

I've been working on this problem using the CDN route. Solving the caching and sync problems has turned out to be surprisingly easy. The process that prepares each "delta" file grabs all the changes between the newest timestamp and the previous delta file. Then it all gets compressed and dropped on the CDN. When a sync needs to happen, the client downloads each file between its last update and the newest update, and applies them in order. The process also prepares a "Complete" file with every current object and updates that file on the CDN. The URL includes a query string with the modification time to invalidate the previous cache, so that every request for the complete dump always gets the newest version.
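To make the flow concrete, here's an illustrative client-side sketch; the CDN layout, field names, and the idea of a small index listing published delta timestamps are assumptions for the example, not the actual Beacon implementation.

```python
import requests

CDN_BASE = "https://cdn.example.com/data"  # placeholder CDN host

def apply_delta(local_objects: dict, delta: dict) -> None:
    # A delta is assumed to carry changed/added objects plus a list of deletions.
    for obj in delta.get("objects", []):
        local_objects[obj["id"]] = obj
    for deleted_id in delta.get("deletions", []):
        local_objects.pop(deleted_id, None)

def sync(local_objects: dict, last_sync: int, delta_timestamps: list[int]) -> int:
    # delta_timestamps would come from a small index of published delta files.
    for ts in sorted(t for t in delta_timestamps if t > last_sync):
        delta = requests.get(f"{CDN_BASE}/deltas/{ts}.json", timeout=30).json()
        apply_delta(local_objects, delta)
        last_sync = ts
    return last_sync

def full_download(newest_timestamp: int) -> dict:
    # The modification-time query string defeats the CDN's year-long cache.
    url = f"{CDN_BASE}/complete.json?t={newest_timestamp}"
    payload = requests.get(url, timeout=30).json()
    return {obj["id"]: obj for obj in payload["objects"]}
```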

This process is mostly complete and is expected to be put to use for Beacon 1.5.

@thommcgrath (Owner, Author)

This was implemented in Beacon 1.5.
