Structure Tools #407

CocoisBuggy · 2024-11-04T20:38:26Z

I'm experimenting with a feature for area navigation and climbing data input and I don't want to hammer the current GQL queries to get the data out, especially since I mostly just need pointers indicating the overall structure,

I've got the speed down to an acceptable time, but really it should be faster. I need to eliminate the lookup stage in the pipeline, which means embedding an ObjectId for an area's parent - I just haven't done it yet.

Queries can't run over 900ms at the moment, which I feel is probably too lenient. I'm just putting these commits here for now while I work on the rest of the PR.

Other unrelated commits

I can kill these from the PR. They are not crucial, they mostly relate to making the development environment a little more forgiving

I added shell.nix - I'm on nixos
made replica set env var optional during seeding
added a config in the vscode settings to prevent vscode extensions from mangling existing shell scripts

(https://en.wikipedia.org/wiki/Shebang_(Unix)#Version_8_improved_shell_scripts if you are sufficiently nerdy to care)

I ran into a one-off issue where the replica set specification was throwing off the script in a development environment. This code will let you un-set the variable if you don't need it while in dev.

This is by no means a complete environment, but if no one minds me leaving this here it would save me some trouble - but is not required. If you are unfamiliar with nixos, this file is intended to give my environment the necessary mongo utils / yarn / node and so forth. You can peg a node version and all sorts of various magic. I would recommend any nerds with sufficient free time investigate - but you will find it very hard to escape once it has ensnared you.

…of mono-express

The query is too slow, it remains to be seen if this is because the scope is simply too liberal - but my intuition is that this is down to the regex search and full-document retrieval Current root query time is along the lines of 16 seconds (Unthinkably slow) so I'll try another approach, otherwise this architechture is simply no good.

value? This may be down to me not manually running some cron operation in my development environment? I will work it out later.

There is still some query tuning that I want to do - I might sqaush this commit later Cost Reference Texas ~50ms California ~900ms

achieved by pruning the data being passed through the pipeline

The only way I can think of to speed this up is to eliminate the parent lookup stage in the pipeline, which seems to double the time the query takes to run until completion. The natural way to address this would be to embed a parent pointer - which I assume should be easy enough since there must be existing code to maintain the integrity of the pathTokens, ancestors So I'll take a look at doing that too

…raphql into coco-hierarchy-tools

There is a new, simple, source of truth for area structure derivations. Each area document holds a parent reference, which points to its desired reference (or none, if it is a root node). The derived fields such as - Children - Ancestry have been collated in `embeddedRelations`. the denormalized fields `ancestors` and `pathTokens` have also been colocated into a single entity path that makes propogating updates to denormalized data easier (e.g, when updating an areas name). It leaves us a space for adding new fields, and finally exposes the ACTUAL id of the document so that we can index effectively. I would suggest that using the parent field is the most straightforward and safest thing to do in pretty much all scenarios, but if you are a mongodb veteran you may find need for the array indexing features - though as far as I can tell, this is rarely the case? - Added tests - Updated some existing tests - Data returned at the gql layer should be identical, but I have not fully de-risked those tests. All I can say is that they are passing. This commit does not address #126, but gives us an enviable headstart at it. It is however needed by #407

CocoisBuggy added 13 commits November 2, 2024 11:34

shebang should be first line

4e9b177

(https://en.wikipedia.org/wiki/Shebang_(Unix)#Version_8_improved_shell_scripts if you are sufficiently nerdy to care)

For those using bash beautify vscode extension

8b66de3

Make replica set specification optional in seed script

a9c8527

I ran into a one-off issue where the replica set specification was throwing off the script in a development environment. This code will let you un-set the variable if you don't need it while in dev.

Integrate memory server into nixos environment

c0eb188

compass is a neat tool - though duplicates much of the functionality …

bf1f307

…of mono-express

I think I may have been wrong about this. Path Hash is not a cached

027b0e6

value? This may be down to me not manually running some cron operation in my development environment? I will work it out later.

Performance improvements and less jank

4794107

There is still some query tuning that I want to do - I might sqaush this commit later Cost Reference Texas ~50ms California ~900ms

minor speed improvement to the query

9c58cfe

achieved by pruning the data being passed through the pipeline

minor speed improvement to the query

c72f95e

achieved by pruning the data being passed through the pipeline

Merge branch 'coco-hierarchy-tools' of github.com:OpenBeta/openbeta-g…

671e3b4

…raphql into coco-hierarchy-tools

CocoisBuggy added the Do not merge label Nov 4, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Structure Tools #407

Structure Tools #407

CocoisBuggy commented Nov 4, 2024 •

edited

Loading

Structure Tools #407

Are you sure you want to change the base?

Structure Tools #407

Conversation

CocoisBuggy commented Nov 4, 2024 • edited Loading

Other unrelated commits

CocoisBuggy commented Nov 4, 2024 •

edited

Loading