AST dis- and reassembler: an alternative for hydrating #1632

Draft
wants to merge 8 commits into
base: main

@Lotes (Contributor) commented Aug 15, 2024

DRAFT, but feel free to look through it. I still need to add tests and a comparison of the resource consumption of the Hydrator vs. the Dis-/Reassembler.

Hydrating is currently used to send an entire AST (including the CST) from the async parser worker back to the main language server thread. This is done by transforming the AST/CST into a serializable representation, sending it via messaging to the parent thread, and transforming that representation back into an AST/CST structure. We end up (more or less) with 4 copies of the same structure. For big files in particular, this can become a bottleneck, since worker threads seem to have a memory limit of 4 GB.

With the approach in this PR, I try to serialize the build process of the AST/CST instead, ending up with only 2 copies (more or less). On the worker thread side, the AST gets traversed and disassembled into build instructions. The instructions are sent over to the parent, which can then start building individual AST and CST nodes.

The structure of the instruction flow is like this:

  1. Allocate (sets up the build process, by allocating an array of AST and CST node slots, plus context information)
  2. Send over CST nodes one by one (distinguish between root, inner and leaf nodes; sometimes it is necessary to pop the internal stack).
  3. Send over AST nodes one by one, property by property.
  4. Send over the parser and lexer errors.
  5. Return the root AST node.

And that is the whole magic.
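
The five steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Instruction` union, the `Reassembler` class, the simplified `CstNode`/`AstNode` shapes and the `reassemble` helper are all hypothetical names. In a real setup each instruction would arrive as a message from the worker thread; here they are simply read from an array.

```typescript
// Hypothetical instruction set mirroring the five steps; not the PR's actual types.
type Instruction =
  | { kind: 'allocate'; cstCount: number; astCount: number }
  | { kind: 'cstRoot'; id: number }
  | { kind: 'cstInner'; id: number }
  | { kind: 'cstLeaf'; id: number; text: string }
  | { kind: 'cstPop' }
  | { kind: 'astType'; id: number; type: string }
  | { kind: 'astValue'; id: number; property: string; value: string | number | boolean }
  | { kind: 'astChild'; id: number; property: string; child: number }
  | { kind: 'error'; source: 'lexer' | 'parser'; message: string }
  | { kind: 'return'; root: number };

// Simplified stand-ins for the real CST/AST node interfaces.
interface CstNode { children: CstNode[]; text?: string; parent?: CstNode }
interface AstNode { $type: string; $container?: AstNode; [property: string]: unknown }
interface ParseResult { value: AstNode; cstRoot: CstNode | undefined; errors: string[] }

class Reassembler {
  private cst: CstNode[] = [];
  private ast: AstNode[] = [];
  private stack: CstNode[] = []; // currently open composite CST nodes
  private errors: string[] = [];
  private cstRoot: CstNode | undefined;

  private top(): CstNode {
    const node = this.stack[this.stack.length - 1];
    if (!node) throw new Error('CST stack is empty');
    return node;
  }

  private node(id: number): AstNode {
    const node = this.ast[id];
    if (!node) throw new Error(`AST slot ${id} was not allocated`);
    return node;
  }

  /** Applies one instruction; yields the result once 'return' arrives. */
  apply(i: Instruction): ParseResult | undefined {
    switch (i.kind) {
      case 'allocate': // step 1: reserve slots so nodes can be referenced by index
        this.cst = new Array<CstNode>(i.cstCount);
        this.ast = Array.from({ length: i.astCount }, (): AstNode => ({ $type: '' }));
        break;
      case 'cstRoot': { // step 2: CST nodes, one by one
        const root: CstNode = { children: [] };
        this.cst[i.id] = root;
        this.cstRoot = root;
        this.stack = [root];
        break;
      }
      case 'cstInner': {
        const inner: CstNode = { children: [], parent: this.top() };
        this.top().children.push(inner);
        this.cst[i.id] = inner;
        this.stack.push(inner);
        break;
      }
      case 'cstLeaf': {
        const leaf: CstNode = { children: [], text: i.text, parent: this.top() };
        this.top().children.push(leaf);
        this.cst[i.id] = leaf;
        break;
      }
      case 'cstPop':
        this.stack.pop();
        break;
      case 'astType': // step 3: AST nodes, property by property
        this.node(i.id).$type = i.type;
        break;
      case 'astValue':
        this.node(i.id)[i.property] = i.value;
        break;
      case 'astChild': {
        const child = this.node(i.child);
        child.$container = this.node(i.id);
        this.node(i.id)[i.property] = child;
        break;
      }
      case 'error': // step 4: lexer and parser errors
        this.errors.push(`${i.source}: ${i.message}`);
        break;
      case 'return': // step 5: hand back the root AST node
        return { value: this.node(i.root), cstRoot: this.cstRoot, errors: this.errors };
    }
    return undefined;
  }
}

function reassemble(stream: Instruction[]): ParseResult {
  const reassembler = new Reassembler();
  for (const instruction of stream) {
    const done = reassembler.apply(instruction);
    if (done) return done;
  }
  throw new Error("instruction stream ended without 'return'");
}

// Usage: a made-up stream for a document with a single token.
const instructions: Instruction[] = [
  { kind: 'allocate', cstCount: 2, astCount: 2 },
  { kind: 'cstRoot', id: 0 },
  { kind: 'cstLeaf', id: 1, text: 'hello' },
  { kind: 'astType', id: 0, type: 'Model' },
  { kind: 'astType', id: 1, type: 'Greeting' },
  { kind: 'astValue', id: 1, property: 'name', value: 'hello' },
  { kind: 'astChild', id: 0, property: 'greeting', child: 1 },
  { kind: 'return', root: 0 },
];
const result = reassemble(instructions);
console.log(result.value.$type);
```

Because nodes are referenced by slot index, cross-links such as `$container` can be restored on the parent side without ever shipping a full serialized tree in one piece.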

Currently there are only the 5 tests that were previously used for the hydrator.
I think I need to add more, so that every instruction is tested at least once (that is why this is still a draft).

Plus, I do not know anything about performance yet. I was just writing down the code. I have some data which I can use to test against, but let's see when I find the time for it.
