-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Integrate TLA+/PlusCal tree-sitter grammar? #248
Comments
@edaena, I believe you're familiar with the relevant parts of the extension. One question to answer is how this integrates with SANY. For example, tree-sitter rejecting a spec shouldn't prevent users from launching TLC if SANY is happy, and vice versa. From experience with TLAPS, we know that two parsers disagree in subtle ways (which is why there is an (incomplete) effort to replace the TLAPS parser with SANY). Related, I'd suggest thinking about integrating with the PlusCal translator, such as users modifying the PlusCal without retranslating before model-checking... Note that the proposed code navigation already exists. Related:
|
Of note, tree-sitter doesn't really reject specs; it will provide as good of a parse as it can get, and end up looking something like (sample taken from different language):
There is a flag you can check to see whether there are any error nodes in the parse tree, but unless you go hunting for them you won't know they're there. You can see the real-time parse tree (with errors or no) generated by the TLA+ tree-sitter grammar here: https://tlaplus-community.github.io/tree-sitter-tlaplus/ In the future, code navigation could be augmented by stack graphs which are built on tree-sitter grammars. I'm assuming how it works now is analogous to a string search in the file. Moving to tree-sitter grammar queries would make that a bit more robust, but you wouldn't get cross-file referencing. |
To clarify: Are you're saying that the list of use-cases is exhaustive s.t. tree-sitter should only replace regex-based syntax highlighting and improve code navigation/completion? For everything else, however, SANY remains authoritative? This then sounds reasonable, especially, if syntax highlighting and code navigation continue to work in the presence of parse failures. In the light of #239, what are the (runtime) dependencies of tree-sitter? |
Yes, SANY remains authoritative. Tree-sitter doesn't even have the capability to print error messages or anything, it's purely designed to support language tooling. I don't think my list of use-cases is exhaustive really, there are other possibilities like code folding or code formatting/linting that other people have implemented. Tree-sitter does not implement any semantic analysis or restrictions whatsoever, it is purely a syntax-level tool against which you can write queries. The lookup logic (outside of a single file) could eventually be implemented by encoding it as stack graph rules, although stack graphs still don't have good developer documentation and are only currently used for code navigation on a handful of languages on github itself. Tree-sitter grammars are all C and C++ files. The generated grammar is a 32 MB C file (src/parser.c in the repo) and the handwritten context-sensitive parsing logic for junction lists and proofs is in src/scanner.cc. When compiled this is about 3 MB in size, and it links to the tree-sitter library which is a few hundred KB in size. This can all easily be compiled with emscripten into WASM (this is what's running in the web demo I linked up above, see https://github.com/tlaplus-community/tlaplus-community.github.io/tree/main/tree-sitter-tlaplus/js). The WASM is also available as a pre-built file in the NPM module. |
Oh hey neat, that must be new! People have been asking for native vs code support for tree-sitter for a while now. Guess this is the compromise. |
The TLA+ tree-sitter grammar now supports PlusCal: https://github.com/tlaplus-community/tree-sitter-tlaplus/
You can see how to consume the grammar from TypeScript here: https://github.com/tlaplus-community/tlaplus-tool-dev-examples/tree/main/tree-sitter/typescript
There are a few ways the grammar could help:
Basically the grammar gives you access to an incrementally-updated error-tolerant queryable syntax tree. Let me know whether you think this would be a useful addition (or how else it could be used) and I will be happy to help with integration.
The text was updated successfully, but these errors were encountered: