Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make useExperimentalModernTreeSitter the default… #855

Conversation

savetheclocktower
Copy link
Contributor

@savetheclocktower savetheclocktower commented Jan 6, 2024

…and create useLegacyTreeSitter for those who want to opt into the previous default behavior.

This will be a draft PR until the CI passes.

The bulk of meaningful changes here occur in src/grammar-registry.js and in the grammar-selector package. We need to invert the logic that previously privileged node-tree-sitter over web-tree-sitter.

The rest of the changes occur in the specs. In many cases, the explicit setting of core.useExperimentalModernTreeSitter can just be removed, since the scenario that was being tested against is now the default. If specs test all three modes, then the setting of core.useLegacyTreeSitter is handled.

@savetheclocktower
Copy link
Contributor Author

The snippets failures are known — that's what snippets#18 is for.

The autocomplete-html tests are failing because the renderer process is crashing. I can't reproduce that one locally! What fun.

@savetheclocktower
Copy link
Contributor Author

OK, I think I fixed autocomplete-html.

A lot of these specs have been dumb busywork to fix, but about 10% of them have been small but important fixes that I'm glad we had spec coverage for. We needed a way to disable async parsing for specs — constantly doing await languageMode.atTransactionEnd() is a buzzkill — and we needed a few extra checks for race conditions that, while much more likely to happen in an artificial testing environment, could still manifest every once in a while in regular usage.

@savetheclocktower
Copy link
Contributor Author

I could definitely use some help with the “Build Pulsar Binaries” failure on Ubuntu. The Playwright integration test isn't passing and I can't even get to a point locally where I can diagnose it. Running npx playwright test integration/workspace.spec.js requires that I not have a copy of Pulsar running (because I get the IndexedDB error), and the test fails much earlier locally than on CI.

I could also still use a reviewer for snippets#18. But I know it's Sunday, so no rush.

@savetheclocktower savetheclocktower force-pushed the modern-tree-sitter-on-by-default branch from 214e018 to c089eaf Compare January 7, 2024 23:41
@DeeDeeG
Copy link
Member

DeeDeeG commented Jan 9, 2024

So, regarding running the tests locally, I find when it opens the editor it needs to see the "Welcome" tab that says "a hyper-hackable text editor". It recognizes that to see the editor is loaded so it can close tabs and start making ones in various languages to test highlighting.

If you have disabled these "Welcome" tabs from showing at startup, maybe that breaks the playwright test scenario?

Furthermore, I notice Pulsar is kind of inconsistent about how it loads these two welcome tabs, one of which playwright needs to see per how the test scenario is written right now. Sometimes it loads the tabs both visibly, next to one another, one per pane. Other times it groups them into the same pane with only one foregrounded/focused. You can manually click over to the tab that says "a hyper-hackable text editor" and has the Pulsar logo, and playwright will notice that and resume the tests. (If this takes too long or the command palette is accidentally de-focused / dismissed by this clicking around, you may have to ctrl + c playwright in your terminal and try again to orchestrate it right. I also recommend closing any remaining open tabs after a failed attempt, if possible, so they don't end up focused and interfere with the next test run.) It was fiddly getting these steps down, but it is doable for me with some but not great reliability.)

@savetheclocktower
Copy link
Contributor Author

By my recollection, that “Welcome” screen did appear. It was stuck waiting for the presence of the .tab-bar selector, but I could see that tabs were present on the screen.

Relatedly: I think the integration tests should launch with a custom ATOM_HOME so that my own settings don't come into play. After I tried to diagnose this failure, I relaunched my own instance of Pulsar and found that all my disabled packages had been re-enabled.

@DeeDeeG
Copy link
Member

DeeDeeG commented Jan 9, 2024

As for the HTML test failing:

There is a subtle difference in scope if you place the cursor over the trailing " quotation mark:

Subtly different scope result for trailing quote mark in modern tree-sitter HTML vs legacy tree-sitter HTML grammars

Modern tree-sitter HTML grammar shows something (a "node"? Is that the term) with three quote marks """, whereas legacy tree-sitter HTML grammar has that thing ("node"?) as just a single quote mark ", and I think that may be disagreeing with the test scenario as written.

As for the Shell script test failing:

    expect.toHaveText: Error: strict mode violation: "atom-text-editor.is-focused .syntax--variable >> :scope:has-text("a")" resolved to 2 elements:
        1) <span class="syntax--variable syntax--other syntax--me…>…</span> aka playwright.$(".syntax--variable >> nth=0")
        2) <span class="syntax--variable syntax--other syntax--sh…>a</span> aka playwright.$(".syntax--variable > .syntax--variable")

See: https://stackoverflow.com/questions/76043713/error-locator-click-error-strict-mode-violation-getbyrolebutton-name

There appears to be a playwright "locator" (locator.click) in use, that needs to operate on only one element, but the selector being called to find the click target returns two elements instead. So, there is some subtle difference in the behavior of the legacy vs modern tree-sitter grammars for shell scripts, I reckon, and reading the selectors in the test scenario (see error output snippet above) should presumably be the guide for where to follow up on this?

@DeeDeeG
Copy link
Member

DeeDeeG commented Jan 9, 2024

I think the integration tests should launch with a custom ATOM_HOME so that my own settings don't come into play.

Totally agree. That would be a great follow-up enhancement after this PR, hopefully it's not too hard for one of us to set up. I haven't touched the playwright stuff much before, to be honest. Hopefully there's a way to get this running with a custom env var some way, shouldn't be too hard I hope.

@savetheclocktower
Copy link
Contributor Author

Oh, Christ, I see it now. In the old Tree-sitter HTML grammar, given this HTML…

<foo type="10"></foo>

…the string scope was applied only to 10, rather than to "10". This is ridiculous. Just pushed a fix.

There's a different problem for the shell — a variable was given two different variable scope names. Also fixed. Fingers crossed.

I have no idea why I didn't just try to diagnose the problem more deeply from the CI output instead of just giving up when I couldn't reproduce the outcome on my local machine.

@DeeDeeG
Copy link
Member

DeeDeeG commented Jan 9, 2024

To be honest, without reproducing it on my machine I fully suspected a CI-only issue as a first assumption. Luckily my local output was identical after I finished wrangling with playwright, so I felt confident digging into that rather tall wall of error output. And my contribution here may just be being luckier with (or having more spare bandwidth to deal with) troubleshooting Playwright, for the most part.

EDIT to add: The syntax highlighting of the output, when seeing the playwright result output locally is very helpful, easier to skim what is actionable and what is just reporting passed tests or other nominal or verbose but not key info. Was a key maybe necessary step working out how to appease Playwright/the test scenario, IMO.

Happy to be of help getting some of these neat improvements over the line to being default.

And it is indeed nice we had tests to verify some of these things, however sparse the tests were.

(Waiting on the next CI run to pass, but at least it wasn't CI-only shenanigans after all. CI should just work, it's sad when it doesn't.)

@DeeDeeG
Copy link
Member

DeeDeeG commented Jan 9, 2024

I do see the editor launches with a custom ATOM_HOME at tmp/atom-home-tests subdir under the current dir. At least for the openAtom() function in integration/helpers.js.

It does load from there, so I'm not sure what the deal is with config seeming to leak in from outside? Not sure what's going on there. I did see it re-launch if there was any error, maybe that has something to do with it? Or maybe some info that's under one of the other dirs other than ATOM_HOME is leaking in? Like under ~/Library/Application Support for macOS... C:\Users\[User]\AppData on Windows?

@savetheclocktower
Copy link
Contributor Author

I probably launched it wrong. S'ok. Meanwhile there's one failure still, but it's somewhat legit, so I'm still working on it.

@savetheclocktower savetheclocktower marked this pull request as ready for review January 10, 2024 00:04
@savetheclocktower
Copy link
Contributor Author

EVERYTHING IS GREEN

Let's get this reviewed and landed before something in CI decides to retroactively fail

@DeeDeeG
Copy link
Member

DeeDeeG commented Jan 10, 2024

Had some dinner, am back.

It's kind of a long diff, working my way through it, looks good so far... Will have more in the way of "review" later.

And of course: Thank you for doing all this!

Copy link
Member

@DeeDeeG DeeDeeG left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have gotten about half-way through reviewing this PR's diff (in-between some other things that came up). This is what I have for now.

I hope to be able to finish soon (tomorrow), but by all means I encourage anyone else interested in reviewing the PR to do so, don't wait on my behalf.

Looking good so far and I'll see about the rest soon.

Comment on lines -17 to 18
{language: "HTML", code: '<foo type="10"></foo>', checks: {string: '10'}},
{language: "HTML", code: '<foo type="10"></foo>', checks: {quoted: '"10"'}},
{language: "XML", code: '<foo type="10"></foo>', checks: {string: '"10"'}},
Copy link
Member

@DeeDeeG DeeDeeG Jan 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EDIT: Honestly reading this back it seems nitpicky. This is just the first thing I noticed while poring over the diff. I'm gonna say no action needed, no response needed, optional to even consider this. Don't worry about it.

CI was green before this last tweak, and this makes it test something different than the XML tests for.

Is this right, or would it be more ideal to revert and test expecting the same outcome on the XML and HTML grammars for this identical syntax?

(If I understand correctly we have some flexibility here, since either way to write this spec line should pass? I'm not a grammar expert, I leave this to the author or other reviewers, such as there may be any additional reviewers.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how the XML one passes.

Before this change, the test was choosing the first occurrence of .syntax--string in the editor. The HTML test was failing because it was matching the .syntax--string class name in the opening quote (scoped as punctuation.definition.string.begin.html) before the one that matched all of "10" (scoped as string.quoted.double.html). Telling it to target a different segment of the scope name I wanted has the same effect, but avoids the false positive of punctuation.

"snippets": "github:pulsar-edit/snippets#ba70705",
"snippets": "github:pulsar-edit/snippets#6b9163415270b8757b262f5dfaafaa05d47324a9",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

pulsar-edit/snippets@ba70705...pulsar-edit:snippets:6b9163415270b8757b262f5dfaafaa05d47324a9

(It can be helpful for Changelog purposes if we can add pulsar-edit/snippets#15, pulsar-edit/snippets#16 and pulsar-edit/snippets#18 to the PR body, but I am noting them here too.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't even realize we'd never bumped snippets since those changes.

integration/workspace.spec.js Outdated Show resolved Hide resolved
Comment on lines -363 to +428
editor.setText('<marquee behavior="')
editor.setCursorBufferPosition([0, 19])

completions = getCompletions()
expect(completions.length).toBe(3)

expect(completions[0].text).toBe('scroll')
expect(completions[1].text).toBe('slide')
expect(completions[2].text).toBe('alternate')

editor.setText('<marquee behavior=\'')
editor.setCursorBufferPosition([0, 19])

completions = getCompletions()
expect(completions.length).toBe(3)

expect(completions[0].text).toBe('scroll')
expect(completions[1].text).toBe('slide')
expect(completions[2].text).toBe('alternate')
// NOTE: The Tree-sitter parser goes absolutely mental in this scenario. It
// presents a much more reasonable tree when the closing quote is present,
// but it'd be incredibly hard to salvage something from this mess.
//
// This isn't ideal, but most users will have enabled smart typing pairs,
// so we'll let this slide.
// editor.setText('<marquee behavior="')
// editor.setCursorBufferPosition([0, 19])
// await languageMode.atTransactionEnd()
//
// completions = getCompletions()
// expect(completions.length).toBe(3)
//
// expect(completions[0].text).toBe('scroll')
// expect(completions[1].text).toBe('slide')
// expect(completions[2].text).toBe('alternate')
//
// editor.setText('<marquee behavior=\'')
// editor.setCursorBufferPosition([0, 19])
// await languageMode.atTransactionEnd()
//
// completions = getCompletions()
// expect(completions.length).toBe(3)
//
// expect(completions[0].text).toBe('scroll')
// expect(completions[1].text).toBe('slide')
// expect(completions[2].text).toBe('alternate')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No action needed, just commentary for posterity.

Just pointing out for posterity that these were commented out.

Seems like it's still nice to have tree-sitter parsers for this stuff, and also seems basically reasonable to expect people to close their quotes to get stuff highlighted properly? I'm not really versed in parsers/grammars to be honest, just like to have some contemporary discussion at the time when commenting stuff out so people can look back and see what the reasoning was.

As a reviewer I am personally okay with this, but other reviewers (if any) feel free to chime in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm on the record (admittedly in random places, like comments of issues and PRs) as wishing that Tree-sitter gave the developer better control over error handling so that it doesn't go completely sideways in weird scenarios. I'm not sure it's even the right tool for an autocompletion use case.

In general, even if smart typing pairs weren't a usability enhancement, it'd still be a huge win for ease of parsing. Folks who have them disabled must be used to seeing their editor momentarily change lots of token colors in the time between typing the opening and closing quotes of a string.

});

it('shows two if false (in proper order when language parser is wasm-tree-sitter)', async () => {
await atom.packages.activatePackage('language-c'); // punctuation making it sort wrong
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not super important, but looks like this comment was copy-pasted from the existing spec just above. Any idea what this means?

// punctuation making it sort wrong

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No clue. :)

packages/grammar-selector/spec/grammar-selector-spec.js Outdated Show resolved Hide resolved
Co-authored-by: DeeDeeG <[email protected]>
Copy link
Member

@DeeDeeG DeeDeeG left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks basically good, I had one or two small feedbacks that could be addressed potentially, but I don't want to block this PR.

I didn't see anything that made me think "this needs to be held back" just some small tidying up suggestions, and a few questions of things I was curious about.

Willing to Approve at this point, since I didn't see a reason to object, and I probably won't be able to verify whether any further improvements are valid since it's a bit above my head regarding knowledge of grammars/parsers.

Comment on lines 5 to 7
beforeEach(function() {
atom.config.set('core.useTreeSitterParsers', false);
waitsForPromise(() => atom.packages.activatePackage("language-gfm"));
Copy link
Member

@DeeDeeG DeeDeeG Jan 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Stuff like this is just because the specs were written for the a textMate or Legacy Tree Sitter grammar (and we don't want to re-write the spec to work against the modern tree-sitter grammar that also exists), right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, my feeling is that we can write the specs for the new parsers now that things have settled down, but by and large they don't exist yet. A lot of the first passes on things like language-go and language-c++ were basically my reckonings on how things should be scoped in languages that I didn't have a lot of familiarity with, so I didn't want to have to change a bunch of unit tests later.

But a lot of these specs were written against TextMate grammars and still have value — because TextMate grammars aren't going anywhere.

atom.config.set('core.useExperimentalModernTreeSitter', false);
atom.config.set('core.useLegacyTreeSitter', true);
Copy link
Member

@DeeDeeG DeeDeeG Jan 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Ditto, this just ensures we run a spec written for the old grammar against the old grammar not its modern replacement, thus avoiding the need to re-write the whole spec file for subtle differences in the grammars' behavior, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. For this one, we'll probably end up adapting it instead of starting from scratch. By my recollection, most languages don't even have tests for legacy Tree-sitter mode.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify my understanding: The .eslintrc.js files are handy to inform IDEs how to auto-format and lint stuff, right? (Besides for people to manually run eslint on CLI, which I'm not sure anyone does for Pulsar core on any regular basis.)

I'm not against them, but there's a lot of them, so I wanted to check I've got the right idea of what they're mainly intended for.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I've got linter-eslint-node installed locally and it complains about a lot of stuff unless I drop .eslintrc files in the spec directories.

scopeName: 'source.regexp.js'
scopeName: 'source.js.regexp'
Copy link
Member

@DeeDeeG DeeDeeG Jan 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No action needed, but just curious: Why was this changed? Is there a convention about how these are usually named?

EDIT: A comment in another file in the PR explained it, pretty much. I wouldn't mind any notes you can give, but I trust you have this sorted, TBH.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it because one of the unit tests made me realize that source.js.regexp was what it was called in old Tree-sitter.

The root scope names aren't a huge deal because they often get concealed in these kinds of injections, but it is more conventional to put the regexp part last here because it's a regex parser, not a JavaScript parser.

Comment on lines 5 to 7
beforeEach(function() {
atom.config.set('core.useTreeSitterParsers', false);
waitsForPromise(() => atom.packages.activatePackage("language-yaml"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto the above, this spec is written for a TextMate grammar, whereas a Modern Tree-sitter grammar also exists to highlight Yaml, I gather? So we need to set this to run the specs against the TextMate one?

Comment on lines 104 to 110
atom.config.set('spell-check.excludedScopes', ['.functio.entity']);
atom.config.set('spell-check.excludedScopes', []);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what this is about? Was the exclusion of '.functio.entity' superfluous 'cause nothing was labeled with that scope in the first place?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's my hunch. I don't know if it was intended or not, but since the effect is the same as just blanking out the setting, that's what I went with.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this was testing the scenario where you "go to exclude a scope name that is theoretically valid, but not in use, due to a user typoing the scope name" which just seems super unlikely to cause errors, but who knows??

Not a big deal, but if testing that scenario is of any use, then we can revert this bit?

spec/workspace-spec.js Outdated Show resolved Hide resolved
src/config-schema.js Outdated Show resolved Hide resolved
spec/wasm-tree-sitter-language-mode-spec.js Outdated Show resolved Hide resolved
src/tree-sitter-language-mode.js Outdated Show resolved Hide resolved
@savetheclocktower
Copy link
Contributor Author

All feedback addressed! Let's see if the CI passes.

Copy link
Member

@DeeDeeG DeeDeeG left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI passed (again)!

Re-approving to clarify review status is still 👍 from me.

I would have liked to have given more thorough review of the actual grammar/parser stuff, but I just don't have the knowledge to do so.

I'm glad to see CI passing, and would encourage the team to use Rolling after this lands, to the extent they are using Pulsar day-to-day, between now and next planned Regular release. That way we can daily-drive and find anything we would want to change via first-hand-use testing.

(Edit to add a NOTE: I did have one last possible actionable review feedback above, so if that lands and CI is happy again I can mark re-approved, but no strict need to wait as I extend my review-approve to that as well if it does land and I don't notice it right away.)

@savetheclocktower
Copy link
Contributor Author

OK, one last run through CI. If it fails, I'm going to set my computer on fire. The stakes are high!

@DeeDeeG
Copy link
Member

DeeDeeG commented Jan 11, 2024

(If it fails, then I endorse undoing the last change and just ship it, lol. Fingers crossed, though.)

Copy link
Member

@DeeDeeG DeeDeeG left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@savetheclocktower savetheclocktower merged commit 9cf8691 into pulsar-edit:master Jan 11, 2024
103 checks passed
@DeeDeeG
Copy link
Member

DeeDeeG commented Jan 11, 2024

P.S. thanks for dealing with all the review comments! Glad to see this pref on by default now, and the surprising amount of tweaks it took to specs to handle the new system, miscellaneous actual bugfixes too. Many thanks.

@savetheclocktower
Copy link
Contributor Author

No problem! Most of the work on the specs should've been done ages ago — I just forgot about it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants