{$IFDEF} and {$IFNDEF} inside a type declaration causes incorrect node holding the rest of the text and macros #7

Open
vintagedave opened this issue Jul 14, 2024 · 2 comments

@vintagedave
vintagedave commented Jul 14, 2024

Hello,

Testing against the following Delphi code:

        unit Ifdefs;
        interface
        type
            {$IF DEFINED(MSWINDOWS)}IWindowsOnly = interface end;{$ELSE}ISomethingElse = interface end;{$ENDIF}
            {$IFDEF NOTDEFINED}
                INotDefined
            {$ELSE}
                IInverse
            {$ENDIF} = interface end;

I get some odd node results for the IFDEF part.

  • The $IFDEF comes in as a node type "pp", as expected
  • The following node is a declType, as expected
  • But the node's first child is type "genericDot", representing the whole "b'INotDefined\n {$ELSE}\n IInverse'" section.

In other words, it doesn't seem to recognise the type name, and the $ELSE is lost. (I originally found this with $IFNDEF, so it seems to apply to both.)

On the other hand, the preceding declaration beginning $IF DEFINED does seem to generate the expected nodes. It may be that this is because they are entire typename = definition clauses, whereas the failing ones have IFDEF logic inserted in the middle?
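To spot this symptom programmatically, one option is to look for nodes whose text still contains a compiler directive. The following is a minimal sketch, not part of the grammar or its bindings: `Node` is a hypothetical stand-in that only mimics the `type`, `text`, and `children` attributes of a py-tree-sitter node, and the node type name `genericDot` is taken from the report above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in mimicking the `type`, `text` and `children`
    attributes of a py-tree-sitter node (an assumption for illustration)."""
    type: str
    text: bytes = b""
    children: list = field(default_factory=list)


def find_swallowed_directives(node, suspect_types=("genericDot",)):
    """Yield nodes of a suspect type whose text contains a '{$' directive."""
    if node.type in suspect_types and b"{$" in (node.text or b""):
        yield node
    for child in node.children:
        yield from find_swallowed_directives(child, suspect_types)


# A hand-built tree shaped like the result described in the report.
tree = Node("declType", b"", [
    Node("genericDot", b"INotDefined\n {$ELSE}\n IInverse"),
    Node("identifier", b"Foo"),
])
hits = list(find_swallowed_directives(tree))
```

Because the function only reads those three attributes, the same check should work on real nodes that expose them.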

@vintagedave
Author

By the way, one of the ancestor nodes here is also of type ERROR. I see this quite a bit when parsing complex units (e.g. Winapi.Windows.pas). It doesn't seem to affect things badly; I simply ignore ERROR nodes when walking the tree.
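For readers wondering what ignoring ERROR nodes might look like, here is a rough sketch under stated assumptions. It is not the code used in this report. `FakeNode` is a hypothetical stand-in with the same `type` and `children` attributes a py-tree-sitter node exposes. The walk skips the ERROR node itself but still descends into its children, since the useful nodes here sit underneath an ERROR ancestor.

```python
from dataclasses import dataclass, field


@dataclass
class FakeNode:
    """Hypothetical stand-in with the `type` and `children` attributes
    of a py-tree-sitter node (an assumption for illustration)."""
    type: str
    children: list = field(default_factory=list)


def walk_ignoring_errors(node):
    """Yield nodes depth first, omitting ERROR nodes but keeping their children."""
    stack = [node]
    while stack:
        current = stack.pop()
        if current.type != "ERROR":
            yield current
        # Push children in reverse so they are visited in source order.
        stack.extend(reversed(current.children))


tree = FakeNode("root", [
    FakeNode("ERROR", [
        FakeNode("pp"),
        FakeNode("declType", [FakeNode("genericDot")]),
    ]),
])
visited = [n.type for n in walk_ignoring_errors(tree)]
```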

@Isopod
Owner

Isopod commented Aug 7, 2024

I think the ERROR node for this particular snippet is there because the snippet is missing its final "end.". But I can confirm that the node is genericDot. All things considered, this is not the worst possible outcome! The parser thinks this is an INotDefined.IInverse where someone has forgotten the dot. At least this is somewhat reasonable and localized.

Preprocessed languages are impossible to parse using Tree-Sitter. Tree-Sitter is built on the theory of context-free grammars, and having a preprocessor is the opposite of context-free. It can't work, as a matter of principle. I tried to make some common cases work, so that a directive at least doesn't break the whole parse tree, but even that is not always successful. Adding more special cases, in my experience, does more harm than good: it tends to confuse the parser even more when it encounters something it cannot parse, and it bloats the generated parser. See this comment in grammar.js:

// A word of caution: It is tempting to sprinkle this macro in many more

If you need a parser that is 100% accurate, you’re probably best off using something else. The intended goal of Tree-Sitter is syntax highlighting.
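One possible workaround, sketched here as an assumption and not as something this project provides, is to resolve the conditionals yourself for a chosen set of defines before handing the text to any parser. The function name `resolve_conditionals` is hypothetical. The sketch handles only `{$IFDEF}`, `{$IFNDEF}`, `{$ELSE}` and `{$ENDIF}`; it does not evaluate `{$IF DEFINED(...)}` expressions or any other directive.

```python
import re

# Matches only the four simple conditional directives (case-insensitive).
DIRECTIVE = re.compile(r"\{\$(IFDEF|IFNDEF|ELSE|ENDIF)\b\s*([^}]*)\}", re.IGNORECASE)


def resolve_conditionals(source, defines):
    """Return `source` with IFDEF/IFNDEF blocks resolved for `defines`."""
    defined = {name.upper() for name in defines}
    out = []
    stack = []      # one (parent_active, condition) pair per open block
    active = True   # whether the text currently being scanned is kept
    pos = 0
    for match in DIRECTIVE.finditer(source):
        if active:
            out.append(source[pos:match.start()])
        pos = match.end()
        kind = match.group(1).upper()
        symbol = match.group(2).strip().upper()
        if kind in ("IFDEF", "IFNDEF"):
            condition = (symbol in defined) == (kind == "IFDEF")
            stack.append((active, condition))
            active = active and condition
        elif not stack:
            raise ValueError(f"unbalanced {{${kind}}} at offset {match.start()}")
        elif kind == "ELSE":
            parent_active, condition = stack[-1]
            stack[-1] = (parent_active, not condition)
            active = parent_active and not condition
        else:  # ENDIF
            active, _ = stack.pop()
    if active:
        out.append(source[pos:])
    return "".join(out)


snippet = (
    "type\n"
    "  {$IFDEF NOTDEFINED}\n"
    "    INotDefined\n"
    "  {$ELSE}\n"
    "    IInverse\n"
    "  {$ENDIF} = interface end;\n"
)
without_define = resolve_conditionals(snippet, defines=set())
with_define = resolve_conditionals(snippet, defines={"NOTDEFINED"})
```

The trade-off is that offsets in the resolved text no longer line up with the original file, and each run sees only one configuration of the code.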
