feat: Improve Matching Algorithm with Matching Representation #61
This PR addresses a known issue in our matching algorithm, which previously used naive early-return logic based solely on node kinds. Because two nodes of the same kind were always treated as candidates for matching, the algorithm produced incorrect matchings between nodes that should never be matched, such as two different method declarations.
The proposed solution introduces the concept of a "matching representation" for each node to improve the matching algorithm's accuracy. For example, for method declarations the representation uses the method signature to check that left and right correspond to the same method declaration; if they do not, the algorithm returns early with empty matchings. Since not every node has a more specific representation available, non-terminal nodes default to the node's kind, and terminal nodes use both the kind and the value.
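As a rough illustration of the idea above, here is a minimal sketch in Python. It is not the code from this PR: the `Node` type, its `signature` field, the function names `matching_representation` and `calculate_matchings`, and the positional pairing of children are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Node:
    """Hypothetical tree node; not the project's actual type."""
    kind: str
    value: Optional[str] = None          # set only for terminal nodes
    signature: Optional[str] = None      # assumed field, e.g. a method signature
    children: Tuple["Node", ...] = ()    # non-empty only for non-terminal nodes


def matching_representation(node: Node) -> tuple:
    """Return the value used to decide whether two nodes may be matched."""
    if node.signature is not None:
        # Nodes with a specific representation (e.g. method declarations).
        return (node.kind, node.signature)
    if node.value is not None:
        # Terminal nodes: kind and value.
        return (node.kind, node.value)
    # Non-terminal nodes without anything more specific: kind only.
    return (node.kind,)


def calculate_matchings(left: Node, right: Node) -> List[Tuple[Node, Node]]:
    """Early-return with empty matchings when the representations differ."""
    if matching_representation(left) != matching_representation(right):
        return []
    matchings = [(left, right)]
    # Assumption: children are paired by position, purely to keep the sketch short.
    for left_child, right_child in zip(left.children, right.children):
        matchings.extend(calculate_matchings(left_child, right_child))
    return matchings
```

With a kind-only check, two `method_declaration` nodes for different methods would pass the early return and be compared; under this sketch their signatures differ, so `calculate_matchings` returns an empty list for them.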
The pull request also adds a test scenario extracted from the GitHub-API project that failed before this change and will be used to check for future regressions. Another test was removed because this change made the scenario it tested unreachable.
It's worth mentioning that the matching representation acts as an identifier for the node, which we already handle in the unique label matching. However, we still need to improve the node identifier extraction, which will be addressed in a future PR.