
1 / Fix issues with pattern matching for Rubi #1176

Merged
merged 10 commits on Nov 23, 2024

Conversation

@aravindh-krishnamoorthy (Collaborator) commented Nov 19, 2024

Overview

This PR collects fixes for pattern matching, especially those that concern Rubi.

Related issues

Failing patterns

(Fixes for items struck out above are moved to the next PR.)

Final review

Tests

Documentation

N/A

@rocky (Member) commented Nov 19, 2024

For the kinds of problems this fixes, let's please add a test that gets run automatically, so that we can ensure there are no regressions in the code.

Since MatchQ[] is currently in mathics.core.testing_expressions, a test case could be added to test/builtin/test_testing_expressions.py. Or, if you want a more isolated kind of unit test, a new file test/core/test_patterns.py could be created. Thanks.

@aravindh-krishnamoorthy (Collaborator, Author):

> For the kinds of problems this fixes, let's please add a test that gets run automatically, so that we can ensure there are no regressions in the code.

Fully agree! I will definitely add them before I finalise this PR.

@aravindh-krishnamoorthy aravindh-krishnamoorthy changed the title Fix issues with pattern matching for expressions with two or more optionals Fix issues with pattern matching for Rubi Nov 21, 2024
@aravindh-krishnamoorthy (Collaborator, Author):

> Pattern x_Integer + y_Integer z_. for 1 + 2 x (Not directly relevant for Rubi)

Possibly due to this optimization?

# "leading_blanks" below handles expressions with leading Blanks H[x_, y_, ...]
# much more efficiently by not calling get_match_candidates_count() on elements
# that have already been matched with one of the leading Blanks. this approach
# is only valid for Expressions that are not Orderless (as with Orderless, the
# concept of leading items does not exist).
#
# simple performance test case:
#
# f[x_, {a__, b_}] = 0;
# f[x_, y_] := y + Total[x];
# First[Timing[f[Range[5000], 1]]]
#
# without "leading_blanks", Range[5000] will be tested against {a__, b_} in a
# call to get_match_candidates_count(), which is slow.
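
To make the quoted optimization concrete, here is a small self-contained sketch of the idea (a toy model with hypothetical names, not the mathics-core implementation): leading single blanks in a non-Orderless pattern each consume exactly one element, so only the tail needs the expensive general search.

```python
def is_single_blank(pattern):
    # Toy predicate: "_" stands for a pattern like x_ that matches
    # exactly one element.
    return pattern == "_"


def match_elements(patterns, elements):
    """Consume leading blanks positionally, then hand the tail to a
    (here trivial) general matcher.

    In mathics-core the tail would go through the slow
    get_match_candidates_count() search; the point of leading_blanks
    is that the already-consumed prefix never reaches it.
    """
    i = 0
    bindings = []
    while i < min(len(patterns), len(elements)) and is_single_blank(patterns[i]):
        bindings.append(elements[i])  # a leading blank always matches one element
        i += 1
    # Stand-in for the general matcher: exact positional comparison.
    tail_ok = len(patterns) - i == len(elements) - i and all(
        p == "_" or p == e for p, e in zip(patterns[i:], elements[i:])
    )
    return bindings, tail_ok


match_elements(["_", "_", "a"], [1, 2, "a"])  # -> ([1, 2], True)
```

As the quoted comment notes, this shortcut is only valid for heads without the Orderless attribute, since with Orderless there is no notion of "leading" elements.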

@rocky (Member) commented Nov 21, 2024

We need to get it right before we "optimize". So please fix. Thanks.

@aravindh-krishnamoorthy (Collaborator, Author):

> We need to get it right before we "optimize". So please fix. Thanks.

Sorry, @rocky, what do you mean? The mentioned optimization is not in the PR but in the mathics-core code.

@rocky (Member) commented Nov 21, 2024

>> We need to get it right before we "optimize". So please fix. Thanks.
>
> Sorry, @rocky, what do you mean? The mentioned optimization is not in the PR but in the mathics-core code.

I mean: please add to this PR a fix for the mathics-core code. Thanks.

@mmatera (Contributor) commented Nov 22, 2024

> Pattern x_Integer + y_Integer z_. for 1 + 2 x (Not directly relevant for Rubi)
>
> Possibly due to this optimization?
>
> [the "leading_blanks" comment quoted above]

If you want to deactivate the optimization, just set leading_blanks=False before the loop.

@mmatera (Contributor) commented Nov 22, 2024

Up to here, the changes look reasonable, but it would be nice to see some tests that pass after these changes.

@rocky (Member) commented Nov 22, 2024

> Up to here, the changes look reasonable, but it would be nice to see some tests that pass after these changes.

Let me go further and explain. From my side, it is better to have many smaller fixes and PRs for specific, independent bugs than to wait several weeks or months until everything is done and have one huge PR.

It is also useful to add tests as each bug fix is done, rather than viewing testing as some sort of separate and independent task.

@aravindh-krishnamoorthy (Collaborator, Author):

Indeed, it makes sense to have smaller PRs. I will finalise these with tests soon.

@aravindh-krishnamoorthy aravindh-krishnamoorthy changed the title Fix issues with pattern matching for Rubi 1 / Fix issues with pattern matching for Rubi Nov 22, 2024
@aravindh-krishnamoorthy (Collaborator, Author):

> Pattern x_Integer + y_Integer z_. for 1 + 2 x (Not directly relevant for Rubi)
> Possibly due to this optimization?
>
> [the "leading_blanks" comment quoted above]
>
> If you want to deactivate the optimization, just set leading_blanks=False before the loop.

Perhaps the removal of optionals in this PR can be reverted after we undo the leading_blanks optimization. I'll double-check that in the next PR.

    ("str_expr", "msgs", "str_expected", "fail_msg"),
    [
        # Two default arguments (linear)
        ("MatchQ[1, a_.+b_.*x_]", None, "True", None),
Contributor:

Something that would be interesting is to check MatchQ over expressions of the form
A[x_.,B[y_.*z_]]
and check the match with A[1,B[2,y]], A[1,B[y,2]], A[B[y,2],1] for different settings of the attributes of A and B.

Another group of tests should check if variables are properly assigned. To do that, we should test rules using these patterns. For example,

rule=A[x_.,B[y_.*z_]]->{x,y,z}

and then apply it to the previous cases.
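
The default-argument semantics these tests exercise can be sketched in plain Python (a toy model with a simplified tuple encoding, not mathics-core code; it ignores Orderless/Flat attributes). In Wolfram Language, Default[Plus] is 0 and Default[Times] is 1, which is why MatchQ[1, a_. + b_.*x_] is True with a -> 0, b -> 1, x -> 1:

```python
PLUS_DEFAULT = 0   # Default[Plus]
TIMES_DEFAULT = 1  # Default[Times]


def match_default_pattern(expr):
    """Match expr against the pattern a_. + b_.*x_.

    expr is an atom, ("Plus", u, v), or ("Times", u, v). An optional
    argument (x_.) takes its head's default value when that head is
    absent from the expression.
    """
    if isinstance(expr, tuple) and expr[0] == "Plus":
        a, rest = expr[1], expr[2]
    else:
        a, rest = PLUS_DEFAULT, expr       # a_. falls back to Default[Plus]
    if isinstance(rest, tuple) and rest[0] == "Times":
        b, x = rest[1], rest[2]
    else:
        b, x = TIMES_DEFAULT, rest         # b_. falls back to Default[Times]
    return {"a": a, "b": b, "x": x}


match_default_pattern(1)
# -> {"a": 0, "b": 1, "x": 1}, mirroring MatchQ[1, a_. + b_.*x_] == True
```

The rule-based tests suggested above check exactly these bindings: applying rule = A[x_., B[y_.*z_]] -> {x, y, z} verifies not just that the match succeeds but that each variable receives the right value or default.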

Collaborator (Author):

Agree! This makes sense.

Collaborator (Author):

I will add them as soon as I have all your and rocky's comments.

@rocky (Member) commented Nov 22, 2024

I'd like to see some unit tests for mathics.core.pattern as its own unit, i.e. not via MatchQ. Other than that, I don't have any further comments.

When everything is in place, I'd like to look it over.

Collaborator (Author):

Done!

Collaborator (Author):

> I'd like to see some unit tests for mathics.core.pattern as its own unit, i.e. not via MatchQ. Other than that, I don't have any further comments.
>
> When everything is in place, I'd like to look it over.

I guess it's always gonna be via MatchQ or rules, as mmatera suggested. Alternate ideas are welcome!

Collaborator (Author):

Also, please feel free to add commits to this PR. I think I've enabled the option.

Contributor:

>> I'd like to see some unit tests for mathics.core.pattern as its own unit, i.e. not via MatchQ. Other than that, I don't have any further comments.
>> When everything is in place, I'd like to look it over.
>
> I guess it's always gonna be via MatchQ or rules, as mmatera suggested. Alternate ideas are welcome!

At some point, we should try to add tests at a lower level (using mathics.eval.patterns.match or directly testing the subclasses of mathics.core.pattern.Pattern) that avoid passing through the evaluation loop. But for now, I think this is a good starting point.

@aravindh-krishnamoorthy aravindh-krishnamoorthy marked this pull request as ready for review November 22, 2024 11:57
@mmatera (Contributor) left a comment:

LGTM

@@ -1,6 +1,6 @@
 # -*- coding: utf-8 -*-
 """
-Unit tests for mathics pattern matching
+Unit tests for mathics.core.pattern
@rocky (Member) commented Nov 22, 2024

Here is the part that feels a little weird to me. We are testing a lower-level function using higher-level tests. I am a little bit sensitive to this kind of thing, because this less-good practice is pervasive in this code.

At one time, we didn't have pytests. There were doctests that took the place of pytests. It took us a while to get unburied from that practice.

The MatchQ tests, to me, should be put in test.builtin.testing_expressions.expression_tests, and the rule tests put wherever rules are located.

Tests here could (and should) build an ExpressionPattern and run match on it.

Collaborator (Author):

I see. Indeed, I can do that if you and mmatera prefer it... But that would be more of a Python-level test than one in Wolfram syntax, unlike the current tests...

Contributor:

> Here is the part that feels a little weird to me. We are testing a lower-level function using higher-level tests. I am a little bit sensitive to this kind of thing, because this less-good practice is pervasive in this code.
>
> At one time, we didn't have pytests. There were doctests that took the place of pytests. It took us a while to get unburied from that practice.
>
> The MatchQ tests, to me, should be put in test.builtin.testing_expressions.expression_tests, and the rule tests put wherever rules are located.
>
> Tests here could (and should) build an ExpressionPattern and run match on it.

Yes, but in any case, at some point we can rewrite these tests as lower-level tests. I do not think that these tests actually test the behavior of MatchQ or ReplaceAll, but the low-level code. I can do the translation later, when I find some time for it.

@rocky (Member) commented Nov 22, 2024

Tests in test.builtin are testing Mathics3 Builtin functions. Of course, any time one tests a higher-level function one is necessarily testing all sorts of lower-level functions. So all of these tests are valid; they are just organized, in my opinion, in the wrong place.

Tests of the ExpressionPattern match function are good because they isolate that function from all of the other kinds of stuff, and we can pinpoint the problem better when there is one. A MatchQ test could fail because of something totally unrelated to the ExpressionPattern match function.

Another thing that these smaller unit tests do (which shouldn't be needed but in fact is used way too much) is to show developers how one can work with, test, and debug the smaller units. Ideally, if our APIs and modularity are done well, the code needed to do this is not that much. Personally, I have found ways to improve modularity when I have been forced to work with these smaller parts in isolation.

@rocky (Member) commented Nov 22, 2024

> Yes, but in any case, at some point we can rewrite these tests as lower-level tests. I do not think that these tests actually test the behavior of MatchQ or ReplaceAll, but the low-level code. I can do the translation later, when I find some time for it.

My experience is that rewrites tend to happen at a slower rate than the creation of more code that is somewhat ill-conceived. In fact, I am in the middle of a larger rewrite, for operators, right now. It is not fun. This aspect is what makes me the least happy about working on this project as compared to others.

If you want to do the minimal thing, then move the MatchQ tests to where MatchQ should be tested and the ReplaceAll tests to where ReplaceAll is tested. At least this honestly reflects what the tests are doing. If later you want to rewrite, you can do so then.

Contributor:

When we test the evaluation of an expression like MatchQ[1+2 a, x_.+y_.*z_Symbol], many processes are involved: the main evaluation loop, the definitions of Plus and Times and their internal implementations, the pattern matching done in the evaluation loop, and, eventually, the implementation of MatchQ and the pattern-matching mechanisms.
If we want to know if the pattern matching works as it should, it would be better to use mathics.eval.patterns.match, which does not try to evaluate its arguments, but just implements the pattern matching.
I think that this is the right level of testing for these tests.

Lower levels involve testing whether the functions and methods used to evaluate matching do what they are supposed to. But at this point, there are functions for which I am still unable to say exactly what the expected behavior is, even after spending quite a long time trying to understand and document them.

Contributor:

And yes, in that case, the tests should be under test/eval/.

@rocky (Member) commented Nov 22, 2024

> When we test the evaluation of an expression like MatchQ[1+2 a, x_.+y_.*z_Symbol], many processes are involved: the main evaluation loop, the definitions of Plus and Times and their internal implementations, the pattern matching done in the evaluation loop, and, eventually, the implementation of MatchQ and the pattern-matching mechanisms. If we want to know if the pattern matching works as it should, it would be better to use mathics.eval.patterns.match, which does not try to evaluate its arguments, but just implements the pattern matching. I think that this is the right level of testing for these tests.

Ok. If Matcher.match() in mathics.eval.patterns is a good place to test against, then let's do that; it would be done in a test.eval.pattern module.

I am just saying that if we're going to test temporarily using MatchQ, then test.core.patterns is not the right test module for this.

As written, the test is extending the MatchQ tests, even if the motivation for doing this is to verify what's going on at a lower level. I leave it to @aravindh-krishnamoorthy whether to rewrite using Matcher.match() inside test.eval.pattern or to move the existing tests to test.builtin.testing_expressions.expression_tests and to wherever Rules are tested.

> Lower levels involve testing whether the functions and methods used to evaluate matching do what they are supposed to. But at this point, there are functions for which I am still unable to say exactly what the expected behavior is, even after spending quite a long time trying to understand and document them.

Then I think it was a mistake to "optimize" what is neither well understood nor, as we have seen, well covered by tests.

Collaborator (Author):

I think I'll always have something more interesting to work on (in this case Rubi). So, I want to do it right, right now. I'll first try to rewrite the current tests in this PR using Matcher.match() and move them to test/eval. Only if that turns out to be too cumbersome will I fall back to the current ones and move them to expression_tests.

@rocky (Member) commented Nov 22, 2024

Thanks for your understanding and patience. Sure, if it turns out to be too cumbersome, do as you say.

In fact, the messes in this code have been so great that often I find I can't fix a single mess in one go, but have to break things up into little pieces. As you are learning, pattern matching is one of those big messes.

Down the line, we will use trie-based matching akin to the Go-based expreduce (and may try to copy its algorithm). In preparation for that, though, having more and better pattern-matching tests in advance will help greatly.
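
For reference, the trie idea mentioned here can be sketched in a few lines (a toy illustration of dispatching rules on the sequence of argument heads; this is an assumption about the general technique, not expreduce's actual data structure, and the "_" wildcard and "rhs" key are hypothetical names):

```python
def add_rule(trie, head_seq, rhs):
    """Insert a rule under its sequence of argument heads; "_" is a wildcard."""
    node = trie
    for head in head_seq:
        node = node.setdefault(head, {})
    node["rhs"] = rhs  # toy convention: the rule payload sits under "rhs"


def lookup(trie, heads):
    """Walk the trie over the argument heads, preferring exact head
    matches and falling back to the "_" wildcard branch, instead of
    trying every rule in turn."""
    node = trie
    for head in heads:
        if head in node:
            node = node[head]
        elif "_" in node:
            node = node["_"]
        else:
            return None
    return node.get("rhs")


trie = {}
add_rule(trie, ["Integer", "Integer"], "int-int rule")
add_rule(trie, ["Integer", "_"], "int-any rule")
lookup(trie, ["Integer", "Integer"])  # -> "int-int rule"
lookup(trie, ["Integer", "Symbol"])   # -> "int-any rule"
```

The payoff is that rule selection costs roughly the length of the argument list rather than the number of downvalues, which matters for rule sets the size of Rubi's.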

@rocky (Member) commented Nov 23, 2024

Great - thanks!

@rocky rocky merged commit d1cf6a4 into Mathics3:master Nov 23, 2024
13 checks passed