Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pinpoint where mismatches occurred in in_any_order #51

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

s3cur3
Copy link
Contributor

@s3cur3 s3cur3 commented Jan 20, 2025

This improves the usability of the in_any_order matcher when using it on non-primitive values or large lists.

The problem my team encountered was that we would get a mismatch in, say, one field of a very large struct, within a list of a half dozen of those large structs. The previous error message here amounted to "here's the data you gave us, and it didn't match." What we wanted was "here are the fields of the mismatched struct that failed to match."

Example 1: Primitives

For small lists of primitive, the improvement is minimal.

Before when given the list [1, 2] and trying to match against [1, 3]:

Before After
image image

Example 2: DateTimes

The changes here are a bigger deal when working with DateTimes. Consider this case for the code

    assert [~U[2024-01-01 01:02:03Z], ~U[2024-01-01 04:05:07Z]]
           ~> in_any_order([
             struct_like(DateTime, %{
               hour: 1,
               minute: 2,
               second: 3
             }),
             struct_like(DateTime, %{
               hour: 4,
               minute: 5,
               second: 6
             })
           ])
Before After
image image

Before, it's quite hard to tell which field didn't match, but after, it's easier to pinpoint where we received a 7 but expected a 6.

Example 3: A really big struct

Before After
image image

This improves the usability of the `in_any_order` matcher when using it on non-primitive values or large lists.

The problem my team encountered was that we would get a mismatch in, say, one field of a very large struct, within a list of a half dozen of those large structs. The previous error message here amounted to "here's the data you gave us, and it didn't match." What we wanted was "here are the fields of the mismatched struct that failed to match."

For small lists of primitive, the improvement is minimal.

Before when given the list `[1, 2]` and trying to match against `[1, 3]`:

```
     Assertion with ~> failed

     Mismatches:

       1) [1, 2] does not match any ordering of the specified matchers
```

After:

```
     Assertion with ~> failed

     Mismatches:

       1) [1, 2] does not match any ordering of the specified matchers:

         .1
           1) 2 is not equal to 3
```

It's a bigger deal when working with DateTimes. Consider this case for the code

```
    assert [~U[2024-01-01 01:02:03Z], ~U[2024-01-01 04:05:07Z]]
           ~> in_any_order([
             struct_like(DateTime, %{
               hour: 1,
               minute: 2,
               second: 3
             }),
             struct_like(DateTime, %{
               hour: 4,
               minute: 5,
               second: 6
             })
           ])
```

Before:

```
     Assertion with ~> failed

     Mismatches:

       1) [~U[2024-01-01 01:02:03Z], ~U[2024-01-01 04:05:07Z]] does not match any ordering of the specified matchers
```

After:

```
     Assertion with ~> failed

     Mismatches:

       1) [~U[2024-01-01 01:02:03Z], ~U[2024-01-01 04:05:07Z]] does not match any ordering of the specified matchers:

         .1
           1) .second: 7 is not equal to 6
```

Before, it's quite hard to tell which field didn't match, but after, it's easier to pinpoint where we received a 7 but expected a 6.
@s3cur3 s3cur3 force-pushed the improve-in_any_order branch from 3d4710a to e57b377 Compare January 20, 2025 13:58
@s3cur3 s3cur3 marked this pull request as ready for review January 20, 2025 14:00
@mtrudel
Copy link
Owner

mtrudel commented Jan 20, 2025

I like the idea here! The output is a bit confusing though, I think. Did you explore the idea of instead listing the matchers which were not matched and/or the entries which did not match? Like, in the DateTime example, an output something like:

* The following items were not matched by any matcher:
  * ~U[2024-01-01 04:05:07Z]

* The following matchers did not match any items:
  * struct_like(DateTime, %{
        hour: 4,
        minute: 5,
        second: 6
     })

My (admittedly not very considered) thought here is that this would allow the user to directly narrow down the matcher/item that's missing without having to match the message to each matcher / item manually. It's a bit tricky with this particular matcher precisely because the in_any_order aspect means that we should be focusing the messages on the content and not the index of it (the way you have your error messages formatted suggests to the reader that there is in fact a positional
aspect to the error).

On the other hand, this is a purely academic thought process on my part, whereas yours is derived from experience.

WDYT?

@s3cur3
Copy link
Contributor Author

s3cur3 commented Jan 27, 2025

Printing the matcher that failed would still be an improvement, but really only if the mismatch occurred at the second level (top level, of course, being the list itself). If you had a list of users, each of whom had posts, and the mismatch occurred in the seconds of the publication date of one of the posts, listing the matcher that failed would only tell you there's a problem [gestures] somewhere in here.

@mtrudel
Copy link
Owner

mtrudel commented Jan 27, 2025

Ah, I see what you mean.

The way I understand it, this 'locks in' an arbitrary ordering of the matchers when applying them against the failing entries to build the output, which is just one of the possible failed orderings; quite possibly not the 'best' one to help illustrate the actual underlying problem. The number of possible sortings of the matchers is combinatorial and there's no telling which of those would best 'click' for the user. The solution here is kinda painting them into a single sort.

I'm also not super convinced that the approach I outlined would fall short in the way you describe (not a diss, i'm just genuinely not sure). Let me bake up a version of that to see so we can compare.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants