Fix behaviour of failing conformance tests with --accept. #6714

Open
kwxm opened this issue Nov 27, 2024 · 0 comments
kwxm commented Nov 27, 2024

The Agda evaluator isn't always completely up to date with the Haskell one, and because of this we have a feature that allows us to mark some conformance tests as being expected to fail. This uses expectFail from Test.Tasty.ExpectedFailure. For example, some of the bitwise builtins use a special size measure for budgeting that the metatheory doesn't know about at the time of writing, and when you run cabal test agda-conformance you get things like this:

Test suite agda-conformance: RUNNING...
UPLC evaluation tests
  evaluation
    builtin
      semantics
        writeBits
          case-32
            case-32 (evaluation): OK (0.07s)
            case-32 (budget):     FAIL (expected) (0.09s)
            [etc. etc. etc.] 

(the problem being that there are golden files containing the expected budget, and the Agda evaluator currently produces results that disagree with these). When you run the haskell-conformance tests, these all pass.

However (as discovered by @Unisay), if you run cabal test agda-conformance --test-option --accept then the incorrect budget results are accepted and the golden files are updated accordingly. This is arguably a bug in Test.Tasty.
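
To make the situation concrete, here is a minimal sketch of a golden test wrapped in expectFail. It is not the actual Plutus conformance code: the file name case-32.budget.expected, the agdaBudget action and its contents are all made up for illustration, and it assumes the golden file already exists with different contents. It needs the tasty, tasty-golden, tasty-expected-failure and bytestring packages.

```haskell
-- Sketch only: names, paths and budget text are hypothetical.
import qualified Data.ByteString.Lazy.Char8 as BL
import Test.Tasty
import Test.Tasty.ExpectedFailure (expectFail)
import Test.Tasty.Golden (goldenVsString)

-- Stand-in for the budget reported by the Agda evaluator, assumed to
-- disagree with the golden file written by the Haskell evaluator.
agdaBudget :: IO BL.ByteString
agdaBudget = pure (BL.pack "placeholder budget from the Agda evaluator\n")

-- Golden test comparing the reported budget with the stored one.
budgetTest :: TestTree
budgetTest =
  goldenVsString "case-32 (budget)" "case-32.budget.expected" agdaBudget

-- The mismatch is marked as a known failure.
tests :: TestTree
tests = testGroup "writeBits" [expectFail budgetTest]
```

Adding main = defaultMain tests gives a runnable test suite. Run normally, the budget test should be reported as an expected failure; running it with --accept is the case described above, where the golden file gets rewritten with the wrong contents.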

Now Test.Tasty.ExpectedFailure also provides ignoreTest, and when you use that instead of expectFail the output changes to

       ...
       writeBits
          case-32
            case-32 (evaluation): OK (0.06s)
            case-32 (budget):     IGNORED

Furthermore, if you use --accept then the golden files are not updated, which is what we want. The disadvantage of doing this is that with expectFail, when we fix the Agda evaluator to behave identically to the Haskell one, the tests which are expected to fail start to pass, which causes an error and reminds us to remove the formerly problematic tests from the list of tests which are expected to fail. With ignoreTest they just continue to be ignored, so we might forget to update the list of problematic tests.

We should work out whether the behaviour with expectFail and --accept really is a bug and report it if so. In the meantime we should maybe use ignoreTest instead of expectFail.
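
One possible way to keep the switch cheap is to route every known failure through a single helper, so that moving between the two behaviours is a one-line change. This is only a sketch: KnownFailurePolicy, markKnownFailure and markAll are hypothetical names, not anything that exists in the conformance test code. The only library calls used are expectFailBecause and ignoreTestBecause from Test.Tasty.ExpectedFailure.

```haskell
-- Sketch only: the policy type and helpers are hypothetical.
import Test.Tasty
import Test.Tasty.ExpectedFailure (expectFailBecause, ignoreTestBecause)

data KnownFailurePolicy
  = ExpectFailure -- complains once the test starts passing
  | Ignore        -- per the behaviour above, --accept leaves the golden file alone
  deriving (Eq, Show)

-- Wrap one test according to the chosen policy, recording the reason.
markKnownFailure :: KnownFailurePolicy -> String -> TestTree -> TestTree
markKnownFailure ExpectFailure reason = expectFailBecause reason
markKnownFailure Ignore reason = ignoreTestBecause reason

-- Apply the same policy to the whole list of problematic tests.
markAll :: KnownFailurePolicy -> [TestTree] -> TestTree
markAll policy =
  testGroup "known failures"
    . map (markKnownFailure policy "Agda budget differs from golden file")
```

With something like this we could use Ignore for now and flip back to ExpectFailure in one place if the --accept behaviour turns out to be a bug and gets fixed. It does not solve the problem of forgetting to prune the list while tests are ignored.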
