Treat exceptions thrown by arguments to check as test failures #107

vkz · 2019-04-07T19:46:26Z

Problem

Rackunit family of check macros, as they are implemented atm, exhibit an
unfortunate behavior where they lose all context and report no location of failed
checks when the code being tested raises exceptions. Picture a typical rkt source
file with test-modules freely interspersed with code. In a check like this:

(check eq? (f) 42)

if f ever throws, the exception will be propagated to the programmer without much
detail and worse without any indication as to which check (perhaps of hundreds)
raised an error. Location is neither captured nor reported. Triaging such failures
quickly turns into a bisecting expedition that may be a lot of work when your have
many tests in the same file.

Here's the mechanics as I understand it. Rackunit checks are defined with
define-check macro, which takes a bunch of arguments and a check body that
constitutes the computation that decides whether the check succeeds or fails. The
macro constructs a function that performs said computation with the body wrapped
in additional context e.g. location and name of the check, so that installed
handlers can communicate the failure to the programmer. That usually works well,
however, since checks are really glorified functions they eval their arguments
eagerly. As it happens in your typical check most of the user code runs before it
ever reaches the body. During the evaluation of the arguments to the check
function no context is collected, nor does the branch that handles such errors
attempt to report anything beyond the error message (see
display-test-failure/error in format.rkt).

This issue has been discussed at some length on the mailing list:
https://groups.google.com/d/msg/racket-users/aCQwqCTY42U/NRZ_AU_YBwAJ

Proposed solution

Delay argument evaluation in checks by wrapping each in a thunk, then force the
thunks in the body of a check so that any expression passed to the check runs
within a context that can be reported to the programmer.

I believe this won't break your typical use case of rackunit and no user code
would have to change. This may not be true of any library that builds on top of
rackunit.

Problem ------- Rackunit family of /check/ macros, as they are implemented atm, excibit an unfortunate behavior where they lose all context and report no location of failed checks when the code being tested raises exceptions. Picture a typical rkt source file with test-modules freely interspersed with code. In a check like this: (check eq? (f) 42) if f ever throws the exception will be propagated to the programmer without much detail and worse without any indication as to which check (perhaps of hundreds) raised an error. Location is neither captured nor reported. Triaging such failures quickly turns into a bisecting expedition that may be a lot of work when your have many tests in the same file. Here's the mechanics as I understand it. Rackunit checks are defined with `define-check` macro, which takes a bunch of arguments and a check body that constitutes the computation that decides whether the check succeeds or fails. The macro constructs a function that performs said computation with the body wrapped in additional context e.g. location and name of the check, so that installed handlers can communicate the failure to the programmer. That usually works well, however, since checks are really glorified functions they eval their arguments eagerly. As it happens in your typical check most of the user code runs before it ever reaches the body. During the evaluation of the arguments to the check function no context is collected, nor does the branch that handles such errors attempt to report anything beyond the error message (see `display-test-failure/error` in format.rkt). This issue has been discussed at some length on the mailing list: https://groups.google.com/d/msg/racket-users/aCQwqCTY42U/NRZ_AU_YBwAJ Proposed solution ----------------- Delay argument evaluation in checks by wrapping each in a thunk, then force the thunks in the body of a check so that any expression passed to the check runs within a context that can be reported to the programmer.

vkz · 2019-04-07T19:48:57Z

Example code:

(require rackunit)

(define (f)
  (raise-result-error 'f 42 'none-given))

(check eq? (f) 42)
(check eq? 41 (+ 41 1))

Error report:

--------------------
; ERROR
name:       check
location:   play.rkt:12:0
params:     '(check eq? (f) 42)

raise-result-error: contract violation
  expected: string?
  given: 42
  argument position: 2nd
  other arguments...:
   'f
   'none-given
--------------------
--------------------
; FAILURE
; /Users/russki/Code/fcgi-rkt/play.rkt:14:0
name:       check
location:   play.rkt:14:0
params:     '(check eq? 41 (+ 41 1))
--------------------

note the location for the ERROR case.

bennn · 2019-04-08T04:22:11Z

This changes the "checks are functions" idea that the docs suggest.

Currently, the only difference between:

(check-equal? A B)

and

(let ((f check-equal?))
  (f A B))

is that the second one gives a different source location. With this PR, the second one is also going to lose the context.

I'd rather we update the docs to talk about this gotcha with exceptions, and encourage people to write more & smaller tests.

p.s. #97

vkz · 2019-04-08T08:57:21Z

Hey Ben.

I understand your concern, but tbh not seeing what you're seeing. Have you tried running your example?

;; check in identifier position
(let ((f check-equal?))
  (f 41 42)
  (f 1 2))

=>

--------------------
; FAILURE
; /Users/russki/Code/fcgi-rkt/play.rkt:18:9
name:       check-equal?
location:   play.rkt:18:9
actual:     41
expected:   42
--------------------
--------------------
; FAILURE
; /Users/russki/Code/fcgi-rkt/play.rkt:18:9
name:       check-equal?
location:   play.rkt:18:9
actual:     1
expected:   2
--------------------

with or without PR. Location there is pointing at the (f check-equal?) bit. I get the idea that you'd rather have it point at the use site, which is in the body of let, but it didn't even before this PR. In fact, unless I misunderstand the changes I made, the case with checks in macro id position (that's what allows you to "pass them as function values" should behave exactly as they used, that's because (a) I never touched that part of code and (b) macro id will have been expanded by the time the result is used, so the args remain unwrapped and the whole thing behaves exactly as it used to.

I can think of only one corner case where check macro as id may "misbehave" due to this PR. If you define a custom check that expects thunks and then does something with those thunks inside the body. IIUC, the following line from the PR would inadvertently force them - probably not something the user intended:

(set! formal (if (thunk? formal) (formal) formal))

I don't know how likely this corner case, probably not unlikely. Don't know. We can probably work around that by wrapping in a callable struct whose type we can identify, then check for that struct type rather than thunk. Easy fix, IMO.

I'm not insisting on the proposed solution, but OMG the current behavior is infuriating, to the point that it discourages me to write or rely on tests. This brought to you by me trying to figure out which test in the source file triggers an exception by essentially commenting things out, moving code and tests and code around every single time that happens. Racket isn't like Clojure, Elisp or CL in that I can't just willy-nilly eval tests one by one until I discover which'd failed. Even if I could why do something machine can do better?

Did I misunderstand your example above?

I'd rather we update the docs to talk about this gotcha with exceptions, and encourage people to write more & smaller tests.

I don't think you can go more fine grained than (check eq? (f) value) and it would still exhibit the same problem. How would smaller tests help?

rfindler · 2019-04-08T11:32:40Z

I agree about the backward compatibility concerns here. But it would be nice to have some that do capture exceptions. Can we find new names? Robby

…

On Sun, Apr 7, 2019 at 11:22 PM Ben Greenman ***@***.***> wrote: This changes the "checks are functions" idea that the docs suggest. Currently, the only difference between: (check-equal? A B) and (let ((f check-equal?)) (f A B)) is that the second one gives a different source location. With this PR, the second one is also going to lose the context. I'd rather we update the docs to talk about this gotcha with exceptions, and encourage people to write more & smaller tests. — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#107 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAYWsDFS3Ewe5J1zeFvbgyiI98CLUlakks5vesP0gaJpZM4cg9cd> .

mfelleisen · 2019-04-08T12:17:08Z

Our experience with checking for beginners (test-engine/) shows that we want to deal with exceptions properly here. The idea of check-*s as functions might have looked appealing at some point, but I fail to see any in it.

I think we should create rackunit2, have it import all of rackunit, and re-export macro-based checkers with the exact same names. Then we should deprecate rackunit eventually and point people to rackunit2.

rmculpepper · 2019-04-08T12:20:21Z

In rackunit, the check forms are meant to act like functions. The test forms are the things that wrap evaluation, catch errors and continue, etc. If you write

(test-equal? "anon" 1 (/ 0))

it catches the error, prints a message, and continues. So I recommend leaving checks alone and using the test forms (like test-equal?) instead.

One obstacle to using the test forms is the extra verbosity of giving every test a name. Here's a proposal: let's allow the identifier _ in the test name position, in which case the test macro synthesizes a name like "anonymous test at file:line".

vkz · 2019-04-08T13:02:40Z

it catches the error, prints a message, and continues. So I recommend leaving checks alone and using the test forms (like test-equal?) instead.

ATM the only indication you get from the (test-equal? "anon" 1 (/ 0)) example you suggested is the anon name - no location given. Definitely an improvement over check. With anonymous tests you'd absolutely want to synthesize that location else you're back to the problem we have with checks. But then why not do it always, anonymous or not? Of course "tests are functions" make it problematic.

TBH I get why people would hesitate about what I suggested. It solves a real problem, but then it is a hack and may break people upstream. Dunno, you guys make the call. I'll make do with my patch for now lest I go crazy.

bennn · 2019-04-08T13:51:14Z

Hey Ben.

I understand your concern, but tbh not seeing what you're seeing. Have you tried running your example?

f is not going to catch exceptions as well as check-equal?

AlexKnauth · 2019-04-08T14:02:07Z

Maybe there's a less extreme version of this pull request that still achieves your goals of catching errors thrown by arguments, but doesn't need to change make-check-func and change the behavior of checks applied to thunks.

Within the output of the define-check macro, the check-impl and make-check-func should stay the same, but the define-syntax name can still change to put the argument evaluation into a context that creates a test failure, before calling the check-impl function.

In other words it shouldn't be check-impl / make-check-func's job to catch argument-evaluation failures since those should be functions that deal with already-evaluated values. But it can be the define-syntax name's job.

jackfirth · 2019-04-09T21:48:21Z

I second @mfelleisen's proposal to abandon the notion that check forms are just functions. It places far too many restrictions on the check API, and these restrictions make it difficult to provide important quality-of-life features like this pull request and #73. I think rackunit2 is a good idea.

AlexKnauth · 2020-05-22T18:55:46Z

The issue this PR was written to solve has been fixed by #109, so I am closing this PR

AlexKnauth mentioned this pull request Apr 8, 2019

Treat exceptions thrown by arguments to check as test failures, without changing function behavior #109

Merged

AlexKnauth closed this May 22, 2020

samth mentioned this pull request Jun 25, 2021

Test failures related to recent changes #144

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Treat exceptions thrown by arguments to check as test failures #107

Treat exceptions thrown by arguments to check as test failures #107

vkz commented Apr 7, 2019

vkz commented Apr 7, 2019

bennn commented Apr 8, 2019 •

edited

Loading

vkz commented Apr 8, 2019 •

edited

Loading

rfindler commented Apr 8, 2019 via email

mfelleisen commented Apr 8, 2019

rmculpepper commented Apr 8, 2019

vkz commented Apr 8, 2019

bennn commented Apr 8, 2019

AlexKnauth commented Apr 8, 2019

jackfirth commented Apr 9, 2019

AlexKnauth commented May 22, 2020

Treat exceptions thrown by arguments to check as test failures #107

Treat exceptions thrown by arguments to check as test failures #107

Conversation

vkz commented Apr 7, 2019

Problem

Proposed solution

vkz commented Apr 7, 2019

bennn commented Apr 8, 2019 • edited Loading

vkz commented Apr 8, 2019 • edited Loading

rfindler commented Apr 8, 2019 via email

mfelleisen commented Apr 8, 2019

rmculpepper commented Apr 8, 2019

vkz commented Apr 8, 2019

bennn commented Apr 8, 2019

AlexKnauth commented Apr 8, 2019

jackfirth commented Apr 9, 2019

AlexKnauth commented May 22, 2020

bennn commented Apr 8, 2019 •

edited

Loading

vkz commented Apr 8, 2019 •

edited

Loading