Treat exceptions thrown by arguments to check as test failures #107
Example code:
(require rackunit)
(define (f)
  (raise-result-error 'f "42" 'none-given))
(check eq? (f) 42)
(check eq? 41 (+ 41 1))
Error report: note the location for the ERROR case.
This changes the "checks are functions" idea that the docs suggest. Currently, the only difference between:
(check-equal? A B)
and
(let ((f check-equal?))
  (f A B))
is that the second one gives a different source location. With this PR, the second one is also going to lose the context.
I'd rather we update the docs to talk about this gotcha with exceptions, and encourage people to write more & smaller tests.
p.s. #97
Hey Ben. I understand your concern, but tbh not seeing what you're seeing. Have you tried running your example?
;; check in identifier position
(let ((f check-equal?))
  (f 41 42)
  (f 1 2))
=>
--------------------
; FAILURE
; /Users/russki/Code/fcgi-rkt/play.rkt:18:9
name: check-equal?
location: play.rkt:18:9
actual: 41
expected: 42
--------------------
--------------------
; FAILURE
; /Users/russki/Code/fcgi-rkt/play.rkt:18:9
name: check-equal?
location: play.rkt:18:9
actual: 1
expected: 2
--------------------
with or without the PR. Location there is pointing at the
I can think of only one corner case where a check macro used as an identifier may "misbehave" due to this PR: if you define a custom check that expects thunks and then does something with those thunks inside the body. IIUC, the following line from the PR would inadvertently force them, which is probably not something the user intended:
(set! formal (if (thunk? formal) (formal) formal))
I don't know how likely this corner case is; probably not unlikely. We can probably work around that by wrapping in a callable struct whose type we can identify, then checking for that struct type rather than for a thunk. Easy fix, IMO.
I'm not insisting on the proposed solution, but OMG the current behavior is infuriating, to the point that it discourages me from writing or relying on tests. This is brought to you by me trying to figure out which test in the source file triggers an exception by essentially commenting things out and moving code and tests around every single time that happens. Racket isn't like Clojure, Elisp or CL in that I can't just willy-nilly eval tests one by one until I discover which one failed. Even if I could, why do something a machine can do better?
Did I misunderstand your example above?
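The callable-struct workaround mentioned above could look roughly like this. It is only a sketch: `delayed-arg` and `force-arg` are hypothetical names, not part of rackunit or of the PR.

```racket
#lang racket/base
;; Hypothetical names throughout: `delayed-arg` and `force-arg`
;; are not part of rackunit.

;; A callable struct: instances can be applied like a thunk, but
;; `delayed-arg?` tells them apart from thunks supplied by the user.
(struct delayed-arg (thunk)
  #:property prop:procedure
  (lambda (self) ((delayed-arg-thunk self))))

;; Force only the values that the check itself wrapped.
(define (force-arg v)
  (if (delayed-arg? v) (v) v))

(define wrapped (delayed-arg (lambda () (+ 41 1))))
(define user-thunk (lambda () 'user-value))

(force-arg wrapped)                      ; evaluates the wrapped expression
(eq? (force-arg user-thunk) user-thunk)  ; the user's thunk is passed through unforced
```

With a marker type like this, a custom check that deliberately takes thunks would still receive them untouched.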
I don't think you can go more fine grained than
I agree about the backward compatibility concerns here. But it would be
nice to have some that do capture exceptions. Can we find new names?
Robby
On Sun, Apr 7, 2019 at 11:22 PM Ben Greenman wrote:
This changes the "checks are functions" idea that the docs suggest.
Currently, the only difference between:
(check-equal? A B)
and
(let ((f check-equal?))
(f A B))
is that the second one gives a different source location. With this PR,
the second one is also going to lose the context.
I'd rather we update the docs to talk about this gotcha with exceptions,
and encourage people to write more & smaller tests.
Our experience with checking for beginners (test-engine/) shows that we want to deal with exceptions properly here. The idea of check-*s as functions might have looked appealing at some point, but I fail to see any appeal in it. I think we should create rackunit2, have it import all of rackunit, and re-export macro-based checkers with the exact same names. Then we should deprecate rackunit eventually and point people to rackunit2.
In rackunit, the check forms are meant to act like functions. The test forms are the things that wrap evaluation, catch errors and continue, etc. If you write
it catches the error, prints a message, and continues. So I recommend leaving checks alone and using the test forms (like
One obstacle to using the test forms is the extra verbosity of giving every test a name. Here's a proposal: let's allow the identifier
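For illustration, wrapping a check in a test form might look like the sketch below. `test-case` and `check-eq?` are rackunit's; the test name, `f`, and the `ran-past-test?` flag are made up for this example.

```racket
#lang racket/base
(require rackunit)

;; Hypothetical function under test that always raises.
(define (f) (error 'f "boom"))

;; Hypothetical flag, used only to show that evaluation continues
;; after the failing test case.
(define ran-past-test? #f)

;; The error raised by (f) is caught by the test form and reported
;; under the test case's name instead of escaping to the top level.
(test-case "f returns 42"
  (check-eq? (f) 42))

(set! ran-past-test? #t)
```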
ATM the only indication you get from the
TBH I get why people would hesitate about what I suggested. It solves a real problem, but then it is a hack and may break people upstream. Dunno, you guys make the call. I'll make do with my patch for now lest I go crazy.
Maybe there's a less extreme version of this pull request that still achieves your goals of catching errors thrown by arguments, but doesn't need to change
Within the output of the
In other words it shouldn't be
I second @mfelleisen's proposal to abandon the notion that
The issue this PR was written to solve has been fixed by #109, so I am closing this PR.
Problem
The Rackunit family of `check` macros, as currently implemented, exhibits an
unfortunate behavior: they lose all context and report no location for failed
checks when the code being tested raises exceptions. Picture a typical rkt source
file with test modules freely interspersed with code. In a check like this:
(check eq? (f) 42)
if f ever throws, the exception will be propagated to the programmer without much
detail and, worse, without any indication as to which check (perhaps of hundreds)
raised the error. Location is neither captured nor reported. Triaging such failures
quickly turns into a bisecting expedition that may be a lot of work when you have
many tests in the same file.
Here are the mechanics as I understand them. Rackunit checks are defined with the
`define-check` macro, which takes a bunch of arguments and a check body that
constitutes the computation that decides whether the check succeeds or fails. The
macro constructs a function that performs said computation with the body wrapped
in additional context, e.g. the location and name of the check, so that installed
handlers can communicate the failure to the programmer. That usually works well;
however, since checks are really glorified functions, they evaluate their arguments
eagerly. As it happens, in your typical check most of the user code runs before
control ever reaches the body. During the evaluation of the arguments to the check
function no context is collected, nor does the branch that handles such errors
attempt to report anything beyond the error message (see
`display-test-failure/error` in format.rkt).
This issue has been discussed at some length on the mailing list:
https://groups.google.com/d/msg/racket-users/aCQwqCTY42U/NRZ_AU_YBwAJ
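The eager-evaluation mechanics described above can be reproduced without rackunit. The sketch below is a stand-in, not rackunit's implementation: `my-check-eq` and `current-check-name` are hypothetical names, and a parameter plays the role of the check context.

```racket
#lang racket/base
;; Hypothetical stand-in for a check defined as a function;
;; this is not rackunit's code.
(define current-check-name (make-parameter #f))

(define (my-check-eq actual expected)
  ;; Context is installed only once the body starts running.
  (parameterize ([current-check-name 'my-check-eq])
    (eq? actual expected)))

(define (f)
  ;; Runs while the arguments are evaluated, before the body above.
  (error 'f "raised while current-check-name is ~a" (current-check-name)))

;; Capture the message of the exception that escapes the check.
(define message
  (with-handlers ([exn:fail? exn-message])
    (my-check-eq (f) 42)))

(displayln message)
```

The message shows the context as still unset, because `(f)` runs before `my-check-eq` is ever entered.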
Proposed solution
Delay argument evaluation in checks by wrapping each argument in a thunk, then
force the thunks in the body of the check so that any expression passed to the
check runs within a context that can be reported to the programmer.
I believe this won't break the typical use of rackunit, and no user code
would have to change. This may not be true of libraries that build on top of
rackunit.
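A rough sketch of the idea, independent of rackunit's internals, is below. The names `check-eq/delayed` and `run-delayed-check` are hypothetical and this is not the code from the PR; results are returned as lists rather than printed by rackunit's reporters.

```racket
#lang racket/base
;; Hypothetical sketch, not the PR's code: `check-eq/delayed` and
;; `run-delayed-check` are made-up names.
(require (for-syntax racket/base))

;; Forces the argument thunks inside a handler, so an exception raised
;; by an argument is reported together with the check's name and line.
(define (run-delayed-check name line actual-thunk expected-thunk)
  (with-handlers ([exn:fail?
                   (lambda (e) (list 'error name line (exn-message e)))])
    (if (eq? (actual-thunk) (expected-thunk))
        (list 'ok name line)
        (list 'failure name line))))

;; The macro wraps each argument expression in a thunk and records
;; the line number of the use site.
(define-syntax (check-eq/delayed stx)
  (syntax-case stx ()
    [(_ actual expected)
     (with-syntax ([line (syntax-line stx)])
       #'(run-delayed-check 'check-eq/delayed 'line
                            (lambda () actual)
                            (lambda () expected)))]))

(define (f) (error 'f "boom"))

(check-eq/delayed (f) 42)       ; an 'error result carrying name and line
(check-eq/delayed 41 (+ 41 1))  ; a 'failure result
```

Because the check is now a macro at the use site, passing it around as a plain function value no longer works in this sketch, which is the compatibility concern raised in the discussion.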