Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Treat exceptions thrown by arguments to check as test failures #107

Closed
wants to merge 1 commit into from

Conversation

vkz
Copy link

@vkz vkz commented Apr 7, 2019

Problem

Rackunit family of check macros, as they are implemented atm, exhibit an
unfortunate behavior where they lose all context and report no location of failed
checks when the code being tested raises exceptions. Picture a typical rkt source
file with test-modules freely interspersed with code. In a check like this:

(check eq? (f) 42)

if f ever throws, the exception will be propagated to the programmer without much
detail and worse without any indication as to which check (perhaps of hundreds)
raised an error. Location is neither captured nor reported. Triaging such failures
quickly turns into a bisecting expedition that may be a lot of work when your have
many tests in the same file.

Here's the mechanics as I understand it. Rackunit checks are defined with
define-check macro, which takes a bunch of arguments and a check body that
constitutes the computation that decides whether the check succeeds or fails. The
macro constructs a function that performs said computation with the body wrapped
in additional context e.g. location and name of the check, so that installed
handlers can communicate the failure to the programmer. That usually works well,
however, since checks are really glorified functions they eval their arguments
eagerly. As it happens in your typical check most of the user code runs before it
ever reaches the body. During the evaluation of the arguments to the check
function no context is collected, nor does the branch that handles such errors
attempt to report anything beyond the error message (see
display-test-failure/error in format.rkt).

This issue has been discussed at some length on the mailing list:
https://groups.google.com/d/msg/racket-users/aCQwqCTY42U/NRZ_AU_YBwAJ

Proposed solution

Delay argument evaluation in checks by wrapping each in a thunk, then force the
thunks in the body of a check so that any expression passed to the check runs
within a context that can be reported to the programmer.

I believe this won't break your typical use case of rackunit and no user code
would have to change. This may not be true of any library that builds on top of
rackunit.

Problem
-------

Rackunit family of /check/ macros, as they are implemented atm, excibit an
unfortunate behavior where they lose all context and report no location of failed
checks when the code being tested raises exceptions. Picture a typical rkt source
file with test-modules freely interspersed with code. In a check like this:

  (check eq? (f) 42)

if f ever throws the exception will be propagated to the programmer without much
detail and worse without any indication as to which check (perhaps of hundreds)
raised an error. Location is neither captured nor reported. Triaging such failures
quickly turns into a bisecting expedition that may be a lot of work when your have
many tests in the same file.

Here's the mechanics as I understand it. Rackunit checks are defined with
`define-check` macro, which takes a bunch of arguments and a check body that
constitutes the computation that decides whether the check succeeds or fails. The
macro constructs a function that performs said computation with the body wrapped
in additional context e.g. location and name of the check, so that installed
handlers can communicate the failure to the programmer. That usually works well,
however, since checks are really glorified functions they eval their arguments
eagerly. As it happens in your typical check most of the user code runs before it
ever reaches the body. During the evaluation of the arguments to the check
function no context is collected, nor does the branch that handles such errors
attempt to report anything beyond the error message (see
`display-test-failure/error` in format.rkt).

This issue has been discussed at some length on the mailing list:
https://groups.google.com/d/msg/racket-users/aCQwqCTY42U/NRZ_AU_YBwAJ

Proposed solution
-----------------

Delay argument evaluation in checks by wrapping each in a thunk, then force the
thunks in the body of a check so that any expression passed to the check runs
within a context that can be reported to the programmer.
@vkz
Copy link
Author

vkz commented Apr 7, 2019

Example code:

(require rackunit)

(define (f)
  (raise-result-error 'f 42 'none-given))

(check eq? (f) 42)
(check eq? 41 (+ 41 1))

Error report:

--------------------
; ERROR
name:       check
location:   play.rkt:12:0
params:     '(check eq? (f) 42)

raise-result-error: contract violation
  expected: string?
  given: 42
  argument position: 2nd
  other arguments...:
   'f
   'none-given
--------------------
--------------------
; FAILURE
; /Users/russki/Code/fcgi-rkt/play.rkt:14:0
name:       check
location:   play.rkt:14:0
params:     '(check eq? 41 (+ 41 1))
--------------------

note the location for the ERROR case.

@bennn
Copy link
Contributor

bennn commented Apr 8, 2019

This changes the "checks are functions" idea that the docs suggest.

Currently, the only difference between:

(check-equal? A B)

and

(let ((f check-equal?))
  (f A B))

is that the second one gives a different source location. With this PR, the second one is also going to lose the context.

I'd rather we update the docs to talk about this gotcha with exceptions, and encourage people to write more & smaller tests.

p.s. #97

@vkz
Copy link
Author

vkz commented Apr 8, 2019

Hey Ben.

I understand your concern, but tbh not seeing what you're seeing. Have you tried running your example?

;; check in identifier position
(let ((f check-equal?))
  (f 41 42)
  (f 1 2))

=>

--------------------
; FAILURE
; /Users/russki/Code/fcgi-rkt/play.rkt:18:9
name:       check-equal?
location:   play.rkt:18:9
actual:     41
expected:   42
--------------------
--------------------
; FAILURE
; /Users/russki/Code/fcgi-rkt/play.rkt:18:9
name:       check-equal?
location:   play.rkt:18:9
actual:     1
expected:   2
--------------------

with or without PR. Location there is pointing at the (f check-equal?) bit. I get the idea that you'd rather have it point at the use site, which is in the body of let, but it didn't even before this PR. In fact, unless I misunderstand the changes I made, the case with checks in macro id position (that's what allows you to "pass them as function values" should behave exactly as they used, that's because (a) I never touched that part of code and (b) macro id will have been expanded by the time the result is used, so the args remain unwrapped and the whole thing behaves exactly as it used to.

I can think of only one corner case where check macro as id may "misbehave" due to this PR. If you define a custom check that expects thunks and then does something with those thunks inside the body. IIUC, the following line from the PR would inadvertently force them - probably not something the user intended:

(set! formal (if (thunk? formal) (formal) formal))

I don't know how likely this corner case, probably not unlikely. Don't know. We can probably work around that by wrapping in a callable struct whose type we can identify, then check for that struct type rather than thunk. Easy fix, IMO.

I'm not insisting on the proposed solution, but OMG the current behavior is infuriating, to the point that it discourages me to write or rely on tests. This brought to you by me trying to figure out which test in the source file triggers an exception by essentially commenting things out, moving code and tests and code around every single time that happens. Racket isn't like Clojure, Elisp or CL in that I can't just willy-nilly eval tests one by one until I discover which'd failed. Even if I could why do something machine can do better?

Did I misunderstand your example above?

I'd rather we update the docs to talk about this gotcha with exceptions, and encourage people to write more & smaller tests.

I don't think you can go more fine grained than (check eq? (f) value) and it would still exhibit the same problem. How would smaller tests help?

@rfindler
Copy link
Member

rfindler commented Apr 8, 2019 via email

@mfelleisen
Copy link

Our experience with checking for beginners (test-engine/) shows that we want to deal with exceptions properly here. The idea of check-*s as functions might have looked appealing at some point, but I fail to see any in it.

I think we should create rackunit2, have it import all of rackunit, and re-export macro-based checkers with the exact same names. Then we should deprecate rackunit eventually and point people to rackunit2.

@rmculpepper
Copy link
Contributor

In rackunit, the check forms are meant to act like functions. The test forms are the things that wrap evaluation, catch errors and continue, etc. If you write

(test-equal? "anon" 1 (/ 0))

it catches the error, prints a message, and continues. So I recommend leaving checks alone and using the test forms (like test-equal?) instead.

One obstacle to using the test forms is the extra verbosity of giving every test a name. Here's a proposal: let's allow the identifier _ in the test name position, in which case the test macro synthesizes a name like "anonymous test at file:line".

@vkz
Copy link
Author

vkz commented Apr 8, 2019

it catches the error, prints a message, and continues. So I recommend leaving checks alone and using the test forms (like test-equal?) instead.

ATM the only indication you get from the (test-equal? "anon" 1 (/ 0)) example you suggested is the anon name - no location given. Definitely an improvement over check. With anonymous tests you'd absolutely want to synthesize that location else you're back to the problem we have with checks. But then why not do it always, anonymous or not? Of course "tests are functions" make it problematic.

TBH I get why people would hesitate about what I suggested. It solves a real problem, but then it is a hack and may break people upstream. Dunno, you guys make the call. I'll make do with my patch for now lest I go crazy.

@bennn
Copy link
Contributor

bennn commented Apr 8, 2019

Hey Ben.

I understand your concern, but tbh not seeing what you're seeing. Have you tried running your example?

f is not going to catch exceptions as well as check-equal?

@AlexKnauth
Copy link
Member

Maybe there's a less extreme version of this pull request that still achieves your goals of catching errors thrown by arguments, but doesn't need to change make-check-func and change the behavior of checks applied to thunks.

Within the output of the define-check macro, the check-impl and make-check-func should stay the same, but the define-syntax name can still change to put the argument evaluation into a context that creates a test failure, before calling the check-impl function.

In other words it shouldn't be check-impl / make-check-func's job to catch argument-evaluation failures since those should be functions that deal with already-evaluated values. But it can be the define-syntax name's job.

@jackfirth
Copy link
Collaborator

I second @mfelleisen's proposal to abandon the notion that check forms are just functions. It places far too many restrictions on the check API, and these restrictions make it difficult to provide important quality-of-life features like this pull request and #73. I think rackunit2 is a good idea.

@AlexKnauth
Copy link
Member

The issue this PR was written to solve has been fixed by #109, so I am closing this PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants