Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mirgen: lower finally #1468

Merged
merged 18 commits into from
Nov 15, 2024
Merged

Conversation

zerbina
Copy link
Collaborator

@zerbina zerbina commented Nov 8, 2024

Summary

  • simplify MIR control-flow constructs
  • lower finally into a block + case dispatcher in mirgen
  • emit exception stack management code in mirgen already
  • inline scope cleanup directly at break and return

Details

The goals are to:

  • use a unified exception runtime across all backends
  • make the MIR simpler
  • reduce the complexity of the language reaching the code generators,
    and thus the amount of translation work for the code generators

For the MIR:

  • target lists are removed; jumps only need to specify a single target
  • Finally can only be used as the target for exceptional jumps;
    unstructured control-flow out of a Finally section is disallowed
  • only a single exceptional target can be specified for Continue
  • Raise is decoupled from exception management; it only initiates
    unwinding now

The various MIR processing, rendering, and translation is adjusted
accordingly. The CGIR equivalents to the MIR constructs change in the
same way.

Since Finally can no longer be used with non-exceptional control-flow
(Goto, Case, etc.), translation of both scope cleanup and finally
needs to change in mirgen. The exception runtime calls for managing
the exception stack also have to be injected in mirgen already.

finally Lowering

finally is translated into continuation passing. However, instead of
reifying the finally into a standalone procedure, all "invocations"
of (read, jumps to) the finally record their original destination in
a local variable, which a dispatcher emitted at the end of the
finally clause then uses to jump to the target (or another
intercepting finally).

Scope Cleanup

Scope cleanup in response to an exception being raised still uses
Finally. For cleanup in response to break or return, all
necessary cleanup is emitted directly before the Goto. This increases
the workload for the move analysis / destructor elision pass, but it
also results in more efficient code (since there are usually less
dynamic invocations of destructors).

Code Generation

All three code generators are updated to consider the new syntax and
semantics of Finally, Continue, etc. Notably:

  • ccgflow is obsolete and thus removed; code generation for
    Finally, Continue, etc. is now simple enough to implement
    directly in cgen
  • jsgen now translates Finally to a JavaScript catch clause (so
    as to not interfere with breaks), which simplifies code generation
    (no more enabling/disabling of finally clauses for breaks) and
    also improves code size / performance in some cases

Exception Runtime

  • nimAbortException has a viaRaise parameter now, as the C-specific
    error flag cannot be used anymore for detecting what causes the abort
  • prepareException is merged back into raiseExceptionAux

The JavaScript target also using the C exception runtime now fixes a
bug where breaking out of a finally didn't clean up the current
exception properly.

VM

Exception stack management is partially decoupled from the core VM and
moved to vmops (which hooks and implements the various exception
runtime procedures). Generating the stacktrace still needs to be done
directly by the VM, which prevents the exception stack management from
being fully decoupled.

Instead of leaving the implementation of `finally` and the exception
stack to the code generators (or a theoretical MIR pass), `mirgen` now
does the lowering itself.

This means turning `finally` into what is effectively a
block + try/except + dispatcher, which is the same thing that `ccgflow`
already did (just with access to the original structure, which makes
the translation much simpler).

Emitting the calls to the exception runtime is also moved to `mirgen`.
In order to be able to continue raising an in-flight instruction (a
re-raise statement can only raise a *caught* exception), the
`mResumeRaising` magic is introduced. `mnkRaise` should eventually
become low-level enough for the magic is no longer necessary.
@zerbina zerbina added the refactor Implementation refactor label Nov 8, 2024
@zerbina zerbina added this to the MIR phase milestone Nov 8, 2024
The selector variable was of type `int32`, but most values were typed
as `uint32`. The type of the selector is changed to `uint32`, which was
the intended type from the start.
* remove `Leave` and `TargetList` and adjust the grammar accordingly
* `Continue` only has a single jump target operand now
There are no more target lists, and the processing has to be adjusted
accordingly.
The error mode flag is now never set on entry of `nimAbortException`,
thus a dedicated parameter is required to tell the procedure whether
the current or previous active exception needs to be popped.
* `ccgflow` is obsolete and thus removed
* all control-flow constructs can now be translated directly, without
  any extra IRs
`cnkTargetList` and `cnkLeave` are no longer needed.
The complex handling of `finally` is obsolete, as a `cnkFinally` can
now only intercept exceptions.

All finallys as translated to JavaScript `catch` clauses, so as to not
interfere with `break`. As a consequence, the enabling/disabling of
finally sections prior to jumps becomes unnecessary and is removed.
In order to remove `mResumeRaising`, which had the problem of acting as
a "fork" instead of a "goto", the meaning of `mnkRaise` is changed to
"jump to the specified exception handler", removing the
"push exception" part. This allows using `mnkRaise` instead of
`mResumeRaising`, rendering the latter obsolete.

`mirgen` emits the calls to the relevant exception runtime procedures,
and the code generators and MIR processing are adjusted to the new
meaning of `mnkRaise`.

The `raiseExceptionEx` runtime procedure is restored, with
`prepareException` merged back into it. In addition, the runtime
procedures no longer toggle the error flag.
While the `.up` field of the exception could be used - like how the C
implementation does it -, this would be unnecessarily convoluted.
Instead, the implementation simply uses a `seq`-based stack, which is
currently possible due to `seq` mapping to JavaScript arrays.

`lastJSError` is still needed for `jsgen` raise implementation, so it's
kept.
* target lists and goto-intercepting-finallys are gone, making code
  generation much easier
* instead of adding and later patching an `ehoNext` instruction when
  associating an EH instruction, the EH table entry is patched,
  reducing the amount of EH code
*
* remove the `opcLeave` and `opcEnter` opcodes; they're obsolete
* remove the `ehoLeave` opcode; it's obsolete
* decouple the exception stack from the EH threads
* implement the exception stack management runtime procedures for the
  VM
The test now runs correctly with the JavaScript backend.
@zerbina
Copy link
Collaborator Author

zerbina commented Nov 9, 2024

A MIR Finally is only usable for intercepting exceptional control-flow now, and thus it cannot be used for destructor calls anymore. For implementing scope cleanup, one possible solution would have been to keep the merged destructor blocks and use dispatchers (which is what ccgflow previously did, in the end), but I opted for inlining the destructors directly at the gotos instead.

To visualize what I mean, consider the following:

block L1:
  var x = ...
  if ...: break L1
  var y = ...
  if ...: break L1

Previously, this resulted in MIR code that is equivalent to:

block L1:
  var x = ...
  try:
    if ...: break L1
    var y = ...
    try:
      if ...: break L1
    finally:
      `=destroy`(y)
  finally:
    `=destroy`(x)

Now, the generated MIR code is equivalent to:

block L1:
  var x = ...
  if ...:
    `=destroy`(x)
    break L1
  var y = ...
  if ...:
    `=destroy`(y)
    `=destroy`(x)
    break L1
  `=destroy`(y)
  `=destroy`(x)

While both are functionally equivalent, the inlined version allows for more static elision of =destroy calls and destructive moves (wasMoved), thus possibly reducing run-time overhead.


I made a test with the compiler code-base in order to get some concrete performance numbers. Let A be the compiler from fb4665a (devel, at the time of writing). Let B be the compiler from this PR.

Using a -d:release version of A that was built by A, the time it takes to compile A is:

Time (mean ± σ):     14.955 s ±  0.156 s

Using a -d:release version of A that was built by B, the time it takes to compile A is:

Time (mean ± σ):     14.641 s ±  0.047 s

@zerbina zerbina changed the title mirgen: lower finally in mirgen mirgen: lower finally Nov 9, 2024
@zerbina zerbina added runtime Interface or implementation of the runtime compiler/backend Related to backend system of the compiler bug Something isn't working labels Nov 9, 2024
@zerbina zerbina marked this pull request as ready for review November 9, 2024 23:11
Copy link
Collaborator

@saem saem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one design question, but looks good

Comment on lines +290 to +291
At the end of `Finally` section must be a `Continue` statement, which specifies
where raising the exception continues.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silly question, would it be possible to specify this upfront as part of the finally block as Finally <label> <EX_TARGET>, seems like one less thing to remember?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a silly question, and it's also something that I had considered.

The reason I decided against it is that in general, processing needs to know the follow-up target upon visiting the mnkContinue and not when visiting the mnkFinally. Good examples for this are mirexec (generating a data-flow graph from the MIR code), cgen, and jsgen. Having the follow-up target on the Finally statement would require some additional state in those cases.

Co-authored-by: Saem Ghani <[email protected]>
@zerbina
Copy link
Collaborator Author

zerbina commented Nov 15, 2024

/merge

Copy link

Merge requested by: @zerbina

Contents after the first section break of the PR description has been removed and preserved below:


Note For Reviewers

  • reviewing the commits individually is likely going to be easier than reviewing everything together

@chore-runner chore-runner bot added this pull request to the merge queue Nov 15, 2024
Merged via the queue into nim-works:devel with commit 42945cd Nov 15, 2024
35 checks passed
@zerbina zerbina deleted the lower-finally-in-mirgen branch November 15, 2024 23:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working compiler/backend Related to backend system of the compiler refactor Implementation refactor runtime Interface or implementation of the runtime
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

2 participants