Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid IR traversal by converting (some of) our Passes to mini passes #10981

Closed
JaroslavTulach opened this issue Sep 5, 2024 · 14 comments
Closed

Comments

@JaroslavTulach
Copy link
Member

JaroslavTulach commented Sep 5, 2024

After discussing this idea @hubertp pointed out there is a prior work - read at Miniphases: compilation using modular and efficient tree transformations.

Current State

Enso static compiler currently works on IR tree. The work is performed by many IR passes. Each Pass gets IR tree as an input and is supposed to return copy of the IR tree as output. This is very inefficient:

  • most of the passes aren't interested in 99% of the IR tree - yet they have to traverse it themselves
  • a copy of IR tree escapes (from a point of escape analysis) - e.g. it has to be allocated on the heap - as it is too large/heterogenous to optimize by any compiler
  • by allocating copies of the whole IR on the heap we create unnecessary pressure on the GC

Goal

Rather than passing the whole IR as data and letting each Pass to implement the traversal itself, let's:

  • rewrite eligible passes to simple PartialFunction<IR, IR>
  • chain these passes into huge mega PartialFunction<IR, IR>
  • create a general pass to traverse the IR (leafs first) and apply the mega partial function on each IR element

By doing this we:

  • eliminate unnecessary traversal - the whole IR tree is going to be scanned just once for the mega partial function
  • give escape analysis a chance to virtualize object allocation inside of the mega partial function lowering the need to allocate objects on heap

Impact

A lot of Passes may be eligible for the mini passes approach. Usually they just invoke transformExpression with a single partial function - all of them are eligible for rewrite:

Pre-requisite

We need generic traversal function. Right now there is ir.Expression.transformExpression, but we need such a transform function for every IR element. See:

Update: At the end #11191 uses IR.mapExpressions to traverse the tree and things seem to work fine.

@github-project-automation github-project-automation bot moved this to ❓New in Issues Board Sep 5, 2024
@JaroslavTulach JaroslavTulach changed the title Avoid IR traversal by converting (some of) our Passes to _mini passes_ Avoid IR traversal by converting (some of) our Passes to mini passes Sep 6, 2024
@JaroslavTulach
Copy link
Member Author

JaroslavTulach commented Sep 25, 2024

When at it please design the traversing architecture to be ready for parallelization by fork join pool.

@enso-bot
Copy link

enso-bot bot commented Sep 26, 2024

Pavel Marek reports a new STANDUP for today (2024-09-26):

Progress: - Skeleton of mini pass framework implementation in #11191

@enso-bot
Copy link

enso-bot bot commented Sep 27, 2024

Pavel Marek reports a new STANDUP for today (2024-09-27):

Progress: - Migrating LambdaToShorthandLambda to mini IR pass.

  • Many tests already pass.
  • The pass is mostly a copy of the mega pass, without the need for IR traversal.
  • Must use MiniIRPass.prepare to set up the pass for some IR elements. It should be finished by 2024-10-02.

@enso-bot
Copy link

enso-bot bot commented Sep 30, 2024

Pavel Marek reports a new STANDUP for today (2024-09-30):

Progress: - Fix for unable to boot enso on MacOS - https://github.com/enso-org/enso/actions/runs/11104391535/job/30848357375?pr=11212

GitHub
Hybrid visual and textual functional programming. Contribute to enso-org/enso development by creating an account on GitHub.

@enso-bot
Copy link

enso-bot bot commented Oct 1, 2024

Pavel Marek reports a new STANDUP for today (2024-10-01):

Progress: - Migrating TailCall to mini pass.

  • This is interesting because it requires to provide parentIr in the prepare method. It should be finished by 2024-10-02.

@enso-bot
Copy link

enso-bot bot commented Oct 2, 2024

Pavel Marek reports a new STANDUP for today (2024-10-02):

Progress: - Remove parentIr from the prepare method.

  • TailCallMini now collect a set of IR nodes that are definitely tail calls during the preparation phase.
  • Expression.withNewChildren cannot be implemented via mapExpression.
    • Application.Prefix.mapExpressions is recursive! It should be finished by 2024-10-02.

@JaroslavTulach
Copy link
Member Author

JaroslavTulach commented Oct 3, 2024

Hello Pavel, I have a problem understanding following part of your standup:

* `Expression.withNewChildren` cannot be implemented via `mapExpression`.
  * `Application.Prefix.mapExpressions` is recursive!

As far as I see, there is no recursion in Application.Prefix.mapExpressions. Here is my example:

enso.master$ git diff
diff --git engine/runtime-parser/src/test/java/org/enso/compiler/core/EnsoParserTest.java engine/runtime-parser/src/test/java/org/enso/compiler/core/EnsoParserTest.java
index 256d415952..2596841789 100644
--- engine/runtime-parser/src/test/java/org/enso/compiler/core/EnsoParserTest.java
+++ engine/runtime-parser/src/test/java/org/enso/compiler/core/EnsoParserTest.java
@@ -37,6 +37,27 @@ public class EnsoParserTest {
     """, true, true, true);
   }
 
+  @Test
+  public void testApplicationShallowTraversal() {
+    var ir = compile("""
+    fn x y z = x ((y 2) (z 1))
+    """);
+    java.util.function.BiConsumer<String, IR> log =
+        (prefix, fst) -> {
+          System.out.println(prefix + fst.getClass().getSimpleName() + ":" + fst.showCode());
+        };
+    ir.mapExpressions(
+        (fst) -> {
+          log.accept("first level: ", fst);
+          fst.mapExpressions(
+              (snd) -> {
+                log.accept("  second level: ", snd);
+                return snd;
+              });
+          return fst;
+        });
+  }
+
   @Test
   public void testLocationsApplicationsAndMethodCalls() {
     parseTest("""

it runs and prints:

first level: Prefix:((x) ((((y) 2)) ((z) 1)))
  second level: Literal:x
  second level: Prefix:((((y) 2)) ((z) 1))

e.g. only first and second level is iterated. Not the whole tree composed of Application.Prefix elements. If you really believe some implementation of mapExpressions is recursive - can you please support such claim with an example that demonstrates the recursiveness?

Expression.withNewChildren cannot be implemented via mapExpression.

Is this claim based on the previous (not proved) one? Or is there some other reason why it doesn't seem possible?

@Akirathan
Copy link
Member

Is this claim based on the previous (not proved) one? Or is there some other reason why it doesn't seem possible?

@JaroslavTulach Sorry for not being precise. Let's look closely at your example.

Put

  "Deep traversal" should {
    "XX work" in {
      implicit val ctx: ModuleContext = mkModuleContext

      def log(prefix: String, fst: IR) = {
        println(prefix + fst.getClass.getSimpleName + ":" + fst.showCode())
      }

      val ir =
        """
          |fn x y z = x ((y 2) (z 1))
          |""".stripMargin.preprocessModule.analyse
      // prefix is `x ((y 2) (z 1))` expression
      val prefix = ir.bindings(0)
        .asInstanceOf[Method.Explicit]
        .body
        .asInstanceOf[Function.Lambda]
        .body
        .asInstanceOf[Application.Prefix]
      prefix.function.asInstanceOf[Name.Literal].name shouldEqual "x"
      prefix.arguments should have size 1
      val arg = prefix.arguments(0).asInstanceOf[CallArgument.Specified]
      arg.value.isInstanceOf[Application.Prefix] shouldBe true
      prefix.mapExpressions { fst =>
        log("first level: ", fst)
        fst.mapExpressions { snd =>
          log("  second level: ", snd)
          snd
        }
      }
    }
  }

At the end of TailCallTest.scala. The IR for prefix looks like this:
prefix

As you can see, the prefix direct children is Literal and CallArgument.Specified. But CallArgument.Specified is skipped in mapExpressions:

first level: Literal:x
first level: Prefix:((((y) 2)) ((z) 1))
  second level: Prefix:((y) 2)
  second level: Prefix:((z) 1)

The relevant code is arguments.map(_.mapExpressions(fn)) in Application.Prefix.mapExpressions

TL;DR; We cannot rely on mapExpressions in withNewChildren. It may not be recursive, but it skips some IR elements and maps the function on their children (Prefix skips CallArgument.Specified and calls fn on their values).

@JaroslavTulach
Copy link
Member Author

OK, that's indeed a different problem than being recursive:

We cannot rely on mapExpressions in withNewChildren. ... it skips some IR elements and maps the function on their children (Prefix skips CallArgument.Specified and calls fn on their values).

Yes, obviously mapExpression deals only with IR.Expression and not with the rest of children. But maybe we want to traverse just Expression as stated here.

@enso-bot
Copy link

enso-bot bot commented Oct 6, 2024

Pavel Marek reports a new STANDUP for the provided date (2024-10-04):

Progress: - More discussion about the design of the new API. Playing with some ideas.

@enso-bot
Copy link

enso-bot bot commented Oct 9, 2024

Pavel Marek reports a new STANDUP for today (2024-10-09):

Progress: - Moved the old mega passes to tests

  • Refactored TailCallTest to test same stuff both on mega pass and mini pass.
  • Struggling with fixing tests with the new traversal based on mapExpressions. It should be finished by 2024-10-15.

@enso-bot
Copy link

enso-bot bot commented Oct 10, 2024

Pavel Marek reports a new STANDUP for today (2024-10-10):

Progress: - Working with traversal with only mapExpression is very difficult.

@enso-bot
Copy link

enso-bot bot commented Oct 11, 2024

Pavel Marek reports a new STANDUP for today (2024-10-11):

Progress: - Further discussing about the mini pass IR traversal - letting Jaroslav implement tail call based on mapExpression traversal.

@JaroslavTulach JaroslavTulach moved this from ⚙️ Design to 👁️ Code review in Issues Board Oct 15, 2024
@JaroslavTulach JaroslavTulach self-assigned this Oct 15, 2024
mergify bot pushed a commit that referenced this issue Oct 15, 2024
Gets ready for avoiding IR traversal by introducing _mini passes_ as proposed by #10981:
- creates [MiniPassFactory](762045a) (that extends common `IRProcessingPass`) to transform an `IR` element to another `IR` element
- modifies `PassManager` to recognize such _mini passes_ and treat them in a special way - by using `MiniIRPass.compile`
- `MiniIRPass.compile` is using `IR.mapExpressions` to traverse the `IR` - alternative approach [withNewChildren](1abc70d) rejected for now, see _future work_ for details
- unlike _mega passes_ `IRMiniPass.compile` **does not  recursively** traverse, but with 0964711 it invokes each _mini pass_ at constant stack depth - way better for profiling
- `MiniIRPass.prepare` _works on edges_ since ffd27df - there is `IRMiniPass prepare(parent, child)` to collect information while pre-order traversing from a particular `IR` parent to a particular `IR` child
- `PassManager` rewritten to group _subsequent mini passes_ together by `MiniIRPass.combine` and really and traverse the `IR` just once - done in 2736a76
- converted to _mini pass_: `LambdaShorthandToLambda`, `OperatorToFunction`, `SectionsToBinOp` and `TailCall`
- tested for 1:1 compatibility by [converting original code to test code](f54ba6d) and _comparing `IR` produced by old and new_ implementations
@Akirathan
Copy link
Member

Closed by #11191. Follow-up is #11326.

@github-project-automation github-project-automation bot moved this from 👁️ Code review to 🟢 Accepted in Issues Board Oct 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Development

No branches or pull requests

2 participants