From da7285685370f1258f9559628306f8ac104ef8c7 Mon Sep 17 00:00:00 2001
From: Irit Katriel
Date: Thu, 5 Dec 2024 18:20:01 +0000
Subject: [PATCH] tier 2 --> jit

---
 InternalDocs/README.md            |  2 +-
 InternalDocs/{tier2.md => jit.md} | 74 ++++++++++++++++---------------
 2 files changed, 40 insertions(+), 36 deletions(-)
 rename InternalDocs/{tier2.md => jit.md} (68%)

diff --git a/InternalDocs/README.md b/InternalDocs/README.md
index 5dfd9a037921f4f..a66530abb765b36 100644
--- a/InternalDocs/README.md
+++ b/InternalDocs/README.md
@@ -35,7 +35,7 @@ Program Execution

 - [The Bytecode Interpreter](interpreter.md)

-- [The Tier 2 Interpreter and JIT](tier2.md)
+- [The JIT](jit.md)

 - [Garbage Collector Design](garbage_collector.md)

diff --git a/InternalDocs/tier2.md b/InternalDocs/jit.md
similarity index 68%
rename from InternalDocs/tier2.md
rename to InternalDocs/jit.md
index fb42b91ddc067ca..6ac4a26a0e6965e 100644
--- a/InternalDocs/tier2.md
+++ b/InternalDocs/jit.md
@@ -1,18 +1,21 @@
-# The Tier 2 Interpreter
-
-The [basic interpreter](interpreter.md), also referred to as the `tier 1`
-interpreter, consists of a main loop that executes the bytecode instructions
-generated by the [bytecode compiler](compiler.md) and their
-[specializations](interpreter.md#Specialization). Runtime optimization in tier 1
-can only be done for one instruction at a time. The `tier 2` interpreter is
-based on a mechanism to replace an entire sequence of bytecode instructions,
+# The JIT
+
+The [adaptive interpreter](interpreter.md) consists of a main loop that
+executes the bytecode instructions generated by the
+[bytecode compiler](compiler.md) and their
+[specializations](interpreter.md#Specialization). Runtime optimization in
+this interpreter can only be done for one instruction at a time. The JIT
+is based on a mechanism to replace an entire sequence of bytecode instructions,
 and this enables optimizations that span multiple instructions.

+Historically, the adaptive interpreter was referred to as `tier 1` and
+the JIT as `tier 2`. You will see remnants of this in the code.
+
 ## The Optimizer and Executors

-The program begins running in tier 1, until a `JUMP_BACKWARD` instruction
-determines that it is `hot` because the counter in its
-[inline cache](interpreter.md#inline-cache-entries) indicates that is
+The program begins running on the adaptive interpreter, until a `JUMP_BACKWARD`
+instruction determines that it is "hot" because the counter in its
+[inline cache](interpreter.md#inline-cache-entries) indicates that it
 executed more than some threshold number of times (see
 [`backoff_counter_triggers`](../Include/internal/pycore_backoff.h)).
 It then calls the function `_PyOptimizer_Optimize()` in
@@ -23,8 +26,9 @@ constructs an object of type
 an optimized version of the instruction trace beginning at this jump.

 The optimizer determines where the trace ends, and the executor is set up
-to either return to `tier 1` and resume execution, or transfer control
-to another executor (see `_PyExitData` in Include/internal/pycore_optimizer.h).
+to either return to the adaptive interpreter and resume execution, or
+transfer control to another executor (see `_PyExitData` in
+Include/internal/pycore_optimizer.h).

 The executor is stored on the [`code object`](code_objects.md) of the frame,
 in the `co_executors` field which is an array of executors. The start
@@ -32,17 +36,17 @@ instruction of the trace (the `JUMP_BACKWARD`) is replaced by an
 `ENTER_EXECUTOR` instruction whose `oparg` is equal to the index of the
 executor in `co_executors`.

-## The uop optimizer
+## The micro-op optimizer

-The optimizer that `_PyOptimizer_Optimize()` runs is configurable
-via the `_Py_SetTier2Optimizer()` function (this is used in test
-via `_testinternalcapi.set_optimizer()`.)
+The optimizer that `_PyOptimizer_Optimize()` runs is configurable via the
+`_Py_SetTier2Optimizer()` function (this is used in tests via
+`_testinternalcapi.set_optimizer()`.)

-The tier 2 optimizer, `_PyUOpOptimizer_Type`, is defined in
-[`Python/optimizer.c`](../Python/optimizer.c). It translates
-an instruction trace into a sequence of micro-ops by replacing
-each bytecode by an equivalent sequence of micro-ops
-(see `_PyOpcode_macro_expansion` in
+The micro-op optimizer (abbreviated `uop` to approximate `μop`) is defined in
+[`Python/optimizer.c`](../Python/optimizer.c) as the type `_PyUOpOptimizer_Type`.
+It translates an instruction trace into a sequence of micro-ops by replacing
+each bytecode by an equivalent sequence of micro-ops (see
+`_PyOpcode_macro_expansion` in
 [pycore_opcode_metadata.h](../Include/internal/pycore_opcode_metadata.h)
 which is generated from [`Python/bytecodes.c`](../Python/bytecodes.c)).
 The micro-op sequence is then optimized by
@@ -50,13 +54,13 @@
 [`Python/optimizer_analysis.c`](../Python/optimizer_analysis.c) and a
 `_PyUOpExecutor_Type` is created to contain it.

-## Running a uop executor on the tier 2 interpreter
+## Debugging a uop executor in the JIT interpreter

-After a tier 1 `JUMP_BACKWARD` instruction invokes the uop optimizer
-to create a tier 2 uop executor, it transfers control to this executor
-via the `GOTO_TIER_TWO` macro.
+After a `JUMP_BACKWARD` instruction invokes the uop optimizer to create a uop
+executor, it transfers control to this executor via the `GOTO_TIER_TWO` macro.

-When tier 2 is enabled but the JIT is not (python was configured with
+When the JIT is configured to run on its interpreter (i.e., python is
+configured with
 [`--enable-experimental-jit=interpreter`](https://docs.python.org/dev/using/configure.html#cmdoption-enable-experimental-jit)),
 the executor jumps to `tier2_dispatch:` in
 [`Python/ceval.c`](../Python/ceval.c), where there is a loop that
@@ -67,19 +71,19 @@ which is generated by the build script from the bytecode definitions in
 [`Python/bytecodes.c`](../Python/bytecodes.c).

 This loop exits when an `_EXIT_TRACE` or `_DEOPT` uop is reached,
-and execution returns to teh tier 1 interpreter.
+and execution returns to the adaptive interpreter.

 ## Invalidating Executors

 In addition to being stored on the code object, each executor is also
-inserted into a list of all executors which is stored in the interpreter
+inserted into a list of all executors, which is stored in the interpreter
 state's `executor_list_head` field. This list is used when it is necessary
-to invalidate executors because values that their construction depended
-on may have changed.
+to invalidate executors because values they used in their construction may
+have changed.

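To make the "hot" `JUMP_BACKWARD` mechanism described in the hunks above concrete, here is a minimal Python sketch of the kind of code that triggers it. It assumes a CPython build configured with `--enable-experimental-jit` (or the `=interpreter` variant); the exact warm-up threshold comes from `pycore_backoff.h` and may change between versions, so the iteration count below is only a generous guess.

```python
# Minimal sketch: a loop whose backward jump becomes "hot".
# On a JIT-enabled build, once the JUMP_BACKWARD at the bottom of the loop
# has run more than the backoff threshold, _PyOptimizer_Optimize() builds an
# executor for the trace and the JUMP_BACKWARD is rewritten to ENTER_EXECUTOR.
# The rewrite is internal to the interpreter and does not change what the
# function computes.

def spin(n):
    total = 0
    for i in range(n):   # the loop closes with a JUMP_BACKWARD instruction
        total += i
    return total

# A generous iteration count, assumed to exceed any warm-up threshold.
print(spin(100_000))
```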
 ## The JIT

-When the jit is enabled (python was configured with
+When the full JIT is enabled (python was configured with
 [`--enable-experimental-jit`](https://docs.python.org/dev/using/configure.html#cmdoption-enable-experimental-jit),
 the uop executor's `jit_code` field is populated with a pointer to a compiled C
 function that implement the executor logic. This function's signature is
@@ -89,7 +93,7 @@ the uop interpreter at `tier2_dispatch`, the executor runs the function that
 `jit_code` points to. This function returns the instruction pointer of the next
 Tier 1 instruction that needs to execute.

-The generation of the jitted fuctions uses the copy-and-patch technique
+The generation of the jitted functions uses the copy-and-patch technique
 which is described in
 [Haoran Xu's article](https://sillycross.github.io/2023/05/12/2023-05-12/).
 At its core are statically generated `stencils` for the implementation
@@ -113,8 +117,8 @@ functions are used to generate the file that the JIT can use to emit code
 for each of the bytecodes.

 For Python maintainers this means that changes to the bytecodes and
-their implementations do not require changes related to the JIT,
-because everything the JIT needs is automatically generated from
+their implementations do not require changes related to the stencils,
+because everything is automatically generated from
 [`Python/bytecodes.c`](../Python/bytecodes.c) at build time.

 See Also:
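As a rough, self-contained illustration of the copy-and-patch idea cited in the hunks above: in that technique, a stencil is a pre-compiled machine-code template with "holes" that are filled in at runtime with concrete values such as the address the emitted code should continue at. The bytes and layout below are invented for this example and bear no relation to CPython's actual stencils, which are generated at build time from `Python/bytecodes.c`.

```python
# Toy copy-and-patch sketch (illustrative only, not CPython's real stencils).
# Emitting code for one micro-op means copying a pre-assembled template and
# patching its hole with a runtime address.
import struct

# x86-64 template: mov rax, imm64 (48 B8 + 8-byte hole), then jmp rax (FF E0).
STENCIL = bytes.fromhex("48B8") + b"\x00" * 8 + bytes.fromhex("FFE0")
HOLE_OFFSET = 2  # the 8-byte immediate starts after the two opcode bytes

def emit(next_address: int) -> bytes:
    """Copy the stencil and patch its hole with `next_address`."""
    code = bytearray(STENCIL)
    struct.pack_into("<Q", code, HOLE_OFFSET, next_address)
    return bytes(code)

# Patch in an arbitrary, made-up continuation address and show the result.
print(emit(0x7F0000001000).hex())
```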