Improve introduction.
Gohla committed Dec 22, 2023
1 parent eafa8b2 commit 4a60ded
Showing 5 changed files with 75 additions and 65 deletions.
4 changes: 2 additions & 2 deletions src/0_intro/1_setup/index.md
Expand Up @@ -33,7 +33,7 @@ In this tutorial, we will create a subset of the [PIE in Rust](https://github.co
However, later on in the tutorial we will also create an additional package (for unit testing utilities), so we need to set up a Rust _workspace_ that supports multiple packages.

Therefore, first create a `pibs` directory, which will serve as the workspace directory of the project.
This does not have to be called `pibs`, you can use a different name.
This does not have to be called `pibs`; you can use a different name, but this tutorial will use `pibs`.
Then create the `pibs/Cargo.toml` file with the following contents:

```toml,
Expand Down Expand Up @@ -96,7 +96,7 @@ We use [Rust edition 2021](https://doc.rust-lang.org/edition-guide/rust-2021/ind
I recommend storing your code in a source control system such as [Git](https://git-scm.com/), and uploading it to a source code hub such as [GitHub](https://github.com/).
A source control system allows you to look at changes and to go back to older versions, and uploading to a source code hub then provides a convenient backup.

If you use Git, create the `pie/.gitignore` file with:
If you use Git, create the `.gitignore` file with:

```.gitignore
/target
Expand Down
124 changes: 67 additions & 57 deletions src/0_intro/index.md
Expand Up @@ -13,11 +13,6 @@ This is of course not a full tutorial or book on Rust.
For that, I can recommend the excellent [The Rust Programming Language](https://doc.rust-lang.org/book/) book.
However, if you like to learn through examples and experimentation, or already know Rust basics and want to practice, this might be a fun programming tutorial for you!

[//]: # ()
[//]: # (Another secondary goal is to show what I think are several good software writing practices, such as dividing code into modules, thinking about what to expose as API, writing unit and integration tests, etc.)

[//]: # (Where possible I will try to explain design decisions, discuss tradeoffs, or provide more info about optimizations.)

We will first motivate programmatic incremental build systems in more detail.

## Motivation
Expand All @@ -29,11 +24,6 @@ A programmatic incremental build system is a mix between an incremental build sy
- _Correct_: Builds are fully correct -- all parts of the build that are affected by changes are executed. Builds are free of glitches: only up-to-date (consistent) data is observed.
- _Automatic_: The system takes care of incrementality and correctness. Programmers _do not_ have to manually implement incrementality. Instead, they only have to explicitly _declare dependencies_.

[//]: # (- _Multipurpose_: The same build script can be used for incremental batch builds in a terminal, but also for live feedback in an interactive environment such as an IDE. For example, a compiler implemented in this build system can provide incremental batch compilation but also incremental editor services such as syntax highlighting or code completion.)

[//]: # ()
[//]: # (#### Teaser Toy Example)

To show the benefits of a build system with these key properties, below is a simplified version of the build script for compiling a formal grammar and parsing text with that compiled grammar, which is the build script you will implement in the [final project chapter](../4_example/index.md).
This simplified version removes details that are not important for understanding programmatic incremental build systems at this moment.

Expand All @@ -44,8 +34,8 @@ This example is primarily here to motivate programmatic incremental build system

```rust
pub enum ParseTasks {
CompileGrammar { grammar_file_path: PathBuf },
Parse { compile_grammar_task: Box<ParseTasks>, program_file_path: PathBuf, rule_name: String }
CompileGrammar { grammar_file: PathBuf },
Parse { compile_grammar_task: Box<ParseTasks>, text_file: PathBuf, rule_name: String }
}

pub enum Outputs {
Expand All @@ -56,15 +46,15 @@ pub enum Outputs {
impl Task for ParseTasks {
fn execute<C: Context>(&self, context: &mut C) -> Result<Outputs, Error> {
match self {
ParseTasks::CompileGrammar { grammar_file_path } => {
let grammar_text = context.require_file(grammar_file_path)?;
let compiled_grammar = CompiledGrammar::new(&grammar_text, Some(grammar_file_path))?;
ParseTasks::CompileGrammar { grammar_file } => {
let grammar_text = context.require_file(grammar_file)?;
let compiled_grammar = CompiledGrammar::new(&grammar_text)?;
Ok(Outputs::CompiledGrammar(compiled_grammar))
}
ParseTasks::Parse { compile_grammar_task, program_file_path, rule_name } => {
ParseTasks::Parse { compile_grammar_task, text_file, rule_name } => {
let compiled_grammar = context.require_task(compile_grammar_task)?;
let program_text = context.require_file_to_string(program_file_path)?;
let output = compiled_grammar.parse(&program_text, rule_name, Some(program_file_path))?;
let text = context.require_file(text_file)?;
let output = compiled_grammar.parse(&text, rule_name)?;
Ok(Outputs::Parsed(output))
}
}
Expand All @@ -73,16 +63,16 @@ impl Task for ParseTasks {

fn main() {
let compile_grammar_task = Box::new(ParseTasks::CompileGrammar {
grammar_file_path: PathBuf::from("grammar.pest")
grammar_file: PathBuf::from("grammar.pest")
});
let parse_1_task = ParseTasks::Parse {
compile_grammar_task: compile_grammar_task.clone(),
program_file_path: PathBuf::from("test_1.txt"),
text_file: PathBuf::from("test_1.txt"),
rule_name: "main".to_string()
};
let parse_2_task = ParseTasks::Parse {
compile_grammar_task: compile_grammar_task.clone(),
program_file_path: PathBuf::from("test_2.txt"),
text_file: PathBuf::from("test_2.txt"),
rule_name: "main".to_string()
};

Expand All @@ -96,73 +86,93 @@ fn main() {

This is in essence just a normal (pure) Rust program: it has enums, a trait implementation for one of those enums, and a `main` function.
However, this program is also a build script because `ParseTasks` implements the `Task` trait, which is the core trait defining the unit of computation in a programmatic incremental build system.
Because `ParseTasks` is an enum, there are two kinds of tasks: a `CompileGrammar` task that compiles a grammar, and a `Parse` task that parses a text file using the compiled grammar.

##### Tasks

A task is kind of like a closure, a function along with its inputs that can be executed, but incremental.
For example, `ParseTasks::CompileGrammar` carries `grammar_file_path` which is the file path of the grammar that it will compile.
When we `execute` a `ParseTasks::CompileGrammar` task, it reads the text of the grammar from the file, compiles that text into a grammar, and returns a compiled grammar.
A _task_ is kind of like a closure: a function along with its inputs that can be executed.
For example, `CompileGrammar` carries `grammar_file`, which is the file path of the grammar that it will compile.
When we `execute` a `CompileGrammar` task, it reads the text of the grammar from the file, compiles that text into a grammar, and returns a compiled grammar.

However, tasks differ from closures in that tasks are _incremental_.

##### Incremental File Dependencies

However, we want this task to be incremental, such that this task is only re-executed when the contents of the `grammar_file_path` file changes.
We want the `CompileGrammar` task to be incremental, such that this task is only re-executed when the contents of the `grammar_file` file change.
Therefore, `execute` has a `context` parameter which is an _incremental build context_ that tasks use to tell the build system about dependencies.
For example, `ParseTasks::CompileGrammar` tells the build system that it _requires_ the `grammar_file_path` file with `context.require_file(grammar_file_path)`, creating a _read file dependency_ to that file.

For example, `CompileGrammar` tells the build system that it _requires_ the `grammar_file` file with `context.require_file(grammar_file)`, creating a _file read dependency_ to that file.
It is then the responsibility of the incremental build system to only execute this task if the file contents have changed.

##### Dynamic Dependencies

Note that this file dependency is created _while the task is executing_.
Note that this file dependency is created _while the task is executing_!
We call these _dynamic dependencies_, as opposed to static dependencies.
Dynamic dependencies enable the _programmatic_ part of programmatic incremental build systems, because dependencies are made while your program is running, and can thus depend on values computed earlier in your program.

Another benefit of dynamic dependencies is that they enable _exact_ dependencies: the dependencies of a task exactly describe when the task should be re-executed, increasing incrementality.
With static dependencies that are hardcoded into the build script, you often have to over-approximate dependencies, leading to reduced incrementality.

##### Incremental Task Dependencies

Dynamic dependencies are also created _between tasks_.
For example, `ParseTasks::Parse` carries `compile_grammar_task` which is an instance of the `ParseTasks::CompileGrammar` task to compile a grammar.
When we `execute` a `ParseTasks::Parse` task, it tells the build system that it depends on the compile grammar task with `context.require_task(compiled_grammar_task)`, but also asks the build system to return the most up-to-date (consistent) output of that task.
For example, `Parse` carries `compile_grammar_task` which is an instance of the `CompileGrammar` task to compile a grammar.
When we `execute` a `Parse` task, it tells the build system that it depends on the compile grammar task with `context.require_task(compile_grammar_task)`.

This also asks the build system to return the most up-to-date (consistent) output of that task.
It is then the responsibility of the incremental build system to _check_ whether the task is _consistent_, and to _re-execute_ it only if it is _inconsistent_.
In essence, the build system will take these steps:

If `compile_grammar_task` was never executed before, the build system executes it, caches the compiled grammar, and returns the compiled grammar.
Otherwise, to check if the compile grammar task is consistent, we need to check the file dependency to `grammar_file_path` that `ParseTasks::CompileGrammar` created earlier.
If the contents of the `grammar_file_path` file has changed, the task is inconsistent and the build system re-executes it, caches the new compiled grammar, and returns it.
Otherwise, the build system simply returns the cached compiled grammar.
- If `compile_grammar_task` was never executed before, the build system executes it, caches the compiled grammar, and returns the compiled grammar.
- Otherwise, to check if the compile grammar task is consistent, we need to check its dependencies: the file dependency to `grammar_file`.
- If the contents of the `grammar_file` file have changed, the task is inconsistent and the build system re-executes it, caches the new compiled grammar, and returns it.
- Otherwise, the task is consistent and the build system simply returns the cached compiled grammar.

The `Parse` task then has access to the `compiled_grammar`, reads the text file to parse with `require_file`, and finally parses the `text` with `compiled_grammar.parse`.
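
The consistency check described in the steps above can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the build system we will implement in this tutorial: `Files`, `Cached`, `SketchContext`, and `require_compile` are hypothetical names, an in-memory map stands in for the file system, and uppercasing the text stands in for compiling a grammar.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the file system: maps file names to contents.
pub type Files = HashMap<String, String>;

/// Hypothetical cache entry for one executed task.
struct Cached {
    /// Contents of the required file, as observed during the last execution.
    observed_contents: String,
    /// Output produced by the last execution.
    output: String,
}

/// Hypothetical, heavily simplified context that caches one "compile" task per file.
#[derive(Default)]
pub struct SketchContext {
    cache: HashMap<String, Cached>,
    /// Number of times the task body was actually executed.
    pub executions: usize,
}

impl SketchContext {
    /// Returns an up-to-date output for the task that "compiles" `file`.
    pub fn require_compile(&mut self, files: &Files, file: &str) -> String {
        // A missing file is treated as empty text in this sketch.
        let current = files.get(file).cloned().unwrap_or_default();
        if let Some(cached) = self.cache.get(file) {
            if cached.observed_contents == current {
                // Consistent: the file dependency is unchanged, so reuse the cached output.
                return cached.output.clone();
            }
        }
        // Never executed before, or inconsistent: (re-)execute and cache the result.
        self.executions += 1;
        let output = current.to_uppercase(); // Stand-in for compiling the grammar.
        self.cache.insert(
            file.to_string(),
            Cached { observed_contents: current, output: output.clone() },
        );
        output
    }
}
```

Requiring the same file twice without changing it executes the task body once; changing the file contents in between makes the cached entry inconsistent and triggers a re-execution. Comparing full file contents is the simplest possible check and was chosen here only to keep the sketch short.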

##### Using Tasks

Because this is just a regular Rust program, we can use the tasks in the same program with a `main` function.
The `main` function creates instances of these tasks, creates an `IncrementalBuildContext`, and asks the build system to return the up-to-date outputs for two tasks with `context.require_task`.

This `main` function is performing an incremental batch build.
However, you can also use these same tasks to build an _interactive application_.
That is too much code to discuss here in this introduction, but the [final project chapter](../4_example/index.md) shows a video of an interactive application that you can build using these tasks.

##### Conclusion

This is the essence of programmatic incremental build systems.
In this tutorial, we will define the `Task` trait and implement the `IncrementalBuildContext`.
However, before we start doing that, I want to first zoom back out and discuss the benefits of programmatic incremental build systems.
In this tutorial, we will define the `Task` trait and implement the `IncrementalBuildContext` over the course of several chapters.

However, before we start doing that, I want to first zoom back out and discuss the benefits (and drawbacks) of programmatic incremental build systems.
If you already feel motivated enough, you can [skip to here](#pie-a-programmatic-incremental-build-system-in-rust).

### Benefits

I prefer writing builds in a programming language like this, over having to _encode_ a build into a YAML file with underspecified semantics, and over having to learn and use a new build scripting language with limited tooling.
By _programming builds_, I can reuse my knowledge of the programming language, I get help from the compiler and IDE that I'd normally get while programming, I can modularize and reuse parts of my build as a library, and can use other programming language features such as unit testing, integration testing, benchmarking, etc.
The primary motivation for programmatic incremental build systems is that you can _program_ your incremental builds and interactive applications in a regular programming language, instead of having to write it in a separate (declarative) build script language, and this has several benefits:

- You can re-use your knowledge of the programming language, instead of having to learn a new build script language.
- You can use tools of the programming language, such as the compiler that provides (good) error messages, an IDE that helps you read and write code, a debugger for understanding the program, unit and integration testing for improving code reliability, benchmarking for improving performance, etc.
- You can modularize your build script using facilities of the programming language, enabling you to reuse your build script as a library or to use modules created by others in your build script. You can also use regular modules of the programming language and integrate them into build scripts, and vice versa.

The other important benefit is that incrementality and correctness are taken care of by the build system.
Therefore, you don't have to manually implement incrementality correctly, which is complicated and error-prone.

You do have to specify the exact dependencies of tasks to files and other tasks, as seen in the example, but this is easier than implementing incrementality.
Due to the dependencies being dynamic, you can use regular programming language constructs like calling a function to figure out what file to depend on, `if` to create conditional dependencies, `while` to create multiple dependencies, and so forth.
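
To make this concrete, here is a small sketch of a task body that creates dependencies inside a loop and behind a condition. The names `RecordingContext` and `concat_listed` are hypothetical and exist only for this illustration; the context serves files from memory and merely records which files were required, which is far simpler than the real context we will build.

```rust
use std::collections::HashMap;

/// Hypothetical context that serves files from memory and records every file dependency.
pub struct RecordingContext {
    files: HashMap<String, String>,
    /// File dependencies created so far, in the order they were required.
    pub required: Vec<String>,
}

impl RecordingContext {
    /// Creates a context from `(path, contents)` pairs.
    pub fn new(files: &[(&str, &str)]) -> Self {
        let files = files
            .iter()
            .map(|(path, contents)| (path.to_string(), contents.to_string()))
            .collect();
        Self { files, required: Vec::new() }
    }

    /// Records a dependency on `path` and returns its contents (empty if missing).
    pub fn require_file(&mut self, path: &str) -> String {
        self.required.push(path.to_string());
        self.files.get(path).cloned().unwrap_or_default()
    }
}

/// Hypothetical task body: concatenates the files listed in `list_file`, one path per line.
/// Lines starting with `#` are treated as disabled entries.
pub fn concat_listed(context: &mut RecordingContext, list_file: &str) -> String {
    // The first dependency determines which other dependencies are created.
    let list = context.require_file(list_file);
    let mut output = String::new();
    for line in list.lines() {
        // Conditional dependency: disabled entries are never required.
        if line.starts_with('#') {
            continue;
        }
        // Multiple dependencies: one per enabled entry, computed at runtime.
        output.push_str(&context.require_file(line));
    }
    output
}
```

Which files end up in `required` depends on the contents of the list file, a value that is only known while the task executes. A build system with static dependencies would have to over-approximate here, for example by depending on every file in a directory.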

Programmatic builds _do not exclude declarativity_, however.
You can layer declarative features on top of programmatic builds, such as declarative configuration files that determine _what_ should be built without having to specify _how_ things are built.
For example, you could write a task like the one from the example, which reads and parses a config file, and then dispatch tasks that build required things.
Therefore, programmatic builds are useful for both small one-off builds, and for creating larger incremental build systems that work with a lot of user inputs.
Exactly specifying the dependencies in this way has another important benefit: the dynamic dependencies of a task _perfectly describe when the task should be re-executed_, enabling the build system to be fully incremental and correct.
This is in contrast to build systems with static dependencies -- dependencies that cannot use runtime values, typically using literal file names or patterns -- where dependencies often have to be over-approximated (not fully incremental) or under-approximated (not correct) due to not being able to exactly specify dependencies.

Dynamic dependencies enable creating precise dependencies, _without requiring staging_, as is often found in build systems with static dependencies.
For example, dynamic dependencies in [Make](https://www.gnu.org/software/make/) requires staging: generate new makefiles and recursively execute them, which is tedious and error-prone.
[Gradle](https://gradle.org/) has a two-staged build process: first configure the task graph, then incrementally execute it.
In the execution stage, you cannot modify dependencies or create new tasks.
Therefore, more work needs to be done in the configuration stage, which is not (fully) incrementalized.
Dynamic dependencies solve these problems by doing away with staging!
Some build systems use _multiple stages_ to emulate a limited form of dynamic dependencies.
For example, dynamic dependencies in [Make](https://www.gnu.org/software/make/) require staging: first dynamically generate new makefiles with correct dependencies, and then recursively execute them.
[Gradle](https://gradle.org/) has a two-staged build process: first configure the task graph, then incrementally execute it, but no new dependencies nor tasks can be created during execution.
This is an improvement over static dependencies, but requires you to think about what to do in each stage, requires maintenance of each stage, and limits what you can do in each stage.

Finally, precise dynamic dependencies enable incrementality but also correctness.
A task is re-executed when one or more of its dependencies become inconsistent.
For example, the `WriteFile` task from the example is re-executed when the task dependency returns different text, or when the file it writes to is modified or deleted.
This is both incremental and correct.
A final benefit of dynamic dependencies is that they do away with staging: there is only a single stage, the execution of your build script, and you can create dynamic dependencies in that stage.
This increases expressiveness, makes build scripts easier to read and write, and reduces maintenance overhead.

### Disadvantages
### Drawbacks

Of course, programmatic incremental build systems also have some disadvantages.
These disadvantages become more clear during the tutorial, but I want to list them here to be up-front about it:
Of course, programmatic incremental build systems also have some drawbacks.
These drawbacks become clearer during the tutorial, but I want to list them here to be up-front about them:

- The build system is more complicated, but hopefully this tutorial can help mitigate some of that by understanding the key ideas through implementation and experimentation.
- Some correctness properties are checked while building. Therefore, you need to test your builds to try to catch these issues before they reach users. However, I think that testing builds is something you should do regardless of the build system, to be more confident about the correctness of your build.
Expand Down
6 changes: 1 addition & 5 deletions src/1_programmability/1_api/index.md
Expand Up @@ -59,18 +59,14 @@ Because of this, the context reference passed to `Task::execute` is also mutable

This `Task` and `Context` API mirrors the mutually recursive definition of task and context we discussed earlier, and forms the basis for the entire build system.

```admonish important title="File Dependencies: Next Chapter"
We will implement file dependencies in the next chapter, as file dependencies only become important with incrementality.
```

Build the project by running `cargo build`.
The output should look something like:

```shell,
{{#include ../../gen/1_programmability/1_api/a_cargo.txt}}
```

In the next section, we will implement a non-incremental `Context` and test it against `Task` implementations.
In the next section, we will implement a _non-incremental_ `Context` and test it against `Task` implementations.

```admonish tip title="Rust Help: Modules, Imports, Ownership, Traits, Methods, Supertraits, Associated Types, Visibility" collapsible=true
[The Rust Programming Language](https://doc.rust-lang.org/book/ch00-00-introduction.html) is an introductory book about Rust. I will try to provide links to the book where possible.
Expand Down