Make self-hosted compiler the default and introduce bootstrapping (#573)
Akuli authored Jan 9, 2025
1 parent 9127e38 commit 1aef25a
Showing 64 changed files with 198 additions and 280 deletions.
26 changes: 9 additions & 17 deletions .github/workflows/linux.yml
@@ -35,9 +35,15 @@ jobs:
- uses: actions/checkout@v3
- run: sudo apt update
- run: sudo apt install -y llvm-${{ matrix.llvm-version }}-dev clang-${{ matrix.llvm-version }} make
- run: LLVM_CONFIG=llvm-config-${{ matrix.llvm-version }} make
- run: ./runtests.sh --verbose --jou-flags "${{ matrix.opt-level }}"
- run: ./runtests.sh --verbose --jou-flags "${{ matrix.opt-level }} --verbose"
- run: echo "LLVM_CONFIG=llvm-config-${{ matrix.llvm-version }}" >> $GITHUB_ENV
- name: "Compile and test stage 1 compiler"
run: ./runtests.sh --verbose --jou-flags "${{ matrix.opt-level }}" --stage1
- name: "Compile and test stage 2 compiler"
run: ./runtests.sh --verbose --jou-flags "${{ matrix.opt-level }}" --stage2
- name: "Compile and test stage 3 compiler"
run: ./runtests.sh --verbose --jou-flags "${{ matrix.opt-level }}"
- name: "Test stage 3 compiler with the compiler's --verbose flag"
run: ./runtests.sh --verbose --jou-flags "${{ matrix.opt-level }} --verbose"
- run: make clean
- name: Check that "make clean" deleted all files not committed to Git
run: |
@@ -59,17 +65,3 @@ jobs:
- run: LLVM_CONFIG=llvm-config-13 ./doctest.sh
- run: make clean
- run: LLVM_CONFIG=llvm-config-14 ./doctest.sh

compare-compilers:
runs-on: ubuntu-22.04
timeout-minutes: 5
steps:
- uses: actions/checkout@v3
- run: sudo apt update
- run: sudo apt install -y llvm-{11,13,14}-dev clang-{11,13,14} make

- run: LLVM_CONFIG=llvm-config-11 ./compare_compilers.sh
- run: make clean
- run: LLVM_CONFIG=llvm-config-13 ./compare_compilers.sh
- run: make clean
- run: LLVM_CONFIG=llvm-config-14 ./compare_compilers.sh
18 changes: 8 additions & 10 deletions .github/workflows/macos.yml
@@ -17,8 +17,14 @@ jobs:
- run: brew install bash diffutils llvm@14

- run: make
- run: ./runtests.sh --verbose --jou-flags "${{ matrix.opt-level }}"
- run: ./runtests.sh --verbose --jou-flags "${{ matrix.opt-level }} --verbose"
- name: "Compile and test stage 1 compiler"
run: ./runtests.sh --verbose --jou-flags "${{ matrix.opt-level }}" --stage1
- name: "Compile and test stage 2 compiler"
run: ./runtests.sh --verbose --jou-flags "${{ matrix.opt-level }}" --stage2
- name: "Compile and test stage 3 compiler"
run: ./runtests.sh --verbose --jou-flags "${{ matrix.opt-level }}"
- name: "Test stage 3 compiler with the compiler's --verbose flag"
run: ./runtests.sh --verbose --jou-flags "${{ matrix.opt-level }} --verbose"
- run: make clean

- name: Check that "make clean" deleted all files not committed to Git
@@ -36,11 +42,3 @@ jobs:
- uses: actions/checkout@v3
- run: brew install bash diffutils llvm@14
- run: ./doctest.sh

compare-compilers:
runs-on: macos-latest
timeout-minutes: 5
steps:
- uses: actions/checkout@v3
- run: brew install bash diffutils llvm@14
- run: ./compare_compilers.sh
5 changes: 3 additions & 2 deletions .github/workflows/netbsd.yml
@@ -23,8 +23,9 @@ jobs:
run: |
sudo pkgin -y install clang libLLVM gmake diffutils && \
gmake && \
./runtests.sh --verbose --stage1 && \
./runtests.sh --verbose --stage2 && \
./runtests.sh --verbose && \
./runtests.sh --verbose --jou-flags "--verbose" && \
gmake clean && \
./doctest.sh && \
./compare_compilers.sh
./doctest.sh
9 changes: 9 additions & 0 deletions .github/workflows/valgrind.yml
@@ -39,6 +39,15 @@ jobs:
run: LLVM_CONFIG=llvm-config-${{ matrix.llvm-version }} make

- if: env.recent_commits == 'true'
name: "Test stage1 compiler with valgrind"
run: ./runtests.sh --verbose --valgrind --jou-flags "${{ matrix.opt-level }}" --stage1

- if: env.recent_commits == 'true'
name: "Test stage2 compiler with valgrind"
run: ./runtests.sh --verbose --valgrind --jou-flags "${{ matrix.opt-level }}" --stage2

- if: env.recent_commits == 'true'
name: "Test stage3 compiler with valgrind"
run: ./runtests.sh --verbose --valgrind --jou-flags "${{ matrix.opt-level }}"

# Based on: https://github.com/python/typeshed/blob/9f28171658b9ca6c32a7cb93fbb99fc92b17858b/.github/workflows/daily.yml
19 changes: 8 additions & 11 deletions .github/workflows/windows.yml
@@ -157,7 +157,14 @@ jobs:
path: "test dir"
- run: cd "test dir" && ./windows_setup.sh --small
shell: bash
- run: cd "test dir" && source activate && ./runtests.sh --verbose
- name: "Compile and test stage 1 compiler"
run: cd "test dir" && source activate && ./runtests.sh --verbose --stage1
shell: bash
- name: "Compile and test stage 2 compiler"
run: cd "test dir" && source activate && ./runtests.sh --verbose --stage2
shell: bash
- name: "Compile and test stage 3 compiler"
run: cd "test dir" && source activate && ./runtests.sh --verbose
shell: bash

doctest:
@@ -188,13 +195,3 @@ jobs:
shell: bash
- run: cd jou && ./runtests.sh --dont-run-make --verbose
shell: bash

compare-compilers:
runs-on: windows-latest
timeout-minutes: 5
steps:
- uses: actions/checkout@v3
- run: source activate && ./windows_setup.sh --small
shell: bash
- run: source activate && ./compare_compilers.sh
shell: bash
3 changes: 2 additions & 1 deletion .gitignore
@@ -1,7 +1,8 @@
/obj/
/tmp/
/jou_stage1
/jou_stage2
/jou
/self_hosted_compiler
/compile_flags.txt
/config.h
/config.jou
31 changes: 19 additions & 12 deletions CONTRIBUTING.md
@@ -66,14 +66,19 @@ to help `clangd` find the LLVM header files.
## How does the compiler work?
The compiler is currently written in C. At a high level, the compilation steps are:
- Tokenize: split the source code into tokens
- Parse: build an Abstract Syntax Tree from the tokens
- Typecheck: errors for wrong number or type of function arguments etc, figure out the type of each expression in the AST
- Build CFG: build Control Flow Graphs for each function from the AST
- Simplify CFG: simplify and analyze the control flow graphs in various ways, emit warnings as needed
- Codegen: convert the CFGs into LLVM IR
- Invoke `clang` and pass it the generated LLVM IR
There are two compilers. See [the README](README.md#compilers) for an explanation.
The compilers work very similarly even though they are written in different languages.
At a high level, the compilation steps are:
- **Tokenize:** split the source code into tokens
- **Parse:** build an abstract syntax tree (AST) from the tokens
- **Typecheck:** report errors such as a wrong number or type of function arguments, and figure out the type of each expression in the AST
- **Build CFG:** build Control Flow Graphs for each function from the AST
- **Simplify CFG:** simplify and analyze the control flow graphs in various ways, generate warnings and errors based on them
- **Codegen:** convert the CFGs into LLVM IR
- **Emit objects:** create `.o` files from the LLVM IR
- **Link:** run a linker that combines the `.o` files into an executable
- **Run:** run the executable
To get a good idea of how these steps work,
you can run the compiler in verbose mode:
@@ -88,8 +93,10 @@ the tokens, AST, CFGs and LLVM IR generated.
The control flow graphs are shown twice, before and after simplifying them.
Similarly, LLVM IR is shown before and after optimizing.
After exploring the verbose output, you should probably
read `src/jou_compiler.h` and have a quick look at `src/util.h`.
After making changes to the compiler, run `make` to recompile it.
To make recompiling faster, only the stage 3 compiler (`./jou` or `jou.exe`)
will be recompiled.
All stages of bootstrapping are recompiled if any file in `bootstrap_compiler` is modified (or `touch`ed).
## Tests
@@ -107,7 +114,7 @@ $ ./runtests.sh
```
The `runtests.sh` script does a few things:
- It compiles the Jou compiler if you have changed something in `src/` since the last time it was compiled.
- It compiles the Jou compiler if you have changed the compiler since the last time it was compiled.
- It runs all Jou files in `examples/` and `tests/`. To speed things up, it runs two files in parallel.
- It ensures that the Jou files output what is expected.
@@ -159,7 +166,7 @@ This doesn't do anything with tests that are supposed to fail with an error, for
This is fine because the operating system will free the memory anyway,
but `valgrind` would see it as many memory leaks.
- Valgrinding is slow. Most tests are about compiler errors,
and `make valgrind` would take several minutes if they weren't skipped.
and valgrinding would take several minutes if they weren't skipped.
- Most problems in error message code are spotted by non-valgrinded tests.
There are also a few other ways to run the tests.
3 changes: 0 additions & 3 deletions Makefile
@@ -3,9 +3,6 @@ CFLAGS += -Werror=switch -Werror=implicit-function-declaration -Werror=incompati
CFLAGS += -std=c11
CFLAGS += -g

SRC := $(wildcard src/*.c)
OBJ := $(SRC:src/%.c=obj/%.o)

ifneq (,$(findstring Windows,$(OS)))
include Makefile.windows
else
29 changes: 18 additions & 11 deletions Makefile.posix
@@ -25,32 +25,39 @@ ifeq ($(CC),cc)
)
endif

all: jou compile_flags.txt
all: compile_flags.txt jou

# point clangd to the right include folder so i don't get red squiggles in my editor
compile_flags.txt:
echo "-I$(shell $(LLVM_CONFIG) --includedir)" > compile_flags.txt

config.h:
@v=`$(LLVM_CONFIG) --version`; case "$$v" in 11.*|13.*|14.*) ;; *) echo "Error: Found unsupported LLVM version $$v. Only LLVM 11, 13 and 14 are supported."; exit 1; esac
echo "// auto-generated by Makefile" > config.h
echo "#define JOU_CLANG_PATH \"$(CC)\"" >> config.h

obj/%.o: src/%.c $(wildcard src/*.h) config.h
mkdir -p obj && $(CC) -c $(CFLAGS) $< -o $@

jou: $(SRC:src/%.c=obj/%.o)
$(CC) $(CFLAGS) $^ -o $@ $(LDFLAGS)

config.jou:
@v=`$(LLVM_CONFIG) --version`; case "$$v" in 11.*|13.*|14.*) ;; *) echo "Error: Found unsupported LLVM version $$v. Only LLVM 11, 13 and 14 are supported."; exit 1; esac
echo "# auto-generated by Makefile" > config.jou
echo "def get_jou_clang_path() -> byte*:" >> config.jou
echo " return \"$(CC)\"" >> config.jou

self_hosted_compiler: jou config.jou $(wildcard self_hosted/*.jou)
./jou -o $@ --linker-flags "$(LDFLAGS)" self_hosted/main.jou
# Stage 1 of bootstrapping: Compile the bootstrap compiler with a C compiler.
BSRC := $(wildcard bootstrap_compiler/*.c)
obj/%.o: bootstrap_compiler/%.c $(wildcard bootstrap_compiler/*.h) config.h
mkdir -p obj && $(CC) -c $(CFLAGS) $< -o $@
jou_stage1: $(BSRC:bootstrap_compiler/%.c=obj/%.o)
$(CC) $(CFLAGS) $^ -o jou_stage1 $(LDFLAGS)

# Stage 2 of bootstrapping: Compile the Jou compiler with the bootstrap compiler.
# Don't depend on Jou files, so that only stage 3 recompiles if they're changed.
jou_stage2: jou_stage1 config.jou
rm -rf compiler/jou_compiled && ./jou_stage1 -o jou_stage2 --linker-flags "$(LDFLAGS)" compiler/main.jou

# Stage 3 of bootstrapping: Compile the Jou compiler with the Jou compiler.
jou: jou_stage2 config.jou $(wildcard compiler/*.jou)
rm -rf compiler/jou_compiled && ./jou_stage2 -o jou --linker-flags "$(LDFLAGS)" compiler/main.jou

.PHONY: clean
clean:
rm -rvf obj jou jou.exe self_hosted_compiler self_hosted_compiler.exe tmp config.h config.jou compile_flags.txt
rm -rvf obj jou jou.exe jou_stage* tmp compile_flags.txt config.jou config.h
find . -name jou_compiled -print -exec rm -rf '{}' +
27 changes: 17 additions & 10 deletions Makefile.windows
@@ -13,27 +13,34 @@ endif
# it shows a lot of unnecessary/dumb warnings by default
CFLAGS += -Wno-return-type -Wno-uninitialized -Wno-implicit-fallthrough

all: jou.exe compile_flags.txt
all: compile_flags.txt jou.exe

# point clangd to the right include folder so i don't get red squiggles in my editor
compile_flags.txt:
echo "-I$(CURDIR)" > compile_flags.txt

obj/%.o: src/%.c $(wildcard src/*.h)
mkdir -p obj && $(CC) -c $(CFLAGS) $< -o $@

jou.exe: $(OBJ)
$(CC) $(CFLAGS) $(OBJ) -o $@ $(LDFLAGS)

config.jou:
echo "# auto-generated by Makefile" > config.jou
echo "def get_jou_clang_path() -> byte*:" >> config.jou
echo " return NULL" >> config.jou

self_hosted_compiler.exe: jou.exe config.jou $(wildcard self_hosted/*.jou)
./jou.exe -o $@ --linker-flags "$(LDFLAGS)" self_hosted/main.jou
# Stage 1 of bootstrapping: Compile the bootstrap compiler with a C compiler.
BSRC := $(wildcard bootstrap_compiler/*.c)
obj/%.o: bootstrap_compiler/%.c $(wildcard bootstrap_compiler/*.h)
mkdir -p obj && $(CC) -c $(CFLAGS) $< -o $@
jou_stage1.exe: $(BSRC:bootstrap_compiler/%.c=obj/%.o)
$(CC) $(CFLAGS) $^ -o jou_stage1.exe $(LDFLAGS)

# Stage 2 of bootstrapping: Compile the Jou compiler with the bootstrap compiler.
# Don't depend on Jou files, so that only stage 3 recompiles if they're changed.
jou_stage2.exe: jou_stage1.exe config.jou
rm -rf compiler/jou_compiled && ./jou_stage1.exe -o jou_stage2.exe --linker-flags "$(LDFLAGS)" compiler/main.jou

# Stage 3 of bootstrapping: Compile the Jou compiler with the Jou compiler.
jou.exe: jou_stage2.exe config.jou $(wildcard compiler/*.jou)
rm -rf compiler/jou_compiled && ./jou_stage2.exe -o jou.exe --linker-flags "$(LDFLAGS)" compiler/main.jou

.PHONY: clean
clean:
rm -rvf obj jou.exe self_hosted_compiler.exe tmp config.jou compile_flags.txt
rm -rvf obj jou.exe jou_stage*.exe tmp compile_flags.txt config.jou
find -name jou_compiled -print -exec rm -rf '{}' +
33 changes: 28 additions & 5 deletions README.md
@@ -209,6 +209,34 @@ is not currently supported.
Run `jou --update`.


## Compilers

The Jou compiler (in the [compiler/](./compiler/) folder) is written in Jou.
It can compile itself.
However, this doesn't help you much if you have nothing that can compile Jou code.

To solve this problem, there is another compiler, called the **bootstrap compiler**
([bootstrap_compiler/](./bootstrap_compiler/) folder),
written in C.
It is a Jou compiler that supports all of Jou's syntax,
but whose error messages are not always as good as the error messages of the main Jou compiler.
The bootstrap compiler exists only to compile the Jou compiler.

Specifically, here's how the Jou compiler is compiled.
This process is called [bootstrapping](https://en.wikipedia.org/wiki/Bootstrapping_(compilers)).
- **Stage 1: Compile the bootstrap compiler with a C compiler.**
This produces an executable named `jou_stage1`
(or `jou_stage1.exe` if you use Windows).
- **Stage 2: Compile the Jou compiler with the bootstrap compiler.**
This produces an executable named `jou_stage2`.
- **Stage 3: Compile the Jou compiler with the Jou compiler.**
The stage 2 compiler is used to compile the Jou compiler again.
This produces an executable named `jou`.

See [CONTRIBUTING.md](CONTRIBUTING.md)
if you want to learn more about the Jou compiler or develop it.
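
As a rough illustration of the three stages above, here is a dry-run sketch that only prints the commands instead of running them. It is based on the rules in `Makefile.posix`, but simplified: the function name `print_bootstrap_plan` is made up for this example, stage 1 is shown as a single command even though the Makefile compiles each `.c` file to `obj/*.o` first, and `CFLAGS` is left out. `CC` and `LDFLAGS` are placeholders; the Makefile computes the real values.

```shell
#!/bin/sh
# Dry-run sketch of the three bootstrapping stages: it only prints the commands.
# CC and LDFLAGS are placeholders; the Makefile computes the real values.
CC="${CC:-cc}"
LDFLAGS="${LDFLAGS:-}"

# Hypothetical helper (not part of the repository).
print_bootstrap_plan() {
    # Stage 1: a C compiler builds the bootstrap compiler.
    # (Simplified: the Makefile builds obj/*.o files first and passes CFLAGS.)
    echo "stage1: $CC bootstrap_compiler/*.c -o jou_stage1 $LDFLAGS"
    # Stage 2: the bootstrap compiler builds the Jou compiler.
    echo "stage2: ./jou_stage1 -o jou_stage2 --linker-flags \"$LDFLAGS\" compiler/main.jou"
    # Stage 3: the stage 2 compiler builds the Jou compiler again.
    echo "stage3: ./jou_stage2 -o jou --linker-flags \"$LDFLAGS\" compiler/main.jou"
}

print_bootstrap_plan
```

In practice you should not need to run these commands by hand, because `make` runs the stages in order.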


## Editor support

Tell your editor to syntax-highlight `.jou` files as if they were Python files.
@@ -236,8 +264,3 @@ autoindent_regexes = {dedent = 'return( .+)?|break|pass|continue', indent = '.*:

To apply this configuration, copy/paste it to end of Porcupine's `filetypes.toml`
(menubar at top --> *Settings* --> *Config Files* --> *Edit filetypes.toml*).


## How does the compiler work?

See [CONTRIBUTING.md](CONTRIBUTING.md).
19 files renamed without changes.
3 changes: 3 additions & 0 deletions broken_tests/README.txt
@@ -0,0 +1,3 @@
Tests in this directory fail if they are moved into the tests folder.
This is because the self-hosted compiler doesn't analyze the CFGs.
See issue #566.
8 files renamed without changes.