Tan Li Hau (陈立豪) has written one of the best introductions to Babel I have ever read. He is also a prolific YouTuber. I strongly recommend that you follow his work, attend his lessons and read his books. These are my notes on his article "Creating custom JavaScript syntax with Babel" (September 25, 2019), available at https://lihautan.com/creating-custom-javascript-syntax-with-babel. Read them with Tan Li Hau's article at hand.
Let me show you what we will achieve at the end of this article. We are going to create a curry function syntax, @@. The syntax is like that of a generator function, except that you place @@ instead of * between the function keyword and the function name, e.g. function @@ name(arg1, arg2).
// '@@' makes the function `foo` curried
function @@ foo(a, b, c) {
  return a + b + c;
}
console.log(foo(1, 2)(3)); // 6
In this example, you can have partial application with the function foo. Calling foo with fewer arguments than the number of parameters it declares returns a new function that takes the remaining arguments:
foo(1, 2, 3); // 6
const bar = foo(1, 2); // (n) => 1 + 2 + n
bar(3); // 6
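For comparison, the behaviour that @@ is meant to provide can be approximated in today's JavaScript with a helper function. The sketch below is only an illustration: the name curry is hypothetical, it relies on fn.length to count the declared parameters, and it is not necessarily the code that Babel ends up generating.

```javascript
// Hypothetical helper: approximates in plain JavaScript what `function @@` is meant to do.
function curry(fn) {
  const arity = fn.length; // number of declared parameters
  return function curried(...args) {
    if (args.length >= arity) return fn(...args);
    // Not enough arguments yet: return a function that collects the rest.
    return (...rest) => curried(...args, ...rest);
  };
}

const foo = curry(function (a, b, c) {
  return a + b + c;
});

const bar = foo(1, 2); // partial application
console.log(foo(1, 2, 3)); // 6
console.log(bar(3)); // 6
console.log(foo(1)(2)(3)); // 6
```

The point of the new syntax is to get this behaviour without having to wrap each function by hand.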
I started by forking Tan Li Hau's fork of Babel, rather than the main Babel repo, and then cloned my fork, https://github.com/ULL-ESIT-PL/babel-tanhauhau:
gh repo clone ULL-ESIT-PL/babel-tanhauhau
The working space is in the learning/compiler-learning/babel-tanhauhau folder:
➜ babel-tanhauhau git:(learning) ✗ pwd -P
/Users/casianorodriguezleon/campus-virtual/2122/learning/compiler-learning/babel-tanhauhau
➜ babel-learning git:(main) date
June 2024, 12:30:50 WEST
➜ babel-learning git:(main) sw_vers
ProductName: macOS
ProductVersion: 14.5
BuildVersion: 23F79
➜ babel-learning git:(main) uname
Darwin
➜ babel-tanhauhau git:(feat/curry-function) ✗ node --version
v20.5.0
➜ babel-tanhauhau git:(feat/curry-function) ✗ npm --version
9.8.0
➜ babel-tanhauhau git:(feat/curry-function) ✗ nvm --version
0.35.3
There are currently three branches in the repository ULL-ESIT-PL/babel-tanhauhau:
➜ babel-tanhauhau git:(feat/curry-function) ✗ git -P branch
* feat/curry-function
learning
master
The branch feat/curry-function is the one with Tan Li Hau's solution.
The branch master is the original Babel repo. You can start from there and try to reproduce the article. Its last commit is from 4 years ago:
➜ babel-tanhauhau git:(master) ✗ git lg
e498bee10 - (HEAD -> master, upstream/master, origin/master, origin/HEAD) replace whitelist by allowlist in parser-tests (#11727) (4 years ago Huáng Jùnliàng)
fd3c76941 - [gitpod] Run "make watch" in a second terminal (#11718) (4 years ago Nicolò Ribaudo)
e15a5c750 - Fix innercomments (#11697) (4 years ago 骗你是小猫咪)
You can find the version I modified, starting from master, in the branch learning.
➜ babel-tanhauhau git:(learning) ✗ git diff --name-only master
.vscode/settings.json
packages/babel-parser/src/parser/expression.js
packages/babel-parser/src/parser/statement.js
packages/babel-parser/src/tokenizer/index.js
packages/babel-parser/src/tokenizer/types.js
packages/babel-parser/test/curry-function.cjs
packages/babel-parser/test/curry-function.js
Then I realized that I had to install yarn and gulp to build the project.
➜ babel-tanhauhau git:(master) pwd -P
/Users/casianorodriguezleon/campus-virtual/2122/learning/compiler-learning/babel-tanhauhau
babel-tanhauhau git:(master) npm i -g yarn
babel-tanhauhau git:(master) npm i -g gulp
Here are the versions I installed (notice that I'm in the babel-learning folder corresponding to the tutorial, not in the babel-tanhauhau folder corresponding to the cloned repo):
➜ babel-learning git:(main) yarn --version
1.22.22
➜ babel-learning git:(main) ✗ (cd babel-tanhauhau/ && gulp --version)
CLI version: 3.0.0
Local version: 4.0.2
Then I ran the bootstrap:
➜ babel-tanhauhau git:(master) make bootstrap
The first time, I was using Node v21.2.0 and nvm 0.35.3, and there were errors with node-gyp.
I found out that node-gyp is a cross-platform command-line tool written in Node.js for compiling native addon modules for Node.js.
It contains a vendored copy of the gyp-next project, which was previously used by the Chromium team and has been extended to support the development of Node.js native addons. Native modules are modules written in a language other than JavaScript, for example C++ addons, and made available to JavaScript through interfaces such as Node-API (formerly N-API).
These were the errors:
gyp info find Python using Python version 3.11.4 found at \"/Users/casianorodriguezleon/.pyenv/versions/3.11.4/bin/python\"
(node:29944) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
gyp ERR! UNCAUGHT EXCEPTION
gyp ERR! stack TypeError: Cannot assign to read only property 'cflags' of object '#<Object>'
gyp ERR! stack at createConfigFile (/Users/casianorodriguezleon/campus-virtual/2122/learning/compiler-learning/babel-tanhauhau/node_modules/node-gyp/lib/configure.js:118:21)
gyp ERR! stack at /Users/casianorodriguezleon/campus-virtual/2122/learning/compiler-learning/babel-tanhauhau/node_modules/node-gyp/lib/configure.js:85:9
gyp ERR! stack at /Users/casianorodriguezleon/campus-virtual/2122/learning/compiler-learning/babel-tanhauhau/node_modules/mkdirp/index.js:30:20
gyp ERR! stack at FSReqCallback.oncomplete (node:fs:189:23)
gyp ERR! System Darwin 23.5.0
gyp ERR! command \"/Users/casianorodriguezleon/.nvm/versions/node/v21.2.0/bin/node\" \"/Users/casianorodriguezleon/campus-virtual/2122/learning/compiler-learning/babel-tanhauhau/node_modules/node-gyp/bin/node-gyp.js\" \"configure\" \"--fallback-to-build\" \"--module=/Users/casianorodriguezleon/campus-virtual/2122/learning/compiler-learning/babel-tanhauhau/node_modules/fsevents/lib/binding/Release/node-v120-darwin-x64/fse.node\" \"--module_name=fse\" \"--module_path=/Users/casianorodriguezleon/campus-virtual/2122/learning/compiler-learning/babel-tanhauhau/node_modules/fsevents/lib/binding/Release/node-v120-darwin-x64\" \"--napi_version=9\" \"--node_abi_napi=napi\" \"--napi_build_version=0\" \"--node_napi_label=node-v120\"
gyp ERR! cwd /Users/casianorodriguezleon/campus-virtual/2122/learning/compiler-learning/babel-tanhauhau/node_modules/fsevents
gyp ERR! node -v v21.2.0
gyp ERR! node-gyp -v v5.0.5
gyp ERR! This is a bug in `node-gyp`.
gyp ERR! Try to update node-gyp and file an Issue if it does not help:
gyp ERR! <https://github.com/nodejs/node-gyp/issues>
node-pre-gyp ERR! build error
node-pre-gyp ERR! stack Error: Failed to execute '/Users/casianorodriguezleon/.nvm/versions/node/v21.2.0/bin/node /Users/casianorodriguezleon/campus-virtual/2122/learning/compiler-learning/babel-tanhauhau/node_modules/node-gyp/bin/node-gyp.js configure --fallback-to-build --module=/Users/casianorodriguezleon/campus-virtual/2122/learning/compiler-learning/babel-tanhauhau/node_modules/fsevents/lib/binding/Release/node-v120-darwin-x64/fse.node --module_name=fse --module_path=/Users/casianorodriguezleon/campus-virtual/2122/learning/compiler-learning/babel-tanhauhau/node_modules/fsevents/lib/binding/Release/node-v120-darwin-x64 --napi_version=9 --node_abi_napi=napi --napi_build_version=0 --node_napi_label=node-v120' (7)
node-pre-gyp ERR! stack at ChildProcess.<anonymous> (/Users/casianorodriguezleon/campus-virtual/2122/learning/compiler-learning/babel-tanhauhau/node_modules/fsevents/node_modules/node-pre-gyp/lib/util/compile.js:83:29)
node-pre-gyp ERR! stack at ChildProcess.emit (node:events:519:28)
node-pre-gyp ERR! stack at maybeClose (node:internal/child_process:1105:16)
node-pre-gyp ERR! stack at ChildProcess._handle.onexit (node:internal/child_process:305:5)
node-pre-gyp ERR! System Darwin 23.5.0
node-pre-gyp ERR! command \"/Users/casianorodriguezleon/.nvm/versions/node/v21.2.0/bin/node\" \"/Users/casianorodriguezleon/campus-virtual/2122/learning/compiler-learning/babel-tanhauhau/node_modules/fsevents/node_modules/node-pre-gyp/bin/node-pre-gyp\" \"install\" \"--fallback-to-build\"
node-pre-gyp ERR! cwd /Users/casianorodriguezleon/campus-virtual/2122/learning/compiler-learning/babel-tanhauhau/node_modules/fsevents
node-pre-gyp ERR! node -v v21.2.0
node-pre-gyp ERR! node-pre-gyp -v v0.12.0
node-pre-gyp ERR! not ok
Failed to execute '/Users/casianorodriguezleon/.nvm/versions/node/v21.2.0/bin/node /Users/casianorodriguezleon/campus-virtual/2122/learning/compiler-learning/babel-tanhauhau/node_modules/node-gyp/bin/node-gyp.js configure --fallback-to-build --module=/Users/casianorodriguezleon/campus-virtual/2122/learning/compiler-learning/babel-tanhauhau/node_modules/fsevents/lib/binding/Release/node-v120-darwin-x64/fse.node --module_name=fse --module_path=/Users/casianorodriguezleon/campus-virtual/2122/learning/compiler-learning/babel-tanhauhau/node_modules/fsevents/lib/binding/Release/node-v120-darwin-x64 --napi_version=9 --node_abi_napi=napi --napi_build_version=0 --node_napi_label=node-v120' (7)"
✨ Done in 29.07s.
Important
So I tried again, this time with Node v20.
➜ babel-tanhauhau git:(master) nvm use v20
Now using node v20.5.0 (npm v9.8.0)
➜ babel-tanhauhau git:(master) make bootstrap
It took a while to build the project, but there were no errors:
...
[12:33:49] Skipped minification of 'babel-tanhauhau/packages/babel-standalone/babel.js' because not publishing
[12:33:49] Finished 'build-babel-standalone' after 29 s
Important
I later tried Node 22.2.0 and also ran into problems, so pay attention to the version of Node you are using.
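Since the outcome of make bootstrap depended so much on the Node version, a small guard script can save a failed build. The following is a sketch only: the function name isSupportedNode and the rule "major version 20" are assumptions drawn from my experience above (v20.5.0 built cleanly, v21.2.0 and v22.2.0 did not); no other versions were tried.

```javascript
// Sketch of a guard to run before `make bootstrap`.
// Assumption: only Node major version 20 is accepted, because v20.5.0 was the
// version that built cleanly in these notes, while v21.2.0 and v22.2.0 failed.
const SUPPORTED_MAJOR = 20;

function isSupportedNode(version) {
  const major = Number.parseInt(version.split(".")[0], 10);
  return major === SUPPORTED_MAJOR;
}

const current = process.versions.node; // e.g. "20.5.0" (no leading "v")
if (isSupportedNode(current)) {
  console.log(`Node ${current}: OK`);
} else {
  console.warn(`Node ${current}: not known to work here, consider "nvm use v${SUPPORTED_MAJOR}"`);
}
```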
Next I ran make build:
babel-tanhauhau git:(master) make build
...
[12:37:46] Skipped minification of 'babel-tanhauhau/packages/babel-standalone/babel.js' because not publishing
[12:37:46] Finished 'build-babel-standalone' after 18 s
I then ran the tests.
babel-tanhauhau git:(master) ✗ make test
BABEL_ENV=test yarn --silent eslint scripts packages codemods eslint '*.js' --format=codeframe
Most of them passed but there were some errors. For instance:
FAIL packages/babel-plugin-transform-dotall-regex/test/index.js
● babel-plugin-transform-dotall-regex/dotall regex › with unicode property escape
I will try to find out the reason later.
There are several scripts in package.json that are aliases of make commands:
➜ babel-tanhauhau git:(adrian-casiano) ✗ jq '.scripts' package.json
{
  "bootstrap": "make bootstrap",
  "codesandbox": "make bootstrap-only; make build-no-bundle",
  "build": "make build",
  "fix": "make fix",
  "lint": "make lint",
  "test": "make test"
}
See section doc/vscode-flow-config.md on how to configure VSCode to work with Babel files in Flow.
See also section doc/vscode-typescript-config.md on how to configure VSCode to work with TypeScript files.
See section /doc/git-hooks-configuration.md on how to survive with pre-commit hooks.
export HUSKY=0 # Disables all Git hooks
We can take advantage of npx to have the executables of our Babel version at hand by creating a symbolic link, mybabel, to our version of the babel.js script in the node_modules/.bin folder.
➜ babel-learning git:(main) pwd -P
/Users/casianorodriguezleon/campus-virtual/2324/learning/babel-learning
➜ babel-learning git:(main) ✗ cd node_modules/.bin
➜ .bin git:(main) ✗ ln -s /Users/casianorodriguezleon/campus-virtual/2122/learning/compiler-learning/babel-tanhauhau/packages/babel-cli/bin/babel.js mybabel
➜ babel-learning git:(main) ✗ cd ../..
➜ babel-learning git:(main) ✗ chmod a+x node_modules/.bin/mybabel
➜ babel-learning git:(main) ✗ npx mybabel --version
7.10.1 (@babel/core 7.10.2)
➜ babel-learning git:(main) ✗ ln -s /Users/casianorodriguezleon/campus-virtual/2122/learning/compiler-learning/babel-tanhauhau-adrian/packages/babel-parser/bin/babel-parser.js node_modules/.bin/adrianparser
➜ babel-learning git:(main) ✗ ln -s /Users/casianorodriguezleon/campus-virtual/2122/learning/compiler-learning/babel-tanhauhau-adrian/packages/babel-cli/bin/babel.js node_modules/.bin/adrianbabel
We can do the same with the parser so that we can use it from the babel-learning folder by just running npx myparser:
➜ babel-learning git:(main) ln -s /Users/casianorodriguezleon/campus-virtual/2122/learning/compiler-learning/babel-tanhauhau/packages/babel-parser/bin/babel-parser.js node_modules/.bin/myparser
Now we can use the parser from the babel-learning folder:
➜ babel-learning git:(main) npx myparser src/tan-liu-article/example.js
➜ babel-learning git:(main) ✗ npx myparser src/tan-liu-article/example.js > ast.json
➜ babel-learning git:(main) ✗ jq '.program.body[0].curry' ast.json
true
I created a symbolic link to the babel-tanhauhau folder, which contains the cloned Babel, inside the babel-learning folder, which contains this tutorial:
➜ babel-learning git:(main) ✗ pwd -P
/Users/casianorodriguezleon/campus-virtual/2324/learning/babel-learning # <- this tutorial
➜ babel-learning git:(main) ls -l babel-tanhauhau
lrwxr-xr-x 1 casianorodriguezleon staff 90 30 may 12:02 babel-tanhauhau -> /Users/casianorodriguezleon/campus-virtual/2122/learning/compiler-learning/babel-tanhauhau #<- the cloned babel repo
So now I can work in the babel-tanhauhau folder from the babel-learning folder.
This way, in the future, when we have the lexical analysis and parsing phases implemented, we can, for instance, use the parser on examples like this one in the babel-learning folder:
➜ babel-learning git:(main) ✗ cat src/tan-liu-article/example.js
// '@@' makes the function `foo` curried
function @@ foo(a, b, c) {
  return a + b + c;
}
console.log(foo(1, 2)(3)); // 6
To use the parser in the babel-tanhauhau folder, I can simply call the bin/babel-parser.js script of its babel-parser package:
➜ babel-learning git:(main) babel-tanhauhau/packages/babel-parser/bin/babel-parser.js src/tan-liu-article/example.js |\
jq '.program.body[0]' > salida.json
Of course, this assumes that the working copy in the babel-tanhauhau folder is on a branch with the changes implemented, like feat/curry-function:
➜ babel-learning git:(main) (cd babel-tanhauhau/ && git -P branch )
* feat/curry-function
learning
master
And here is the AST that was stored in salida.json, shown in YAML format:
➜ babel-learning git:(main) ✗ compast -n salida.json
type: "FunctionDeclaration"
id:
  type: "Identifier"
  name: "foo"
generator: false
async: false
curry: true ◀︎ look at this 🔔!!
params:
  - type: "Identifier"
    name: "a"
  - type: "Identifier"
    name: "b"
  - type: "Identifier"
    name: "c"
body:
  type: "BlockStatement"
  body:
    - type: "ReturnStatement"
      argument:
        type: "BinaryExpression"
        left:
          type: "BinaryExpression"
          left:
            type: "Identifier"
            name: "a"
          operator: "+"
          right:
            type: "Identifier"
            name: "b"
        operator: "+"
        right:
          type: "Identifier"
          name: "c"
  directives: []
Notice the curry: true attribute in the AST, marking the function as one to be curried during the subsequent transformation phases.
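To make the role of that flag concrete, here is a minimal sketch of how a later phase could locate the functions marked for currying. It uses a plain recursive walk over a trimmed-down copy of the AST above; the helper name findCurriedFunctions is made up for this illustration, and a real Babel plugin would rely on Babel's own traversal machinery instead.

```javascript
// A trimmed-down copy of the AST above; only the fields needed here are kept.
const ast = {
  type: "FunctionDeclaration",
  id: { type: "Identifier", name: "foo" },
  generator: false,
  async: false,
  curry: true,
  params: [
    { type: "Identifier", name: "a" },
    { type: "Identifier", name: "b" },
    { type: "Identifier", name: "c" },
  ],
  body: { type: "BlockStatement", body: [], directives: [] },
};

// Hypothetical helper: collect the names of all function declarations
// flagged with `curry: true`, visiting every nested object and array.
function findCurriedFunctions(node, found = []) {
  if (Array.isArray(node)) {
    node.forEach((child) => findCurriedFunctions(child, found));
  } else if (node && typeof node === "object") {
    if (node.type === "FunctionDeclaration" && node.curry === true) {
      found.push(node.id.name);
    }
    Object.values(node).forEach((child) => findCurriedFunctions(child, found));
  }
  return found;
}

console.log(findCurriedFunctions(ast)); // [ 'foo' ]
```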
I advise you to do the same while you are learning.
Important
The make bootstrap and make build steps are unbearably slow!
Remember: you have to run make build again every time you switch to a branch with a new version of your parser.
One way to overcome this is to use git worktree add to have a working space for each branch. A git repository can support multiple working trees, allowing you to check out more than one branch at a time. With git worktree add, a new working tree is associated with the repository. This new working tree is called a "linked working tree", as opposed to the "main working tree" prepared by git init or git clone. A repository has one main working tree and zero or more linked working trees. When you are done with a linked working tree, remove it with git worktree remove.
➜ babel-tanhauhau git:(learning) ✗ git worktree add ../babel-tanhauhau-feat-curry-function feat/curry-function
HEAD is now at b793efad1 function hoisting
If you have several branches holding different versions of the compiler, you can create a working tree for each branch:
➜ babel-tanhauhau git:(learning) ✗ git worktree add ../babel-tanhauhau-adrian adrian
➜ babel-tanhauhau git:(learning) ✗ git worktree add ../babel-tanhauhau-pablo pablo
➜ babel-tanhauhau git:(learning) ✗ ls -l ../ | grep babel-tanhauhau
drwxr-xr-x 42 casianorodriguezleon staff 1344 5 nov 11:24 babel-tanhauhau
drwxr-xr-x@ 39 casianorodriguezleon staff 1248 5 nov 11:43 babel-tanhauhau-adrian
drwxr-xr-x 37 casianorodriguezleon staff 1184 12 jun 13:52 babel-tanhauhau-feat-curry-function
drwxr-xr-x@ 39 casianorodriguezleon staff 1248 5 nov 11:43 babel-tanhauhau-pablo
Now we have, among others, a new working tree in the babel-tanhauhau-feat-curry-function folder.
We can switch to it and set up the Babel project there:
➜ babel-tanhauhau git:(learning) ✗ cd ../babel-tanhauhau-feat-curry-function
➜ babel-tanhauhau-feat-curry-function git:(feat/curry-function) nvm use default
Now using node v20.5.0 (npm v9.8.0)
➜ babel-tanhauhau git:(feat/curry-function) ✗ make bootstrap
➜ babel-tanhauhau git:(feat/curry-function) ✗ make build
Now we can switch between the two workspaces at no cost.
➜ compiler-learning pwd -P
/Users/casianorodriguezleon/campus-virtual/2122/learning/compiler-learning
➜ compiler-learning ls -l | grep babel
480 27 sep 2023 ast-traversal-babel
71 28 may 13:51 babel-learning -> ~/campus-virtual/2324/learning/babel-learning
41 10 jun 13:25 babel-tanhauhau
36 10 jun 13:45 babel-tanhauhau-feat-curry-function
Now that we have Tan's Babel version in the babel-tanhauhau-feat-curry-function folder, we can make a symbolic link to Tan's version of the babel executable and run it with npx:
➜ babel-learning git:(main) ln -s /Users/casianorodriguezleon/campus-virtual/2122/learning/compiler-learning/babel-tanhauhau-feat-curry-function/packages/babel-cli/bin/babel.js node_modules/.bin/tanbabel
➜ babel-learning git:(main) npx tanbabel --version
7.6.0 (@babel/core 7.6.0)
Let us also make a symbolic link to Tan's version of Babel:
➜ babel-learning git:(main) ln -s /Users/casianorodriguezleon/campus-virtual/2122/learning/compiler-learning/babel-tanhauhau-feat-curry-function
➜ babel-learning git:(main) ✗ ls -tr | tail -1
babel-tanhauhau-feat-curry-function
The version of babel-cli in Tan Li Hau's branch is the 7.6.0 reported above by npx tanbabel --version.
First, let us install the js-beautify package:
babel-learning git:(main) npm -g install js-beautify
added 52 packages in 5s
14 packages are looking for funding
run `npm fund` for details
Once you have
- created the symbolic link,
- switched to the branch feat/curry-function in the cloned workspace of Tan's Babel, and
- run make bootstrap and make build,
you can use the babel-tanhauhau/packages/babel-cli/bin/babel.js CLI to transform the code in the file babel-learning/src/tan-liu-article/example.js as follows:
➜ babel-learning git:(main) ✗ babel-tanhauhau-feat-curry-function/packages/babel-cli/bin/babel.js src/tan-liu-article/example.js --plugins=./babel-tanhauhau-feat-curry-function/packages/babel-plugin-transform-curry-function | js-beautify -
Or, more conveniently, using the tanbabel link:
➜ babel-learning git:(main) ✗ npx tanbabel src/tan-liu-article/example.js --plugins=./babel-tanhauhau-feat-curry-function/packages/babel-plugin-transform-curry-function | js-beautify -
// '@@' makes the function `foo` curried
const foo = _currying(function(a, b, c) {
    return a + b + c;
});
function _currying(fn) {
    const numParamsRequired = fn.length;
    function curryFactory(params) {
        return function(...args) {
            const newParams = params.concat(args);
            if (newParams.length >= numParamsRequired) {
                return fn(...newParams);
            }
            return curryFactory(newParams);
        };
    }
    return curryFactory([]);
}
console.log(foo(1, 2)(3)); // 6
We can even pipe the output to node and see it running!
➜ babel-learning git:(main) babel-tanhauhau/packages/babel-cli/bin/babel.js \
src/tan-liu-article/example.js \
--plugins=./babel-tanhauhau/packages/babel-plugin-transform-curry-function | node
6
Babel uses a monorepo structure: all the packages, e.g. @babel/core, @babel/parser, @babel/plugin-transform-react-jsx, etc., are in the packages/ folder:
➜ babel-tanhauhau git:(master) tree -aL 1
.
├── Gulpfile.js
├── Makefile
├── README.md
├── babel.config.js
├── codemods
├── doc
├── jest.config.js
├── lerna.json
├── lib
├── node_modules
├── package.json
├── packages
│   ├── babel-cli
│   ├── babel-generator
│   ├── babel-helper-...
│   ├── babel-node
│   ├── babel-parser
│   ├── babel-plugin-...
│   ├── babel-plugin-proposal-async-generator-functions
│   ├── babel-plugin-syntax-...
│   ├── babel-plugin-syntax-jsx
│   ├── babel-plugin-syntax-typescript
│   ├── babel-plugin-transform-...
│   ├── babel-polyfill
│   ├── babel-preset-env
│   ├── babel-preset-flow
│   ├── babel-preset-react
│   ├── babel-preset-typescript
│   ├── babel-register
│   ├── babel-runtime...
│   ├── babel-standalone
│   ├── babel-template
│   ├── babel-traverse
│   └── babel-types
├── scripts
└── test
The folder we are going to work on is packages/babel-parser/:
➜ babel-tanhauhau git:(master) cd packages/babel-parser
➜ babel-parser git:(master) tree -I node_modules -aL 2
.
├── AUTHORS
├── CHANGELOG.md
├── LICENSE
├── README.md
├── ast
│ ├── flow.md
│ ├── jsx.md
│ └── spec.md
├── bin
│ └── babel-parser.js
├── lib
│ └── index.js
├── package.json
├── src
│ ├── index.js
│ ├── options.js
│ ├── parser
│ │ ├── base.js
│ │ ├── comments.js
│ │ ├── error-message.js
│ │ ├── error.js
│ │ ├── expression.js
│ │ ├── index.js
│ │ ├── lval.js
│ │ ├── node.js
│ │ ├── statement.js
│ │ └── util.js
│ ├── plugin-utils.js
│ ├── plugins
│ │ ├── estree.js
│ │ ├── flow.js
│ │ ├── jsx
│ │ ├── placeholders.js
│ │ ├── typescript
│ │ └── v8intrinsic.js
│ ├── tokenizer
│ │ ├── context.js
│ │ ├── index.js
│ │ ├── state.js
│ │ └── types.js
│ ├── types.js
│ └── util
│ ├── class-scope.js
│ ├── identifier.js
│ ├── location.js
│ ├── production-parameter.js
│ ├── scope.js
│ ├── scopeflags.js
│ └── whitespace.js
├── test
│ ├── estree-throws.js
│ ├── expressions
│ ├── expressions.js
│ ├── fixtures
│ ├── helpers
│ ├── index.js
│ ├── plugin-options.js
│ └── unit
└── typings
└── babel-parser.d.ts
14 directories, 19 files
We've talked about tokenization and parsing; now it's clear where to find the code for each process: src/tokenizer and src/parser.
The plugins/ folder contains plugins that extend the base parser and add custom syntaxes, such as jsx, flow and typescript.
See section /doc/parser/debugging.md to see how to use Chrome to debug the parser.
Let's do test-driven development (TDD). I find it easier to define the test case first and then slowly work our way to "fix" it. This is especially true in an unfamiliar codebase: TDD lets you "easily" pinpoint the places in the code that you need to change.
I copied the test file packages/babel-parser/test/curry-function.js from the article:
➜ babel-parser git:(master) ✗ cat test/curry-function.js
import { parse } from '../lib';
function getParser(code) {
  return () => parse(code, { sourceType: 'module' });
}
describe('curry function syntax', function() {
  it('should parse', function() {
    expect(getParser(`function @@ foo() {}`)()).toMatchSnapshot();
  });
});
The tests are written with Jest: toMatchSnapshot is a Jest matcher.
See for instance
- the script ULL-ESIT-PL/babel-tanhauhau/scripts/test.sh
- ULL-ESIT-PL/babel-tanhauhau/packages/babel-parser/test/unit/tokenizer/types.js
To run the tests for a package we can use the make test-only command, specifying
- the package, with the TEST_ONLY environment variable, and
- the pattern that the test descriptions must match, with the TEST_GREP environment variable:
➜ babel-tanhauhau git:(master) ✗ TEST_ONLY=babel-parser TEST_GREP="token types" make test-only
BABEL_ENV=test ./scripts/test.sh
PASS packages/babel-parser/test/unit/tokenizer/types.js
Test Suites: 7 skipped, 1 passed, 1 of 8 total
Tests: 5253 skipped, 3 passed, 5256 total
Snapshots: 0 total
Time: 7.01s
Ran all test suites matching /(packages|codemods|eslint)\/.*babel-parser.*\/test/i with tests matching "token types".
/Applications/Xcode.app/Contents/Developer/usr/bin/make test-clean
rm -rf packages/*/test/tmp
rm -rf packages/*/test-fixtures.json
rm -rf codemods/*/test/tmp
rm -rf codemods/*/test-fixtures.json
rm -rf eslint/*/test/tmp
rm -rf eslint/*/test-fixtures.json
The index.js file in the lib folder exports an object with parse, parseExpression and tokTypes properties:
➜ babel-tanhauhau git:(master) ✗ cd packages/babel-parser
➜ babel-parser git:(master) ✗ node
> B = require("./lib")
{
parse: [Function: parse],
parseExpression: [Function: parseExpression],
tokTypes: [Getter]
}
We can get the AST for the code 1 with B.parseExpression("1"). The AST spec is at packages/babel-parser/ast/spec.md:
> B.parseExpression("1")
Node {
type: 'NumericLiteral',
start: 0,
end: 1,
loc: SourceLocation {
start: Position { line: 1, column: 0 },
end: Position { line: 1, column: 1 }
},
extra: { rawValue: 1, raw: '1' },
value: 1,
comments: [],
errors: []
}
The tokTypes property is a getter that returns an object with the token types:
> B.tokTypes.num
TokenType {
label: 'num',
keyword: undefined,
beforeExpr: false,
startsExpr: true,
rightAssociative: false,
isLoop: false,
isAssign: false,
prefix: false,
postfix: false,
binop: null,
updateContext: null
}
> B.tokTypes.exponent
TokenType {
label: '**',
keyword: undefined,
beforeExpr: true,
startsExpr: false,
rightAssociative: true,
isLoop: false,
isAssign: false,
prefix: false,
postfix: false,
binop: 11,
updateContext: null
}
> B.tokTypes.star
TokenType {
label: '*',
keyword: undefined,
beforeExpr: true,
startsExpr: false,
rightAssociative: false,
isLoop: false,
isAssign: false,
prefix: false,
postfix: false,
binop: 10,
updateContext: [Function (anonymous)]
}
> B.tokTypes.plusMin
TokenType {
label: '+/-',
keyword: undefined,
beforeExpr: true,
startsExpr: true,
rightAssociative: false,
isLoop: false,
isAssign: false,
prefix: true,
postfix: false,
binop: 9,
updateContext: null
}
> B.tokTypes.incDec
TokenType {
label: '++/--',
keyword: undefined,
beforeExpr: false,
startsExpr: true,
rightAssociative: false,
isLoop: false,
isAssign: false,
prefix: true,
postfix: true,
binop: null,
updateContext: [Function (anonymous)]
}
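The binop numbers in these token types are precedences: a larger number binds tighter, and rightAssociative marks operators such as ** that group to the right. The toy evaluator below is a sketch of precedence climbing driven by a table with the values shown above (9 for + and -, 10 for *, 11 for **). It is not Babel's code: the names OPS, tokenize and parseExpr are made up for the illustration, and it only handles non-negative integers without parentheses.

```javascript
// Toy table mirroring the `binop` and `rightAssociative` fields printed above.
const OPS = {
  "+": { binop: 9, rightAssociative: false, apply: (a, b) => a + b },
  "-": { binop: 9, rightAssociative: false, apply: (a, b) => a - b },
  "*": { binop: 10, rightAssociative: false, apply: (a, b) => a * b },
  "**": { binop: 11, rightAssociative: true, apply: (a, b) => a ** b },
};

// Split the input into numbers and operators ("**" is tried before "*").
function tokenize(src) {
  return src.match(/\d+|\*\*|[+\-*]/g) || [];
}

// Precedence climbing: parse an operand, then keep absorbing operators
// whose precedence is at least `minPrec`.
function parseExpr(tokens, minPrec = 0) {
  let left = Number(tokens.shift());
  while (tokens.length > 0) {
    const op = OPS[tokens[0]];
    if (!op || op.binop < minPrec) break;
    tokens.shift();
    // A left-associative operator demands strictly higher precedence on its right side.
    const nextMin = op.rightAssociative ? op.binop : op.binop + 1;
    const right = parseExpr(tokens, nextMin);
    left = op.apply(left, right);
  }
  return left;
}

console.log(parseExpr(tokenize("1 + 2 * 3")));   // 7
console.log(parseExpr(tokenize("2 ** 3 ** 2"))); // 512
console.log(parseExpr(tokenize("10 - 4 - 3")));  // 3
```

The second example is the one where rightAssociative matters: grouping to the left would give (2 ** 3) ** 2 = 64 instead of 2 ** (3 ** 2) = 512.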
We have to run make test-only from the root of the project, not from the package folder:
➜ babel-parser git:(master) ✗ TEST_ONLY=babel-parser TEST_GREP="curry function" make test-only
make: *** No rule to make target `test-only'. Stop.
➜ babel-parser git:(master) ✗ cd ..
➜ packages git:(master) ✗ cd ..
➜ babel-tanhauhau git:(master) ✗ TEST_ONLY=babel-parser TEST_GREP="curry function" make test-only
BABEL_ENV=test ./scripts/test.sh
FAIL packages/babel-parser/test/curry-function.js
● curry function syntax › should parse
SyntaxError: Unexpected token (1:9)
752 |
753 | _raise(errorContext, message) {
> 754 | const err = new SyntaxError(message);
| ^
755 | Object.assign(err, errorContext);
756 |
757 | if (this.options.errorRecovery) {
at Parser._raise (packages/babel-parser/lib/index.js:754:17)
at Parser.raiseWithData (packages/babel-parser/lib/index.js:747:17)
at Parser.raise (packages/babel-parser/lib/index.js:741:17)
at Parser.unexpected (packages/babel-parser/lib/index.js:8844:16)
at Parser.parseIdentifierName (packages/babel-parser/lib/index.js:10863:18)
at Parser.parseIdentifier (packages/babel-parser/lib/index.js:10840:23)
at Parser.parseFunctionId (packages/babel-parser/lib/index.js:11927:55)
at Parser.parseFunction (packages/babel-parser/lib/index.js:11893:22)
at Parser.parseFunctionStatement (packages/babel-parser/lib/index.js:11542:17)
at Parser.parseStatementContent (packages/babel-parser/lib/index.js:11234:21)
Test Suites: 1 failed, 7 skipped, 1 of 8 total
Tests: 1 failed, 5255 skipped, 5256 total
Snapshots: 0 total
Time: 6.598s, estimated 11s
Ran all test suites matching /(packages|codemods|eslint)\/.*babel-parser.*\/test/i with tests matching "curry function".
make: *** [test-only] Error 1
The environment variables TEST_ONLY=babel-parser and TEST_GREP="curry function" restrict the run to the babel-parser tests whose description matches the string curry function.
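The last lines of the Jest output show the path pattern that was actually used. As a sketch, the selection can be reproduced with ordinary regular expressions. How scripts/test.sh assembles the pattern is not shown in these notes, so the template below is an assumption that merely reproduces the pattern Jest printed; the list of files is a small sample taken from paths that appear in this document.

```javascript
// The path pattern reported by Jest in the output above, rebuilt from TEST_ONLY.
// Assumption: this template only mimics the printed pattern, not scripts/test.sh itself.
const TEST_ONLY = "babel-parser";
const TEST_GREP = "curry function";

const pathPattern = new RegExp(`(packages|codemods|eslint)/.*${TEST_ONLY}.*/test`, "i");

const files = [
  "packages/babel-parser/test/curry-function.js",
  "packages/babel-parser/test/unit/tokenizer/types.js",
  "packages/babel-plugin-transform-dotall-regex/test/index.js",
];
const selectedFiles = files.filter((file) => pathPattern.test(file));
console.log(selectedFiles); // only the two babel-parser files

// TEST_GREP is then matched against the test name (describe title plus it title).
const testName = "curry function syntax should parse";
console.log(new RegExp(TEST_GREP).test(testName)); // true
```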
The same thing happens when I run the test using jest directly:
➜ babel-tanhauhau git:(master) ✗ BABEL_ENV=test npx jest -u packages/babel-parser/test/curry-function.js
FAIL packages/babel-parser/test/curry-function.js
curry function syntax
✕ should parse (6ms)
● curry function syntax › should parse
SyntaxError: Unexpected token (1:9)
752 |
753 | _raise(errorContext, message) {
> 754 | const err = new SyntaxError(message);
| ^
755 | Object.assign(err, errorContext);
756 |
757 | if (this.options.errorRecovery) {
at Parser._raise (packages/babel-parser/lib/index.js:754:17)
at Parser.raiseWithData (packages/babel-parser/lib/index.js:747:17)
at Parser.raise (packages/babel-parser/lib/index.js:741:17)
at Parser.unexpected (packages/babel-parser/lib/index.js:8844:16)
at Parser.parseIdentifierName (packages/babel-parser/lib/index.js:10863:18)
at Parser.parseIdentifier (packages/babel-parser/lib/index.js:10840:23)
at Parser.parseFunctionId (packages/babel-parser/lib/index.js:11927:55)
at Parser.parseFunction (packages/babel-parser/lib/index.js:11893:22)
at Parser.parseFunctionStatement (packages/babel-parser/lib/index.js:11542:17)
at Parser.parseStatementContent (packages/babel-parser/lib/index.js:11234:21)
Test Suites: 1 failed, 1 total
Tests: 1 failed, 1 total
Snapshots: 0 total
Time: 1.273s
Ran all test suites matching /packages\/babel-parser\/test\/curry-function.js/i.
The environment variable BABEL_ENV=test sets Babel's environment to test.
Our parser found 2 seemingly innocent @ tokens at a place where they shouldn't be present.
How do I know that? Let's start the watch mode, make watch, wear our detective cap 🕵️ and start digging!
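A first clue is the (1:9) in the error message, which is a position in the source: line 1, column 9. The minimal sketch below, assuming 1-based lines and 0-based columns (which is consistent with the loc values printed in the REPL session earlier), shows that this is exactly where the first @ sits. The helper name charAtPosition is made up for the illustration.

```javascript
// Map a "line:column" position from a parser error back to the source text.
// Assumption: lines are 1-based and columns are 0-based.
function charAtPosition(code, line, column) {
  return code.split("\n")[line - 1][column];
}

const code = "function @@ foo() {}";
console.log(charAtPosition(code, 1, 9)); // @
console.log(JSON.stringify(code.slice(0, 9))); // "function " (the keyword plus one space)
```

So the parser consumed the function keyword, expected an identifier for the function name, and stumbled on @ instead, which matches the parseFunctionId and parseIdentifier frames in the stack trace.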
You can access the built files for individual packages from packages/<package-name>/lib.
First: the babel command-line script has a -w (--watch) option that watches the source files and recompiles them when they change.
See also https://www.npmjs.com/package/babel-watch.
In the Makefile we find this watch target:
watch: build-no-bundle
BABEL_ENV=development $(YARN) gulp watch
and in the Gulpfile.js we find the watch task:
gulp.task(
  "watch",
  gulp.series("build-no-bundle", function watch() {
    gulp.watch(defaultSourcesGlob, gulp.task("build-no-bundle"));
  })
);
The make watch target lets Babel build itself and then rebuild files incrementally as they change. This way we can see the effect of the changes we are going to make in the tokenizer and parser without having to rebuild the whole project.
When I run make watch I see the following output:
➜ babel-tanhauhau git:(master) ✗ make watch
rm -rf packages/*/test/tmp
rm -rf packages/*/test-fixtures.json
rm -rf codemods/*/test/tmp
rm -rf codemods/*/test-fixtures.json
rm -rf eslint/*/test/tmp
rm -rf eslint/*/test-fixtures.json
rm -f .npmrc
rm -rf packages/babel-polyfill/browser*
rm -rf packages/babel-polyfill/dist
rm -rf coverage
rm -rf packages/*/npm-debug*
rm -rf packages/*/lib
rm -rf codemods/*/lib
rm -rf eslint/*/lib
BABEL_ENV=development yarn --silent gulp build-no-bundle
[16:25:24] Using gulpfile ~/campus-virtual/2122/learning/compiler-learning/babel-tanhauhau/gulpfile.js
[16:25:24] Starting 'build-no-bundle'...
[16:25:25] Compiling 'codemods/babel-plugin-codemod-object-assign-to-object-spread/src/index.js'...
[16:25:25] Compiling 'codemods/babel-plugin-codemod-optional-catch-binding/src/index.js'...
@babel/preset-env: `DEBUG` option
Using targets:
{
"node": "21.2"
}
Using modules transform: false
Using plugins:
syntax-numeric-separator { "node":"21.2" }
proposal-class-properties { "node":"21.2" }
proposal-private-methods { "node":"21.2" }
syntax-nullish-coalescing-operator { "node":"21.2" }
syntax-optional-chaining { "node":"21.2" }
syntax-json-strings { "node":"21.2" }
syntax-optional-catch-binding { "node":"21.2" }
syntax-async-generators { "node":"21.2" }
syntax-object-rest-spread { "node":"21.2" }
syntax-dynamic-import { "node":"21.2" }
Using polyfills: No polyfills were added, since the `useBuiltIns` option was not set.
[16:25:25] Compiling 'eslint/babel-eslint-parser/src/analyze-scope.js'...
... # hundreds of lines of "Compiling" messages
[16:25:32] Compiling 'packages/babel-types/src/validators/react/isCompatTag.js'...
[16:25:32] Compiling 'packages/babel-types/src/validators/react/isReactComponent.js'...
[16:25:32] Finished 'build-no-bundle' after 7.76 s
# Ensure that build artifacts for types are created during local
# development too.
/Applications/Xcode.app/Contents/Developer/usr/bin/make generate-type-helpers
yarn --silent node packages/babel-types/scripts/generateTypeHelpers.js
Generating @babel/types dynamic functions
✔ Generated builders
✔ Generated validators
✔ Generated asserts
✔ Generated constants
/Applications/Xcode.app/Contents/Developer/usr/bin/make build-typings
yarn --silent node packages/babel-types/scripts/generators/flow.js > packages/babel-types/lib/index.js.flow
yarn --silent node packages/babel-types/scripts/generators/typescript.js > packages/babel-types/lib/index.d.ts
BABEL_ENV=development yarn --silent gulp watch
[16:25:35] Using gulpfile ~/campus-virtual/2122/learning/compiler-learning/babel-tanhauhau/gulpfile.js
[16:25:35] Starting 'watch'...
[16:25:35] Starting 'build-no-bundle'...
[16:25:35] Compiling 'packages/babel-types/src/asserts/generated/index.js'...
@babel/preset-env: `DEBUG` option
Using targets:
{
"node": "21.2"
}
Using modules transform: false
Using plugins:
syntax-numeric-separator { "node":"21.2" }
proposal-class-properties { "node":"21.2" }
proposal-private-methods { "node":"21.2" }
syntax-nullish-coalescing-operator { "node":"21.2" }
syntax-optional-chaining { "node":"21.2" }
syntax-json-strings { "node":"21.2" }
syntax-optional-catch-binding { "node":"21.2" }
syntax-async-generators { "node":"21.2" }
syntax-object-rest-spread { "node":"21.2" }
syntax-dynamic-import { "node":"21.2" }
Using polyfills: No polyfills were added, since the `useBuiltIns` option was not set.
[16:25:36] Compiling 'packages/babel-types/src/builders/generated/index.js'...
[16:25:36] Compiling 'packages/babel-types/src/constants/generated/index.js'...
[16:25:36] Compiling 'packages/babel-types/src/validators/generated/index.js'...
[16:25:36] Finished 'build-no-bundle' after 1.19 s
[16:25:36] Starting 'watch'...
At this point it stays running, waiting for any of the Babel source files to change and rebuilding the compiler when needed.
Now we run the test again:
➜ babel-tanhauhau git:(master) ✗ TEST_ONLY=babel-parser TEST_GREP="curry function" make test-only
BABEL_ENV=test ./scripts/test.sh
FAIL packages/babel-parser/test/curry-function.js
● curry function syntax › should parse
SyntaxError: Unexpected token (1:9)
41 |
42 | _raise(errorContext, message) {
> 43 | const err = new SyntaxError(message);
| ^
44 | Object.assign(err, errorContext);
45 |
46 | if (this.options.errorRecovery) {
at Parser._raise (packages/babel-parser/lib/parser/error.js:43:17)
at Parser.raiseWithData (packages/babel-parser/lib/parser/error.js:36:17)
at Parser.raise (packages/babel-parser/lib/parser/error.js:30:17)
at Parser.unexpected (packages/babel-parser/lib/parser/util.js:109:16)
at Parser.parseIdentifierName (packages/babel-parser/lib/parser/expression.js:1515:18) <--- here
at Parser.parseIdentifier (packages/babel-parser/lib/parser/expression.js:1492:23)
at Parser.parseFunctionId (packages/babel-parser/lib/parser/statement.js:847:63)
at Parser.parseFunction (packages/babel-parser/lib/parser/statement.js:813:22)
at Parser.parseFunctionStatement (packages/babel-parser/lib/parser/statement.js:462:17)
at Parser.parseStatementContent (packages/babel-parser/lib/parser/statement.js:154:21)
Test Suites: 1 failed, 7 skipped, 1 of 8 total
Tests: 1 failed, 5255 skipped, 5256 total
Snapshots: 0 total
Time: 8.269s, estimated 11s
Ran all test suites matching /(packages|codemods|eslint)\/.*babel-parser.*\/test/i with tests matching "curry function".
make: *** [test-only] Error 1
Tracing the stack trace leads us to `packages/babel-parser/src/parser/expression.js`, where it throws `this.unexpected()`.
Correct! See the line `at Parser.parseIdentifierName (packages/babel-parser/lib/parser/expression.js:1515:18)` in the stack trace above.
Let us add some `console.log`:
Tan Li proposes to go to the file `packages/babel-parser/src/parser/expression.js` and add some `console.log` calls to see what is happening.
parseIdentifierName(pos: number, liberal?: boolean): string {
if (this.match(tt.name)) {
// ...
} else {
console.log(this.state.type); // current token
console.log(this.lookahead().type); // next token
throw this.unexpected();
}
}
How do I know `this.state.type` and `this.lookahead().type` will give me the current and the next token?
Well, I'll explain them later.
Let's recap what we've done so far before we move on:
- We've written a test case for babel-parser
- We ran `make test-only` to run the test case
- We've started the watch mode via `make watch`
- We've learned about parser state, and logged the current token type, `this.state.type`, to the console
Here is the full code of the previous version of the function:
parseIdentifierName(pos: number, liberal?: boolean): string {
let name: string;
if (this.match(tt.name)) {
name = this.state.value;
} else if (this.state.type.keyword) {
name = this.state.type.keyword;
// `class` and `function` keywords push function-type token context into this.context.
// But there is no chance to pop the context if the keyword is consumed
// as an identifier such as a property name.
const context = this.state.context;
if (
(name === "class" || name === "function") &&
context[context.length - 1].token === "function"
) {
context.pop();
}
} else {
throw this.unexpected();
}
if (liberal) {
// If the current token is not used as a keyword, set its type to "tt.name".
// This will prevent this.next() from throwing about unexpected escapes.
this.state.type = tt.name;
} else {
this.checkReservedWord(
name,
this.state.start,
!!this.state.type.keyword,
false,
);
}
this.next();
return name;
}
Next to the function I've got several warnings in VSCode stating that "type annotations can only be used in typescript". The solution adopted is described in section doc/vscode-typescript-config.md.
So I added the logging code above to the function `parseIdentifierName` in the file `packages/babel-parser/src/parser/expression.js` and watched the `make watch` terminal report the recompilation of the files that I had changed:
Using polyfills: No polyfills were added, since the `useBuiltIns` option was not set.
[16:25:36] Compiling 'packages/babel-types/src/builders/generated/index.js'...
[16:25:36] Compiling 'packages/babel-types/src/constants/generated/index.js'...
[16:25:36] Compiling 'packages/babel-types/src/validators/generated/index.js'...
[16:25:36] Finished 'build-no-bundle' after 1.19 s
[16:25:36] Starting 'watch'...
[19:40:45] Starting 'build-no-bundle'...
[19:40:45] Compiling 'packages/babel-parser/src/parser/expression.js'...
[19:40:46] Finished 'build-no-bundle' after 683 ms
[19:40:46] Starting 'build-no-bundle'...
[19:40:46] Finished 'build-no-bundle' after 187 ms
[19:41:08] Starting 'build-no-bundle'...
[19:41:08] Compiling 'packages/babel-parser/src/parser/expression.js'...
[19:41:08] Finished 'build-no-bundle' after 474 ms
Now, when I run the tests again, I get the following output:
➜ babel-tanhauhau git:(master) ✗ TEST_ONLY=babel-parser TEST_GREP="curry function" make test-only
BABEL_ENV=test ./scripts/test.sh
FAIL packages/babel-parser/test/curry-function.js
● Console
console.error packages/babel-parser/lib/parser/expression.js:1515
TokenType {
label: '@',
keyword: undefined,
beforeExpr: false,
startsExpr: false,
rightAssociative: false,
isLoop: false,
isAssign: false,
prefix: false,
postfix: false,
binop: null,
updateContext: null
}
console.error packages/babel-parser/lib/parser/expression.js:1516
TokenType {
label: '@',
keyword: undefined,
beforeExpr: false,
startsExpr: false,
rightAssociative: false,
isLoop: false,
isAssign: false,
prefix: false,
postfix: false,
binop: null,
updateContext: null
}
● curry function syntax › should parse
SyntaxError: Unexpected token (1:9)
...
As you can see, both tokens are `@` tokens:
TokenType {
label: '@',
// ...
}
We can also make a standalone execution of the Babel parser
Here's what we are going to do next:
If there are 2 consecutive `@`, they should not be separate tokens. They should form a single `@@` token, the new token we are defining for our curry function.
Let's first look at where a token type is defined: packages/babel-parser/src/tokenizer/types.js.
Here you see a list of tokens, so let's add our new token definition in as well:
export const types: { [name: string]: TokenType } = {
  num: new TokenType("num", { startsExpr }),
  bigint: new TokenType("bigint", { startsExpr }),
  regexp: new TokenType("regexp", { startsExpr }),
  string: new TokenType("string", { startsExpr }),
  name: new TokenType("name", { startsExpr }),
  eof: new TokenType("eof"),
  ...
  at: new TokenType("@"),
  atat: new TokenType('@@'),
  hash: new TokenType("#", { startsExpr }),
  ...
};
By calling the constructor we are setting the `label` property of the token `atat` to `@@`.
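To make the role of the constructor concrete, here is a minimal sketch of a `TokenType`-like class. It is not Babel's source: the field names are taken from the console output shown earlier, and the defaulting logic is an assumption made for illustration only.

```javascript
// Simplified stand-in for Babel's TokenType (assumption: the real class has more fields and logic).
class TokenType {
  constructor(label, conf = {}) {
    this.label = label;
    this.keyword = conf.keyword;
    this.beforeExpr = !!conf.beforeExpr;
    this.startsExpr = !!conf.startsExpr;
    this.binop = conf.binop != null ? conf.binop : null;
  }
}

const startsExpr = true;
const types = {
  name: new TokenType("name", { startsExpr }),
  at: new TokenType("@"),
  atat: new TokenType("@@"),
};

console.log(types.atat.label); // @@
console.log(types.atat.startsExpr); // false
console.log(types.name.startsExpr); // true
```

A token created without options, like `atat`, gets all flags switched off, which matches the `startsExpr: false` seen in the logged `@` token.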
Next, let's find out where the token gets created during tokenization. A quick search for `tt.at` within `babel-parser/src/tokenizer` leads us to `packages/babel-parser/src/tokenizer/index.js`.
Here is the general structure of the code of the `getTokenFromCode` function inside the `babel-parser/src/tokenizer/index.js` file:
...
import * as charCodes from "charcodes";
import { types as tt, keywords as keywordTypes, type TokenType } from "./types";
...
export default class Tokenizer extends ParserErrors {
...
getTokenFromCode(code: number): void {
switch (code) {
// The interpretation of a dot depends on whether it is followed
// by a digit or another two dots.
case charCodes.dot:
this.readToken_dot();
return;
...
case charCodes.atSign:
++this.state.pos;
this.finishToken(tt.at);
return;
case charCodes.numberSign:
this.readToken_numberSign();
return;
...
default:
if (isIdentifierStart(code)) {
this.readWord();
return;
}
}
throw this.raise(
this.state.pos,
Errors.InvalidOrUnexpectedToken,
String.fromCodePoint(code),
);
}
The Babel parser uses charcodes constants to represent characters.
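As a quick illustration, these constants are just the numeric character codes that `String.prototype.charCodeAt` returns. The sketch below defines two of them locally instead of importing the `charcodes` package, so the `charCodes` object here is only a stand-in, and the input string is assumed to be the one used in the test case.

```javascript
// Local stand-in for two of the constants used in this section.
const charCodes = {
  atSign: 64, // "@"
  asterisk: 42, // "*"
};

const input = "function @@ foo() {}";
const pos = input.indexOf("@");

console.log(pos); // 9
console.log(input.charCodeAt(pos) === charCodes.atSign); // true
console.log(input.charCodeAt(pos + 1) === charCodes.atSign); // true
console.log(String.fromCharCode(charCodes.asterisk)); // *
```

Under that assumption, the offset 9 is consistent with the `(1:9)` position reported in the `Unexpected token` error.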
Well, token types are imported as `tt` throughout the babel-parser.
Let's create the token `tt.atat` instead of `tt.at` if there's another `@` after the current `@`:
case charCodes.atSign:
  // if the next character is a `@`
  if (this.input.charCodeAt(this.state.pos + 1) === charCodes.atSign) {
    // create `tt.atat` instead
    this.finishOp(tt.atat, 2);
  } else {
    this.finishOp(tt.at, 1);
  }
  return;
The function `finishOp` receives the token type and the size of the token. It slices the token text out of the input, advances the position by that size, and then calls `finishToken` with the type and the text as the token value:
finishOp(type: TokenType, size: number): void {
const str = this.input.slice(this.state.pos, this.state.pos + size);
this.state.pos += size;
this.finishToken(type, str);
}
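The one-character peek can be tried out in isolation. What follows is a toy tokenizer written for these notes, not Babel's: it only knows names, `@` and `@@`, it skips everything else, and its local `finishOp` merely imitates the idea of the method above.

```javascript
// Toy tokenizer: only knows names, "@" and "@@"; everything else is skipped.
function tokenize(input) {
  const AT = 64; // character code of "@"
  const tokens = [];
  let pos = 0;

  // Imitates the idea of finishOp: take `size` characters and emit one token.
  function finishOp(type, size) {
    tokens.push({ type, value: input.slice(pos, pos + size) });
    pos += size;
  }

  while (pos < input.length) {
    const code = input.charCodeAt(pos);
    if (code === AT) {
      // Peek one character ahead to decide between "@@" and "@".
      if (input.charCodeAt(pos + 1) === AT) {
        finishOp("atat", 2);
      } else {
        finishOp("at", 1);
      }
    } else if (/[A-Za-z_$]/.test(input[pos])) {
      // Read a whole word (this toy does not distinguish keywords).
      let end = pos;
      while (end < input.length && /[\w$]/.test(input[end])) end++;
      finishOp("name", end - pos);
    } else {
      pos++; // skip whitespace and punctuation in this sketch
    }
  }
  return tokens;
}

console.log(tokenize("function @@ foo").map((t) => t.type));
// [ 'name', 'atat', 'name' ]
console.log(tokenize("@ @@@").map((t) => t.value));
// [ '@', '@@', '@' ]
```

The second call shows the greedy behaviour of the peek: three consecutive `@` become one `@@` followed by one `@`.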
If you run the test again, you will see that the current token and the next token have changed:
➜ babel-tanhauhau git:(learning) ✗ TEST_ONLY=babel-parser TEST_GREP="curry function" make test-only
BABEL_ENV=test ./scripts/test.sh
FAIL packages/babel-parser/test/curry-function.js
● Console
console.error packages/babel-parser/lib/parser/expression.js:1517
TokenType {
label: '@@',
keyword: undefined,
beforeExpr: false,
startsExpr: false,
rightAssociative: false,
isLoop: false,
isAssign: false,
prefix: false,
postfix: false,
binop: null,
updateContext: null
}
console.error packages/babel-parser/lib/parser/expression.js:1518
TokenType {
label: 'name',
keyword: undefined,
beforeExpr: false,
startsExpr: true,
rightAssociative: false,
isLoop: false,
isAssign: false,
prefix: false,
postfix: false,
binop: null,
updateContext: [Function (anonymous)]
}
● curry function syntax › should parse
SyntaxError: Unexpected token (1:9)
Notice that:
- I have created the branch `learning` to keep track of the changes I am making in the code.
- The parser still fails, but now the token has the label `@@`.
Before we move on, let's inspect how generator functions are represented in AST:
➜ babel-learning git:(main) compast -jp 'function* foo() {}' | jq '.body[0]'
{
"type": "FunctionDeclaration",
"id": {
"type": "Identifier",
"name": "foo"
},
"expression": false,
"generator": true,
"async": false,
"params": [],
"body": {
"type": "BlockStatement",
"body": []
}
}
As you can see, a generator function is represented by the `generator: true` attribute of the `FunctionExpression` or of the `FunctionDeclaration`, whichever applies.
Similarly, we can add a `curry: true` attribute to the `FunctionDeclaration` too if it is a curry function:
➜ babel-learning git:(main) compast -jp 'function @@ foo() {}' | jq '.body[0]'
{
"type": "FunctionDeclaration",
"id": {
"type": "Identifier",
"name": "foo"
},
"expression": false,
"generator": false,
"curry": true,
"async": false,
"params": [],
"body": {
"type": "BlockStatement",
"body": []
}
}
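Once the flag is in the tree, any later stage can test for it like any other property. The sketch below uses a hand-written AST fragment shaped like the output above; the helper `curriedFunctionNames` is a hypothetical name introduced only for this example.

```javascript
// AST fragment shaped like the compast output above (hand-written for this sketch).
const program = {
  type: "Program",
  body: [
    {
      type: "FunctionDeclaration",
      id: { type: "Identifier", name: "foo" },
      generator: false,
      curry: true,
      async: false,
      params: [],
      body: { type: "BlockStatement", body: [] },
    },
    {
      type: "FunctionDeclaration",
      id: { type: "Identifier", name: "gen" },
      generator: true,
      curry: false,
      async: false,
      params: [],
      body: { type: "BlockStatement", body: [] },
    },
  ],
};

// Hypothetical helper: names of the top-level declarations flagged with `curry`.
function curriedFunctionNames(ast) {
  return ast.body
    .filter((node) => node.type === "FunctionDeclaration" && node.curry === true)
    .map((node) => node.id.name);
}

console.log(curriedFunctionNames(program)); // [ 'foo' ]
```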
We have a plan now, let's implement it!
A quick search on "`FunctionDeclaration`" leads us to a function called `parseFunction` in packages/babel-parser/src/parser/statement.js, and here we find a line that sets the `generator` attribute. Let's add one more line:
packages/babel-parser/src/parser/statement.js
export default class StatementParser extends ExpressionParser {
// ...
parseFunction<T: N.NormalFunction>(
node: T,
statement?: number = FUNC_NO_FLAGS,
isAsync?: boolean = false
): T {
// ...
node.generator = this.eat(tt.star);
node.curry = this.eat(tt.atat);
}
}
If you run the test again, you will be amazed that it passed!
➜ babel-tanhauhau git:(learning) ✗ npx jest -u packages/babel-parser/test/curry-function.js
PASS packages/babel-parser/test/curry-function.js
curry function syntax
✓ should parse (6ms)
Test Suites: 1 passed, 1 total
Tests: 1 passed, 1 total
Snapshots: 1 passed, 1 total
Time: 0.562s, estimated 1s
Ran all test suites matching /packages\/babel-parser\/test\/curry-function.js/i.
That's it? How did we miraculously fix it?
I am going to briefly explain how parsing works, and in the process, hopefully, you will understand what that one-liner change did.
Although the current version of the Babel parser seems to have been rewritten in TypeScript, the version of Babel.js we are using was developed using Flow, a static type checker for JavaScript. The `flow-bin` package is a binary wrapper for `flow` that makes it easy to use the Flow static type checker from the command line. Although the JS tests pass, if we check the file `src/index.js` with Flow we get an error in the assignment `node.curry = this.eat(tt.atat)` at line 1055 of the file `src/parser/statement.js`, complaining that the property `curry` is missing in the type `NodeBase` or in one of the other object types:
babel-parser git:(learning) ✗ npx flow check src/index.js
Error ┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈ src/parser/statement.js:1055:5
Cannot assign this.eat(...) to node.curry because:
• Either property curry is missing in NodeBase [1].
• Or property curry is missing in object type [2].
• Or property curry is missing in object type [3].
• Or property curry is missing in object type [4].
• Or property curry is missing in object type [5].
• Or property curry is missing in NodeBase [6].
• Or property curry is missing in object type [7].
• Or property curry is missing in object type [8].
• Or property curry is missing in object type [9].
src/parser/statement.js
1052│ this.raise(this.state.start, Errors.GeneratorInSingleStatementContext);
1053│ }
1054│ node.generator = this.eat(tt.star);
1055│ node.curry = this.eat(tt.atat);
1056│
1057│ if (isStatement) {
1058│ node.id = this.parseFunctionId(requireId);
src/types.js
[6][7] 60│ export type DeclarationBase = NodeBase & {
61│ // TypeScript allows declarations to be prefixed by `declare`.
62│ //TODO: a FunctionDeclaration is never "declare", because it's a TSDeclareFunction instead.
63│ declare?: true,
64│ };
65│
66│ // TODO: Not in spec
[1][2] 67│ export type HasDecorators = NodeBase & {
68│ decorators?: $ReadOnlyArray<Decorator>,
69│ };
:
[3] 161│ export type BodilessFunctionOrMethodBase = HasDecorators & {
162│ // TODO: Remove this. Should not assign "id" to methods.
163│ // https://github.com/babel/babylon/issues/535
164│ id: ?Identifier,
165│
166│ params: $ReadOnlyArray<Pattern | TSParameterProperty>,
167│ body: BlockStatement,
168│ generator: boolean,
169│ async: boolean,
170│
171│ // TODO: All not in spec
172│ expression: boolean,
173│ typeParameters?: ?TypeParameterDeclarationBase,
174│ returnType?: ?TypeAnnotationBase,
175│ };
176│
[4] 177│ export type BodilessFunctionBase = BodilessFunctionOrMethodBase & {
178│ id: ?Identifier,
179│ };
180│
[5] 181│ export type FunctionBase = BodilessFunctionBase & {
182│ body: BlockStatement,
183│ };
:
[8] 322│ DeclarationBase & {
323│ type: "FunctionDeclaration",
324│ };
325│
[9] 326│ export type FunctionDeclaration = OptFunctionDeclaration & {
327│ id: Identifier,
328│ };
If we follow the instructions and modify the `src/types.js` file to include the property `curry` in the type `BodilessFunctionOrMethodBase`:
➜ babel-parser git:(learning) ✗ git -P diff src/types.js
diff --git a/packages/babel-parser/src/types.js b/packages/babel-parser/src/types.js
index 17f96dc49..802986c4a 100644
--- a/packages/babel-parser/src/types.js
+++ b/packages/babel-parser/src/types.js
@@ -167,6 +167,7 @@ export type BodilessFunctionOrMethodBase = HasDecorators & {
body: BlockStatement,
generator: boolean,
async: boolean,
+ curry: boolean, // TODO: Not in spec
// TODO: All not in spec
expression: boolean,
the error disappears.
➜ babel-parser git:(learning) ✗ npx flow check src/index.js
Found 0 errors
You can also run make flow
:
➜ babel-tanhauhau git:(learning) ✗ make flow
yarn --silent flow check --strip-root
Found 0 errors
See the issue [Discussion] Remove flow support from @babel/parser #16264
With the list of tokens from the tokenizer, the parser consumes the tokens one by one and constructs the AST. The parser uses the language grammar specification to decide how to use the tokens and which token to expect next.
The grammar specification looks something like this:
...
ExponentiationExpression ->
  UnaryExpression
  UpdateExpression ** ExponentiationExpression
MultiplicativeExpression ->
  ExponentiationExpression
  MultiplicativeExpression ("*" or "/" or "%") ExponentiationExpression
AdditiveExpression ->
  MultiplicativeExpression
  AdditiveExpression + MultiplicativeExpression
  AdditiveExpression - MultiplicativeExpression
...
It explains the precedence of each expression/statement. For example, an `AdditiveExpression` is made up of either:
- a `MultiplicativeExpression`, or
- an `AdditiveExpression` followed by a `+` operator token followed by a `MultiplicativeExpression`, or
- an `AdditiveExpression` followed by a `-` operator token followed by a `MultiplicativeExpression`.
With these rules, we translate them into parser code:
class Parser {
  // ...
  parseAdditiveExpression() {
    const left = this.parseMultiplicativeExpression();
    // if the current token is `+` or `-`
    if (this.match(tt.plus) || this.match(tt.minus)) {
      const operator = this.state.type;
      // move on to the next token
      this.nextToken();
      const right = this.parseMultiplicativeExpression();
      // create the node and return it
      return this.finishNode(
        {
          operator,
          left,
          right,
        },
        'BinaryExpression'
      );
    } else {
      // return as MultiplicativeExpression
      return left;
    }
  }
}
This is made-up code that oversimplifies what babel has, but I hope you get the gist of it.
As you can see here, the parser is recursive in nature, and it goes from the lowest precedence to the highest precedence expressions/statements. E.g. `parseAdditiveExpression` calls `parseMultiplicativeExpression`, which in turn calls `parseExponentiationExpression`, which in turn calls ... . This recursive process is called Recursive Descent Parsing.
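To see the call chain at work outside of Babel, here is a small self-contained recursive descent parser for numbers and the four arithmetic operators. It is a sketch written for these notes: the token format is invented, and the left-recursive grammar rules are implemented with loops, which is a common way to handle them but not necessarily what Babel does.

```javascript
// Toy recursive descent parser for numbers and + - * / (illustration only).
function parse(source) {
  const tokens = source.match(/\d+|[-+*/]/g) || [];
  let pos = 0;

  // Highest precedence level in this toy grammar: a number.
  function parsePrimary() {
    return { type: "NumericLiteral", value: Number(tokens[pos++]) };
  }

  // MultiplicativeExpression -> Primary (("*" | "/") Primary)*
  function parseMultiplicativeExpression() {
    let left = parsePrimary();
    while (tokens[pos] === "*" || tokens[pos] === "/") {
      const operator = tokens[pos++];
      const right = parsePrimary();
      left = { type: "BinaryExpression", operator, left, right };
    }
    return left;
  }

  // AdditiveExpression -> Multiplicative (("+" | "-") Multiplicative)*
  function parseAdditiveExpression() {
    let left = parseMultiplicativeExpression();
    while (tokens[pos] === "+" || tokens[pos] === "-") {
      const operator = tokens[pos++];
      const right = parseMultiplicativeExpression();
      left = { type: "BinaryExpression", operator, left, right };
    }
    return left;
  }

  // Start from the lowest precedence level.
  return parseAdditiveExpression();
}

const ast = parse("1 + 2 * 3");
console.log(ast.operator); // +
console.log(ast.right.operator); // *
```

Because the additive level delegates to the multiplicative level for its operands, `2 * 3` ends up as the right child of the `+` node, which is exactly the precedence the grammar describes.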
If you have noticed, in my examples above, I used some utility functions, such as `this.eat`, `this.match`, `this.next`, etc. These are babel parser's internal functions, yet they are quite ubiquitous amongst parsers as well:
- `this.match` returns a boolean indicating whether the current token matches the condition
- `this.next` moves the token list forward to point to the next token
- `this.eat` returns what `this.match` returns and, if `this.match` returns true, calls `this.next`
  - `this.eat` is commonly used for optional operators, like `*` in generator functions, `;` at the end of statements, and `?` in TypeScript types
- `this.lookahead` gets the next token without moving forward, to make a decision on the current node
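The four helpers are small enough to sketch over a plain array of tokens. The class `TokenCursor` and the function `parseFunctionHeader` below are hypothetical names made up for this example; they imitate the behaviour described in the list and are not Babel's implementation.

```javascript
// Minimal token cursor offering the four helpers described above.
class TokenCursor {
  constructor(tokens) {
    this.tokens = tokens;
    this.pos = 0;
  }
  get current() {
    return this.tokens[this.pos];
  }
  // Does the current token have the given type?
  match(type) {
    return this.current !== undefined && this.current.type === type;
  }
  // Move forward to the next token.
  next() {
    this.pos++;
  }
  // Consume the current token only if it has the given type.
  eat(type) {
    if (this.match(type)) {
      this.next();
      return true;
    }
    return false;
  }
  // Peek at the following token without moving.
  lookahead() {
    return this.tokens[this.pos + 1];
  }
}

// Hypothetical header parser for: function [*] [@@] name
function parseFunctionHeader(cursor) {
  if (!cursor.eat("function")) throw new SyntaxError("expected `function`");
  const node = { type: "FunctionDeclaration" };
  node.generator = cursor.eat("star"); // optional `*`
  node.curry = cursor.eat("atat"); // optional `@@`
  if (!cursor.match("name")) throw new SyntaxError("Unexpected token");
  node.id = cursor.current.value;
  cursor.next();
  return node;
}

const headerTokens = [
  { type: "function" },
  { type: "atat" },
  { type: "name", value: "foo" },
];
const header = parseFunctionHeader(new TokenCursor(headerTokens));
console.log(header.generator); // false
console.log(header.curry); // true
console.log(header.id); // foo
```

In this sketch, removing the `eat("atat")` line makes the same input fail with `Unexpected token`, because the `@@` token is still the current one when a name is expected.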
If you take another look at the parser code we just changed, it is easier to read now.
packages/babel-parser/src/parser/statement.js
export default class StatementParser extends ExpressionParser {
parseStatementContent(/* ...*/) {
// ...
// NOTE: we call match to check the current token
if (this.match(tt._function)) {
this.next();
// NOTE: function statement has a higher precedence than a generic statement
this.parseFunction();
}
}
// ...
parseFunction(/* ... */) {
// NOTE: we call eat to check whether the optional token exists
node.generator = this.eat(tt.star);
node.curry = this.eat(tt.atat);
node.id = this.parseFunctionId();
}
}
Side Note: You might be curious how I am able to visualize the custom syntax in the Babel AST Explorer, where I showed you the new "`curry`" attribute in the AST.
That's because I've added a new feature in the Babel AST Explorer (not the AST Explorer, but the one of Tan Li Hau) where you can upload your custom parser!
➜ babel-tanhauhau git:(learning) ✗ ls packages/babel-parser/lib
index.js options.js parser plugin-utils.js plugins tokenizer types.js util
If you go to `packages/babel-parser/lib`, you will find the compiled version of your parser and the source map. Open the drawer of the Babel AST Explorer and you will see a button to upload a custom parser. Drag `packages/babel-parser/lib/index.js` in and you will be visualizing the AST generated via your custom parser!
With our custom babel parser done, let's move on to write our babel plugin.
But before that, you may have some doubts about how we are going to use our custom babel parser, especially with whatever build stack we are using right now.
Well, fret not. A babel plugin can provide a custom parser, which is documented on the babel website:
babel-plugin-transformation-curry-function.js
import customParser from './custom-parser';
export default function ourBabelPlugin() {
return {
parserOverride(code, opts) {
return customParser.parse(code, opts);
},
};
}
Since we forked out the babel parser, all existing babel parser options or built-in plugins will still work perfectly.
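The shape of the hook can be exercised without installing anything. In the sketch below `customParser` is a stub that stands in for the forked parser, and the hook is called by hand; in a real build it is Babel that calls `parserOverride` while parsing each file.

```javascript
// Stub standing in for the forked parser (assumption: the real one lives in packages/babel-parser/lib).
const customParser = {
  parse(code, opts) {
    return { type: "File", parsedBy: "custom-parser", sourceType: opts.sourceType, code };
  },
};

function ourBabelPlugin() {
  return {
    // Delegate parsing to the custom parser instead of the default one.
    parserOverride(code, opts) {
      return customParser.parse(code, opts);
    },
  };
}

// Call the hook by hand, roughly the way Babel would during parsing.
const plugin = ourBabelPlugin();
const result = plugin.parserOverride("function @@ foo() {}", { sourceType: "module" });
console.log(result.parsedBy); // custom-parser
console.log(result.sourceType); // module
```

The options object is passed through untouched, which is why existing parser options keep working with the fork.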
With this doubt out of the way, let's see how we can make our curry function curryable?[3] (not entirely sure there's such a word)
Before we start, if you have eagerly tried to add our plugin into your build system, you would notice that the curry function gets compiled to a normal function.
This is because, after parsing + transformation, babel will use `@babel/generator` to generate code from the transformed AST. Since `@babel/generator` has no idea about the new `curry` attribute we added, it will be omitted.
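The effect is easy to reproduce with a toy printer. The function `generateFunction` below is a hypothetical, heavily simplified stand-in for a code generator, with an invented node shape; it only illustrates why an attribute that the printer never reads cannot appear in the output.

```javascript
// Toy code generator: prints only the fields it knows about.
function generateFunction(node) {
  const asyncPrefix = node.async ? "async " : "";
  const star = node.generator ? "*" : "";
  const params = node.params.join(", ");
  // Nothing here reads `node.curry`, so the flag never reaches the output.
  return `${asyncPrefix}function${star} ${node.id}(${params}) {}`;
}

const curried = {
  id: "foo",
  params: ["a", "b", "c"],
  generator: false,
  async: false,
  curry: true,
};

console.log(generateFunction(curried)); // function foo(a, b, c) {}
```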
Ok, to make our function curryable, we can wrap it with a currying helper higher-order function[4]:
File `src/tan-liu-article/currying/index.js` in the repo `ULL-ESIT-PL/babel-learning`:
function currying(fn) {
  const numParamsRequired = fn.length;
  function curryFactory(params) {
    return function (...args) {
      const newParams = params.concat(args);
      if (newParams.length >= numParamsRequired) {
        return fn(...newParams);
      }
      return curryFactory(newParams);
    }
  }
  return curryFactory([]);
}
If you want to learn how to write a currying function[5], you can read this: Currying in JS by Shirsh Zibbu.
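Here is a short usage sketch of the helper. The `currying` function is repeated so that the snippet runs on its own; the `sum` example is an addition of these notes to illustrate the fixed-arity limitation, which comes from `fn.length` not counting rest parameters.

```javascript
// Same helper as above, repeated so that this snippet is self-contained.
function currying(fn) {
  const numParamsRequired = fn.length;
  function curryFactory(params) {
    return function (...args) {
      const newParams = params.concat(args);
      if (newParams.length >= numParamsRequired) {
        return fn(...newParams);
      }
      return curryFactory(newParams);
    };
  }
  return curryFactory([]);
}

const foo = currying(function foo(a, b, c) {
  return a + b + c;
});

console.log(foo(1, 2, 3)); // 6
console.log(foo(1, 2)(3)); // 6
console.log(foo(1)(2)(3)); // 6

// `length` is 0 for a function that only has a rest parameter,
// so the wrapped function is called on the very first invocation.
const sum = currying(function (...nums) {
  return nums.reduce((acc, n) => acc + n, 0);
});
console.log(sum(1, 2)); // 3
```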
So when we transform our curry function, we can transform it into the following:
// from
function @@ foo(a, b, c) {
  return a + b + c;
}
// to
const foo = currying(function foo(a, b, c) {
  return a + b + c;
})
Let's first ignore function hoisting in JavaScript, where you can call `foo` before it is defined.
If you have read my step-by-step guide on babel transformation, writing this transformation should be manageable:
packages/babel-plugin-transform-curry-function/src/index.js
// `t` is the @babel/types API, which @babel/core re-exports as `types`
import { types as t } from '@babel/core';
export default function ourBabelPlugin() {
  return {
    // ...
    visitor: {
      FunctionDeclaration(path) {
        if (path.get('curry').node) {
          // const foo = currying(function () { ... });
          path.node.curry = false;
          path.replaceWith(
            t.variableDeclaration('const', [
              t.variableDeclarator(
                t.identifier(path.get('id.name').node),
                t.callExpression(t.identifier('currying'), [
                  t.toExpression(path.node),
                ])
              ),
            ])
          );
        }
      },
    },
  };
}
The question is how do we provide the `currying` function?
I believe the question Tan is posing here is how to provide the `currying` function so that it is available to the transformed code when it runs. That is, how to introduce it in the "Babel run time support".
There are 2 ways:
- The first approach, including how to test the plugin with it, is explained in section /doc/tan-liu-article/plugin-first-approach.md
- See section /doc/tan-liu-article/plugin-second-approach.md for the implementation of the second approach.
Add a package.json
to the plugin.
See https://github.com/ULL-ESIT-PL/babel-tanhauhau/blob/learning/packages/babel-plugin-transform-curry-function/package.json
The steps we've gone through above are similar to part of the TC39 proposal process when defining a new JavaScript specification. When proposing a new specification, the champion[6] of the proposal usually writes polyfills or forks babel to write proof-of-concept demos. As you've seen, forking a parser or writing polyfills is not the hardest part of the process. The hard part is to define the problem space, plan and think through the use cases and edge cases, and gather opinions and suggestions from the community. To this end, I am grateful to the proposal champions for their effort in pushing the JavaScript language forward.
Finally, if you want to see the code we've written so far as a whole, you can check it out on GitHub.
See also branch curry-function in Tan Li Babel repo.
- Crafting Interpreters by Robert Nystrom
- Free Udacity course: "Compilers: Theory and Practice", offered by Georgia Tech
- Leveling Up One’s Parsing Game With ASTs by Vaidehi Joshi
I would like to thank Tan Li Hau for his awesome work on Babel, his wonderful articles and videos, and for sharing his knowledge with the community.
Footnotes
1. I am using the `jq '.program.body[0]'` command to select only the `FunctionDeclaration` and pretty print the JSON. ↩
2. I am using the `compast` command from the https://www.npmjs.com/package/compact-js-ast package to convert the AST to `yml` format. ↩
3. ChatGPT says yes! and gives this (perfect) definition: "A function is "curryable" if it can be transformed into a curried version. In other words, it means that the function can be restructured such that it can be invoked with fewer arguments than it expects and returns another function that takes the remaining arguments." ↩
4. It only works for functions with a fixed number of arguments. ↩
5. See example /src/manipulating-ast-with-js/curry/variadic-curry.js ↩
6. The person who is responsible for the ES proposal. Either the champion or a co-champion must be a member of TC39. See https://www.proposals.es/stages/stage1 ↩