BuiltinFunction: Use a generated DAFSA instead of a generated enum
A DAFSA (deterministic acyclic finite state automaton) is essentially a trie flattened into an array, with redundant nodes merged so that shared suffixes are stored only once. This provides fast lookups while minimizing the required data size, but normally does not allow associating data with each word. However, by storing in each node a count of the number of words reachable from that node, it becomes possible to also use the DAFSA for minimal perfect hashing of the set of words (which allows going from word -> unique index as well as unique index -> word). This allows us to store a second array of data, indexed by that unique index, so that the DAFSA can be used as a lookup for e.g. BuiltinFunction.
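
To make the word-count idea concrete, here is a minimal sketch over a hand-built four-word DAFSA ("tap", "taps", "top", "tops"). It is not the generated code from this commit: the `Node` layout and the names `dafsa`, `payload`, `uniqueIndex` and `wordFromIndex` are assumptions made purely for illustration.

```zig
const std = @import("std");

// Hypothetical node layout, assumed for this sketch only.
const Node = struct {
    char: u8,
    number: u8, // how many words end at or below this node
    end_of_word: bool,
    end_of_list: bool, // last sibling in its list
    child_index: u8, // index of first child, 0 = no children
};

// "tap", "taps", "top", "tops": 'a' and 'o' share the same "p" -> "s" tail.
const dafsa = [_]Node{
    .{ .char = 0, .number = 0, .end_of_word = false, .end_of_list = true, .child_index = 1 }, // root
    .{ .char = 't', .number = 4, .end_of_word = false, .end_of_list = true, .child_index = 2 },
    .{ .char = 'a', .number = 2, .end_of_word = false, .end_of_list = false, .child_index = 4 },
    .{ .char = 'o', .number = 2, .end_of_word = false, .end_of_list = true, .child_index = 4 },
    .{ .char = 'p', .number = 2, .end_of_word = true, .end_of_list = true, .child_index = 5 },
    .{ .char = 's', .number = 1, .end_of_word = true, .end_of_list = true, .child_index = 0 },
};

// The "second array of data": entry i belongs to the word with unique index i + 1.
const payload = [_][]const u8{ "data for tap", "data for taps", "data for top", "data for tops" };
const max_word_len = 4;

/// word -> unique index (1-based), or null if the word is not in the set.
fn uniqueIndex(word: []const u8) ?usize {
    var index: usize = 0;
    var list_start: usize = dafsa[0].child_index;
    var ends_on_word = false;
    for (word) |c| {
        if (list_start == 0) return null; // input continues past a leaf
        var i = list_start;
        while (dafsa[i].char != c) : (i += 1) {
            if (dafsa[i].end_of_list) return null;
            index += dafsa[i].number; // skip every word under this earlier sibling
        }
        ends_on_word = dafsa[i].end_of_word;
        if (ends_on_word) index += 1; // count the word that ends on this node
        list_start = dafsa[i].child_index;
    }
    return if (ends_on_word) index else null;
}

/// unique index (1-based) -> word, written into `buf`.
fn wordFromIndex(unique_index: usize, buf: *[max_word_len]u8) ?[]const u8 {
    if (unique_index == 0 or unique_index > payload.len) return null;
    var remaining = unique_index;
    var i: usize = dafsa[0].child_index;
    var len: usize = 0;
    while (true) {
        // Skip siblings whose subtrees hold fewer words than we still need.
        while (remaining > dafsa[i].number) : (i += 1) {
            remaining -= dafsa[i].number;
        }
        buf[len] = dafsa[i].char;
        len += 1;
        if (dafsa[i].end_of_word) remaining -= 1;
        if (remaining == 0) return buf[0..len];
        i = dafsa[i].child_index;
    }
}

pub fn main() void {
    var buf: [max_word_len]u8 = undefined;
    const idx = uniqueIndex("top").?;
    std.debug.print("top -> {d} -> {s} ({s})\n", .{ idx, wordFromIndex(idx, &buf).?, payload[idx - 1] });
}
```

In this sketch the unique index of a word is its 1-based position in sorted order, so "top" maps to 3 and `payload[2]` holds its data. The forward and reverse walks use the same `number` fields, so no second structure is needed for the reverse direction.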

Some resources:
- https://en.wikipedia.org/wiki/Deterministic_acyclic_finite_state_automaton
- https://web.archive.org/web/20220722224703/http://pages.pathcom.com/~vadco/dawg.html
- https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.56.5272 (this is the origin of the minimal perfect hashing technique)
- http://stevehanov.ca/blog/?id=115

What all is going on here:
- `process_builtins.py` was removed in favor of a Zig implementation that generates the DAFSA
- The new Zig implementation uses the latest .def files from the LLVM repository, so the set of builtins is different and there are some new macro definitions that are not yet supported (e.g. TARGET_BUILTIN); otherwise the logic matches the Python script wherever possible
- The new Zig implementation imports `src/builtins/TypeDescription.zig` directly, so the calculated max parameter count will always be in sync

The results:

Microbenchmark of lookup time:

    # Does a builtin name exist?:
    
    ------------------- dafsa ------------------
                always found: 88ns per lookup
    not found (random bytes): 33ns per lookup
     not found (1 char diff): 86ns per lookup
    ------------------- enum -------------------
                always found: 3648ns per lookup
    not found (random bytes): 6310ns per lookup
     not found (1 char diff): 3604ns per lookup
    
    dafsa is 41.5x / 191.2x / 41.9x faster
    
    
    # Name -> BuiltinFunction lookup:
    
    ------------------- dafsa ------------------
                always found: 154ns per lookup
    not found (random bytes): 32ns per lookup
     not found (1 char diff): 154ns per lookup
    ------------------- enum -------------------
                always found: 3649ns per lookup
    not found (random bytes): 6317ns per lookup
     not found (1 char diff): 3649ns per lookup
    
    dafsa is 23.7x / 197.4x / 23.7x faster

About 500KiB removed from a ReleaseSmall build of aro:

    1.8M arocc-master
    1.3M arocc-dafsa
    
    $ bloaty arocc-dafsa -- arocc-master
        FILE SIZE        VM SIZE    
     --------------  -------------- 
      +8.1% +22.9Ki  +8.1% +22.9Ki    .data.rel.ro
      -0.1%      -8  -0.1%      -8    .eh_frame_hdr
      -1.4% -2.41Ki  -1.4% -2.41Ki    .data
     -35.5% -49.4Ki -35.5% -49.4Ki    .eh_frame
     -56.8%  -201Ki -56.8%  -201Ki    .rodata
     -30.4%  -258Ki -30.4%  -258Ki    .text
     -26.9%  -488Ki -26.9%  -488Ki    TOTAL

Compile time of `zig build` is reduced by a little over a second (about 11.8s down to about 10.5s):

    $ hyperfine "git checkout master; zig build" "git checkout dafsa; zig build" --prepare "rm -rf zig-cache" --runs 3 --warmup 1
    Benchmark #1: git checkout master; zig build
      Time (mean ± σ):     11.760 s ±  0.024 s    [User: 11.387 s, System: 0.605 s]
      Range (min … max):   11.745 s … 11.788 s    3 runs

    Benchmark #2: git checkout dafsa; zig build
      Time (mean ± σ):     10.523 s ±  0.016 s    [User: 10.121 s, System: 0.608 s]
      Range (min … max):   10.505 s … 10.534 s    3 runs

    Summary
      'git checkout dafsa; zig build' ran
        1.12 ± 0.00 times faster than 'git checkout master; zig build'

Unit tests run faster:

    Benchmark #1: ./unit-tests-master
      Time (mean ± σ):     224.9 ms ±   2.6 ms    [User: 210.2 ms, System: 14.5 ms]
      Range (min … max):   220.7 ms … 230.1 ms    13 runs

    Benchmark #2: ./unit-tests-dafsa
      Time (mean ± σ):      97.4 ms ±   2.6 ms    [User: 83.1 ms, System: 14.2 ms]
      Range (min … max):    94.2 ms … 103.5 ms    30 runs

    Summary
      './unit-tests-dafsa' ran
        2.31 ± 0.07 times faster than './unit-tests-master'

Record tests run about 6% slower (not sure why):

    Benchmark #1: ./record-runner-master test/records
      Time (mean ± σ):     10.283 s ±  0.037 s    [User: 118.281 s, System: 0.261 s]
      Range (min … max):   10.262 s … 10.327 s    3 runs

    Benchmark #2: ./record-runner-dafsa test/records
      Time (mean ± σ):     10.950 s ±  0.209 s    [User: 123.950 s, System: 0.284 s]
      Range (min … max):   10.796 s … 11.187 s    3 runs

    Summary
      './record-runner-master test/records' ran
        1.06 ± 0.02 times faster than './record-runner-dafsa test/records'

Integration test times are ~unchanged:

    Benchmark #1: ./test-runner-master /home/ryan/Programming/zig/arocc/test/cases zig
      Time (mean ± σ):     382.8 ms ±   7.2 ms    [User: 300.9 ms, System: 88.0 ms]
      Range (min … max):   373.2 ms … 393.2 ms    10 runs

    Benchmark #2: ./test-runner-dafsa /home/ryan/Programming/zig/arocc/test/cases zig
      Time (mean ± σ):     392.9 ms ±   3.5 ms    [User: 308.5 ms, System: 90.8 ms]
      Range (min … max):   388.9 ms … 399.7 ms    10 runs

    Summary
      './test-runner-master /home/ryan/Programming/zig/arocc/test/cases zig' ran
        1.03 ± 0.02 times faster than './test-runner-dafsa /home/ryan/Programming/zig/arocc/test/cases zig'
squeek502 committed Oct 14, 2023
1 parent 77e2e6b commit b226d56
Showing 9 changed files with 13,970 additions and 36,273 deletions.
945 changes: 945 additions & 0 deletions scripts/generate_builtins_dafsa.zig

Large diffs are not rendered by default.

499 changes: 0 additions & 499 deletions scripts/process_builtins.py

This file was deleted.

23 changes: 10 additions & 13 deletions src/Builtins.zig
@@ -272,8 +272,7 @@ fn createBuiltin(comp: *const Compilation, builtin: BuiltinFunction, type_arena:

 /// Asserts that the builtin has already been created
 pub fn lookup(b: *const Builtins, name: []const u8) Expanded {
-@setEvalBranchQuota(10_000);
-const builtin = BuiltinFunction.fromTag(std.meta.stringToEnum(BuiltinFunction.Tag, name).?);
+const builtin = BuiltinFunction.fromName(name).?;
 const ty = b._name_to_type_map.get(name).?;
 return .{
 .builtin = builtin,
@@ -283,9 +282,7 @@ pub fn lookup(b: *const Builtins, name: []const u8) Expanded {

 pub fn getOrCreate(b: *Builtins, comp: *Compilation, name: []const u8, type_arena: std.mem.Allocator) !?Expanded {
 const ty = b._name_to_type_map.get(name) orelse {
-@setEvalBranchQuota(10_000);
-const tag = std.meta.stringToEnum(BuiltinFunction.Tag, name) orelse return null;
-const builtin = BuiltinFunction.fromTag(tag);
+const builtin = BuiltinFunction.fromName(name) orelse return null;
 if (!comp.hasBuiltinFunction(builtin)) return null;

 try b._name_to_type_map.ensureUnusedCapacity(comp.gpa, 1);
@@ -297,7 +294,7 @@ pub fn getOrCreate(b: *Builtins, comp: *Compilation, name: []const u8, type_aren
 .ty = ty,
 };
 };
-const builtin = BuiltinFunction.fromTag(std.meta.stringToEnum(BuiltinFunction.Tag, name).?);
+const builtin = BuiltinFunction.fromName(name).?;
 return .{
 .builtin = builtin,
 .ty = ty,
@@ -313,9 +310,9 @@ test "All builtins" {

 const type_arena = arena.allocator();

-for (0..@typeInfo(BuiltinFunction.Tag).Enum.fields.len) |i| {
-const tag: BuiltinFunction.Tag = @enumFromInt(i);
-const name = @tagName(tag);
+var builtin_it = BuiltinFunction.BuiltinsIterator{};
+while (builtin_it.next()) |entry| {
+const name = try type_arena.dupe(u8, entry.name);
 if (try comp.builtins.getOrCreate(&comp, name, type_arena)) |func_ty| {
 const get_again = (try comp.builtins.getOrCreate(&comp, name, std.testing.failing_allocator)).?;
 const found_by_lookup = comp.builtins.lookup(name);
@@ -337,10 +334,10 @@ test "Allocation failures" {
 const type_arena = arena.allocator();

 const num_builtins = 40;
-for (0..num_builtins) |i| {
-const tag: BuiltinFunction.Tag = @enumFromInt(i);
-const name = @tagName(tag);
-_ = try comp.builtins.getOrCreate(&comp, name, type_arena);
+var builtin_it = BuiltinFunction.BuiltinsIterator{};
+for (0..num_builtins) |_| {
+const entry = builtin_it.next().?;
+_ = try comp.builtins.getOrCreate(&comp, entry.name, type_arena);
 }
 }
 };
2 changes: 1 addition & 1 deletion src/CodeGen.zig
@@ -1162,7 +1162,7 @@ fn genBoolExpr(c: *CodeGen, base: NodeIndex, true_label: Ir.Ref, false_label: Ir
 fn genBuiltinCall(c: *CodeGen, builtin: BuiltinFunction, arg_nodes: []const NodeIndex, ty: Type) Error!Ir.Ref {
 _ = arg_nodes;
 _ = ty;
-return c.comp.diag.fatalNoSrc("TODO CodeGen.genBuiltinCall {s}\n", .{@tagName(builtin.tag)});
+return c.comp.diag.fatalNoSrc("TODO CodeGen.genBuiltinCall {s}\n", .{BuiltinFunction.nameFromTag(builtin.tag).span()});
 }

 fn genCall(c: *CodeGen, fn_node: NodeIndex, arg_nodes: []const NodeIndex, ty: Type) Error!Ir.Ref {
4 changes: 1 addition & 3 deletions src/Compilation.zig
@@ -1400,9 +1400,7 @@ pub fn hasBuiltin(comp: *const Compilation, name: []const u8) bool {
 std.mem.eql(u8, name, "__builtin_offsetof") or
 std.mem.eql(u8, name, "__builtin_types_compatible_p")) return true;

-@setEvalBranchQuota(10_000);
-const tag = std.meta.stringToEnum(BuiltinFunction.Tag, name) orelse return false;
-const builtin = BuiltinFunction.fromTag(tag);
+const builtin = BuiltinFunction.fromName(name) orelse return false;
 return comp.hasBuiltinFunction(builtin);
 }

2 changes: 1 addition & 1 deletion src/Diagnostics.zig
@@ -2677,7 +2677,7 @@ pub fn renderMessage(comp: *Compilation, m: anytype, msg: Message) void {
 }),
 .builtin_with_header => m.print(info.msg, .{
 @tagName(msg.extra.builtin_with_header.header),
-@tagName(msg.extra.builtin_with_header.builtin),
+BuiltinFunction.nameFromTag(msg.extra.builtin_with_header.builtin).span(),
 }),
 else => @compileError("invalid extra kind " ++ @tagName(info.extra)),
 }
23 changes: 16 additions & 7 deletions src/Parser.zig
@@ -4622,7 +4622,10 @@ const CallExpr = union(enum) {
 return switch (self) {
 .standard => true,
 .builtin => |builtin| switch (builtin.tag) {
-.__builtin_va_start, .__va_start, .va_start => arg_idx != 1,
+BuiltinFunction.tagFromName("__builtin_va_start").?,
+BuiltinFunction.tagFromName("__va_start").?,
+BuiltinFunction.tagFromName("va_start").?,
+=> arg_idx != 1,
 else => true,
 },
 };
@@ -4632,8 +4635,11 @@ const CallExpr = union(enum) {
 return switch (self) {
 .standard => true,
 .builtin => |builtin| switch (builtin.tag) {
-.__builtin_va_start, .__va_start, .va_start => arg_idx != 1,
-.__builtin_complex => false,
+BuiltinFunction.tagFromName("__builtin_va_start").?,
+BuiltinFunction.tagFromName("__va_start").?,
+BuiltinFunction.tagFromName("va_start").?,
+=> arg_idx != 1,
+BuiltinFunction.tagFromName("__builtin_complex").? => false,
 else => true,
 },
 };
@@ -4650,8 +4656,11 @@ const CallExpr = union(enum) {

 const builtin_tok = p.nodes.items(.data)[@intFromEnum(self.builtin.node)].decl.name;
 switch (self.builtin.tag) {
-.__builtin_va_start, .__va_start, .va_start => return p.checkVaStartArg(builtin_tok, first_after, param_tok, arg, arg_idx),
-.__builtin_complex => return p.checkComplexArg(builtin_tok, first_after, param_tok, arg, arg_idx),
+BuiltinFunction.tagFromName("__builtin_va_start").?,
+BuiltinFunction.tagFromName("__va_start").?,
+BuiltinFunction.tagFromName("va_start").?,
+=> return p.checkVaStartArg(builtin_tok, first_after, param_tok, arg, arg_idx),
+BuiltinFunction.tagFromName("__builtin_complex").? => return p.checkComplexArg(builtin_tok, first_after, param_tok, arg, arg_idx),
 else => {},
 }
 }
@@ -4665,7 +4674,7 @@ const CallExpr = union(enum) {
 return switch (self) {
 .standard => null,
 .builtin => |builtin| switch (builtin.tag) {
-.__builtin_complex => 2,
+BuiltinFunction.tagFromName("__builtin_complex").? => 2,
 else => null,
 },
 };
@@ -4675,7 +4684,7 @@ const CallExpr = union(enum) {
 return switch (self) {
 .standard => callable_ty.returnType(),
 .builtin => |builtin| switch (builtin.tag) {
-.__builtin_complex => {
+BuiltinFunction.tagFromName("__builtin_complex").? => {
 const last_param = p.list_buf.items[p.list_buf.items.len - 1];
 return p.nodes.items(.ty)[@intFromEnum(last_param)].makeComplex();
 },