Faster parsing by changing lhs of assignment-expression to conditional-expression. #18

RamiHg · 2020-05-21T16:59:09Z

Hello,

This change makes parsing much faster (6x as fast in the bench I've created: from 13.7s to 2.0s on my machine) by changing the lhs of assignment-expression to be conditional-expression, and making the rhs optional. That specific rule was a huge bottleneck, and backtracking from assignment-expression-inner (unary-expression lhs) to conditional-expression on almost every single expression was causing some non-linear behavior.

That change alone takes the bench from 13.7s to 8s. Then, removing the cache attribute from postfix_expression speeds it up further to 2s. The cache attribute is no longer needed and so only adds overhead. Before this change, running without the cache attribute would make parsing too slow to be usable.

This change technically deviates from the c11 standard. However, Clang also does the same thing (I actually got this idea from Clang). Here is their justification in ParseExpr.cpp:

/// Note: we diverge from the C99 grammar when parsing the assignment-expression
/// production.  C99 specifies that the LHS of an assignment operator should be
/// parsed as a unary-expression, but consistency dictates that it be a
/// conditional-expession.  In practice, the important thing here is that the
/// LHS of an assignment has to be an l-value, which productions between
/// unary-expression and conditional-expression don't produce.  Because we want
/// consistency, we parse the LHS as a conditional-expression, then check for
/// l-value-ness in semantic analysis stages.

Since this deviates from the strict standard implementation, I completely understand if you don't want to accept this pull. If you can refactor the rules in such a way as to achieve the same speed-up, I'd happily take that change instead!

Notes:

The bench I created was automatically generated by a tool called yarpgen. It's obviously not what real-world C code looks like, but it acts as a good stress-test because it is guaranteed to generate valid c11 code.

Cheers!

This makes parsing 6x as fast (from 13s to 2s in bench).

vickenty · 2020-05-25T23:17:20Z

Thanks, this is great!

I think this is a reasonable compromise to make, given the huge difference in performance. It shouldn't make user's life much harder, since not all unary expressions are assignable, and lvalueness needs to be checked anyway.

Is it possible to make benchmark smaller? Adding 10M to the repo just for this seems excessive.
Please also add a comment to the grammar file explaining why we deviate from the standard grammar.

vickenty · 2020-05-25T23:18:35Z

benches/parse_preprocessed_bench.rs

+    let mut _config = driver::Config::with_clang();
+    #[cfg(target_os = "macos")]
+    {
+        _config.cpp_options.push("-fno-blocks".into());


cpp_options are not used when calling parse_preprocessed. I think this block can be removed.

Changing left-hand side of assignment expression to be conditional.

6134dba

This makes parsing 6x as fast (from 13s to 2s in bench).

vickenty reviewed May 25, 2020

View reviewed changes

This was referenced Dec 27, 2020

Changing left-hand side of assignment expression to be conditional. nacaclanga/lang-c#1

Merged

Optionally switch Identifiers form std::String to a different type in order to support string interning #30

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Faster parsing by changing lhs of assignment-expression to conditional-expression. #18

Faster parsing by changing lhs of assignment-expression to conditional-expression. #18

RamiHg commented May 21, 2020

vickenty commented May 25, 2020

vickenty May 25, 2020

Faster parsing by changing lhs of assignment-expression to conditional-expression. #18

Are you sure you want to change the base?

Faster parsing by changing lhs of assignment-expression to conditional-expression. #18

Conversation

RamiHg commented May 21, 2020

vickenty commented May 25, 2020

vickenty May 25, 2020

Choose a reason for hiding this comment