How to match string literals case insensitively #216

azul · 2019-12-08T11:05:01Z

I'm working on upgrading a codebase from peg 0.5 to 0.6.

There's a bunch of rules like this one in there:

pub cmd_help -> SmtpCommand
        = "help"i s:strparam* NL
        { SmtpCommand::Help(s) }

In 0.6 the i modifier of "help"i no longer works. What's the recommended way to do this now?

kevinmehall · 2019-12-11T05:47:35Z

The "str"i syntax was removed because peg is now a procedural macro, so its syntax has to conform to Rust's tokenization rules. Having it built in also raised the question of what exactly "case-insensitive" means in the context of Unicode.

However, the addition of rule arguments allows you to define your own rule that matches a given literal string. Something like:

rule i(literal: &'static str)
    = input:$([_]*<{literal.len()}>)
      {? if input.eq_ignore_ascii_case(literal) { Ok(()) } else { Err(literal) } }
    
pub rule cmd_help() -> SmtpCommand
    = i("help") s:strparam()* NL() { SmtpCommand::Help(s) }

Breaking that down, [_] accepts any single character, [_]*<{literal.len()}> accepts a string with the same length as the literal, $() gets the corresponding slice of the input string, and the {? } block tests whether it is a case-insensitive match, returning the literal as the "Expected" error message if it is not. That should work for ASCII literals; handling Unicode would need some additional complexity.
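
To see what the rule checks without pulling in peg, here is a minimal sketch of the same logic in plain Rust. The function name `match_ignore_ascii_case` is made up for illustration and is not part of peg. It consumes as many characters as the literal has bytes (mirroring `[_]*<{literal.len()}>`), compares the consumed slice with `eq_ignore_ascii_case`, and returns the remaining input on success or the literal as the "expected" value on failure.

```rust
/// Hypothetical stand-alone mirror of the `i(literal)` rule (not a peg API).
/// On success returns the rest of the input after the matched prefix;
/// on failure returns the literal, like the "Expected" message in the rule.
fn match_ignore_ascii_case<'a>(
    input: &'a str,
    literal: &'static str,
) -> Result<&'a str, &'static str> {
    // Consume `literal.len()` characters, as `[_]*<{literal.len()}>` does.
    // Note: `len()` counts bytes, which equals the character count only
    // for ASCII literals.
    let mut chars = input.chars();
    for _ in 0..literal.len() {
        if chars.next().is_none() {
            return Err(literal); // input is too short
        }
    }
    let rest = chars.as_str();
    // The consumed prefix corresponds to what `$()` would capture.
    let head = &input[..input.len() - rest.len()];

    // Same test as the `{? ... }` block in the rule.
    if head.eq_ignore_ascii_case(literal) {
        Ok(rest)
    } else {
        Err(literal)
    }
}
```

With this sketch, `match_ignore_ascii_case("HeLp me", "help")` yields `Ok(" me")`, while a non-ASCII literal such as `"ä"` fails against `"Äx"`: the byte length (2) does not equal the character count (1), and `eq_ignore_ascii_case` does not fold non-ASCII letters. That is the Unicode limitation mentioned above.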

This is the kind of thing that I'd like to eventually get into a kind of "standard library" of common rules: #201.

kevinmehall · 2019-12-11T06:06:25Z

Alternatively, if all the literals in your language are case insensitive, you could define the grammar for your own struct wrapping str or &[u8] and redefine "" literals to behave however you want. Literals compile down to calls to ParseLiteral::parse_string_literal, which you could define for a custom input type in a way that uses case-insensitive string comparison instead of bytewise comparison.
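
As a rough sketch of what the comparison inside such a custom input type could look like, the following plain-Rust snippet shows only the matching logic. The wrapper name `CaseInsensitive` and the method `match_literal` are assumptions made for illustration. In peg itself the logic would live in the `ParseLiteral::parse_string_literal` implementation and return the crate's own result type rather than `Option<usize>`, and the wrapper would also need the other input traits peg expects, which are omitted here.

```rust
/// Hypothetical wrapper around the input text (name chosen for illustration).
pub struct CaseInsensitive<'a>(pub &'a str);

impl<'a> CaseInsensitive<'a> {
    /// Stand-in for the body of a `parse_string_literal` implementation.
    /// Returns the position just past the literal if the input at `pos`
    /// matches it ignoring ASCII case, and `None` otherwise.
    pub fn match_literal(&self, pos: usize, literal: &str) -> Option<usize> {
        let end = pos.checked_add(literal.len())?;
        // `get` returns None when the range runs past the end of the input.
        let candidate = self.0.as_bytes().get(pos..end)?;
        if candidate.eq_ignore_ascii_case(literal.as_bytes()) {
            Some(end)
        } else {
            None
        }
    }
}
```

The comparison is bytewise with ASCII case folding only, so non-ASCII bytes still have to match exactly; full Unicode case folding would need a different comparison.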

azul · 2019-12-11T12:20:09Z

I still have to try this out, but it looks like it's gonna work.
Will close this issue for now and reopen if I run into problems.

azul · 2019-12-12T09:19:31Z

Worked like a charm for me.

azul changed the title ~~How to match string literals in case insensitively~~ How to match string literals case insensitively Dec 8, 2019

azul closed this as completed Dec 11, 2019

kevinmehall mentioned this issue Jan 11, 2023

Case Sensitivity #332

Closed

emk mentioned this issue Oct 6, 2023

Case-insensitive literal matches throw off error position #361

Closed
