Regexps with Unicode flag #4

Open
gcampax opened this issue Jun 18, 2020 · 3 comments

gcampax commented Jun 18, 2020

Hi! Thanks for this great library.

I'm wondering whether it would be possible to add support for Unicode regular expressions. Currently, it seems that the "u" flag is discarded when the lexer compiles a rule. Unicode regular expressions are needed to match non-BMP characters (e.g. emoji) properly and to use Unicode properties as character classes.
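
To make the difference concrete, here is a small sketch in plain JavaScript, independent of this library, showing how the same patterns behave with and without the "u" flag. The variable names are only illustrative.

```javascript
// An emoji outside the BMP is stored as two UTF-16 code units (a surrogate pair).
const emoji = "😀";

// Without "u", "." matches a single code unit, so it cannot span the whole emoji.
const dotWithoutU = /^.$/.test(emoji);

// With "u", "." matches a full code point.
const dotWithU = /^.$/u.test(emoji);

// Unicode property escapes are only recognised with the "u" flag.
const greekWithU = /^\p{Script=Greek}$/u.test("α");

// Without "u", "\p" is just an escaped "p" and the braces are literal,
// so the pattern matches the literal text "p{L}" instead of a letter.
const literalWithoutU = /^\p{L}$/.test("p{L}");
const letterWithoutU = /^\p{L}$/.test("a");

console.log({ dotWithoutU, dotWithU, greekWithU, literalWithoutU, letterWithoutU });
```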

Owner

sormy commented Jun 20, 2020

It should work well with regexps like /something/u.
Please provide a minimal working example that reproduces the bug.

Author

gcampax commented Jun 21, 2020

It doesn't work once you start using definitions: the u flag is not allowed in the regexp passed to addDefinition. The syntax for definitions is also problematic, as it relies on a deprecated syntax that Unicode regexps don't allow.
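
A rough sketch of the problem and one possible workaround follows. It assumes definitions are referenced with a flex-style `{NAME}` placeholder inside the rule; that placeholder form, the `definitions` table and the `expandDefinitions` helper are all hypothetical and are not taken from the library's code.

```javascript
// Hypothetical definitions table; the names are illustrative only.
const definitions = {
  DIGIT: /[0-9]/u,
  EMOJI: /\p{Emoji_Presentation}/u,
};

// Report whether a pattern source compiles with the given flags.
function compiles(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch (e) {
    return false;
  }
}

// A raw {NAME} reference is tolerated in legacy mode (the braces are literals)
// but is a SyntaxError in Unicode mode.
const legacyOk = compiles("{DIGIT}+", "");
const unicodeOk = compiles("{DIGIT}+", "u");

// Assumed workaround: expand {NAME} references on the source string first,
// wrapping each definition in a non-capturing group, then compile with "u".
function expandDefinitions(source, defs, flags = "u") {
  const expanded = source.replace(
    /\{([A-Za-z_][A-Za-z0-9_]*)\}/g,
    (whole, name) =>
      Object.prototype.hasOwnProperty.call(defs, name)
        ? `(?:${defs[name].source})`
        : whole // leave unknown names (e.g. \p{L}) untouched
  );
  return new RegExp(expanded, flags);
}

const rule = expandDefinitions("{EMOJI}{DIGIT}+", definitions);
console.log(legacyOk, unicodeOk, rule.source, rule.test("😀42"));
```

Because the expansion works on strings before compilation, the rule would have to be supplied as a string (or as a non-Unicode regexp whose source is read) rather than as a `/.../u` literal containing `{NAME}`.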

Owner

sormy commented May 23, 2021

I guess we should use the u flag by default. A PR is welcome.
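
As a minimal sketch of what "u by default" could look like, assuming rules arrive as `RegExp` objects: the helper name `withUnicodeFlag` is hypothetical and not part of the library.

```javascript
// Recompile a regexp with the "u" flag added, keeping its other flags.
// Regexps that already use "u" (or the mutually exclusive "v") keep their flags.
function withUnicodeFlag(re) {
  const flags = /[uv]/.test(re.flags) ? re.flags : re.flags + "u";
  return new RegExp(re.source, flags);
}

const original = /^.$/i;
const upgraded = withUnicodeFlag(original);

console.log(original.test("😀"), upgraded.test("😀"), upgraded.flags);
```

Note that recompiling can throw a SyntaxError for patterns that are only valid in legacy mode, so existing rules might need adjusting before such a default could be switched on.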

sormy self-assigned this Aug 31, 2021