Feature request: let's make this compatible with XPath's fn:matches and fn:replace #9

Open
6 of 13 tasks
DrRataplan opened this issue Dec 18, 2019 · 2 comments

Comments

@DrRataplan
Contributor

DrRataplan commented Dec 18, 2019

The patterns used in XPath are very similar to the xs:patterns used in the XSD spec, with the following changes:

Flags

  • s: the dot-all flag
  • i: the case insensitive flag
  • m: the multiline flag
  • x: the whitespace flag (whitespace in the pattern is ignored)

Partial matches and matching the beginning and end of the input

Captured subexpressions

  • https://www.w3.org/TR/xpath-functions-31/#captured-subexpressions I think this could be implemented by recording the start and end of each subexpression in the whynot program and outputting those when execution is done. Thankfully there is no eagerness involved; returning any match is sufficient.
  • Non-capturing subexpressions, the (?:) variant.
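
To make the numbering side of this concrete, here is a minimal sketch of how capturing groups could be numbered by the position of their opening parenthesis, with (?: groups skipped. The helper name `findCapturingGroups` is hypothetical and not part of this library; it only scans the pattern text and does not touch the whynot program.

```typescript
// Hypothetical helper (not part of the library): returns the offset of the "("
// that opens each capturing group, in group-number order (index 0 is group 1).
function findCapturingGroups(pattern: string): number[] {
  const offsets: number[] = [];
  let classDepth = 0; // > 0 while inside [...]; nesting covers class subtraction
  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern.charAt(i);
    if (ch === "\\") {
      i++; // skip the escaped character, e.g. "\("
    } else if (ch === "[") {
      classDepth++;
    } else if (ch === "]" && classDepth > 0) {
      classDepth--;
    } else if (ch === "(" && classDepth === 0) {
      // "(?:" opens a non-capturing group, which gets no number
      if (pattern.slice(i + 1, i + 3) !== "?:") {
        offsets.push(i);
      }
    }
  }
  return offsets;
}

// Group 1 opens at offset 0, group 2 at offset 10; the (?: group is skipped.
const groupOffsets = findCapturingGroups("(a(?:b|c))(d)");
```

A real implementation would presumably do this in the parser rather than in a separate scan, so that each group number can be attached to the instructions that record its start and end.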

Reluctant quantifiers

See https://www.w3.org/TR/xpath-functions-31/#reluctant-quantifiers.

  • For fn:matches, we only have to be able to parse them.
  • Actually resolving them only affects functions like fn:replace.
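
As a quick illustration of why parsing is enough for fn:matches, the sketch below uses JavaScript's built-in RegExp as a stand-in (an assumption for demonstration only; it is not the XPath regex dialect and not this library). A yes/no match gives the same answer for the greedy and the reluctant quantifier, while a replacement exposes the difference.

```typescript
// Stand-in demo using JavaScript's RegExp, NOT the XPath regex engine.
const input = "aXbXb";

const greedy = /a.*b/; // greedy: consumes as much as possible
const reluctant = /a.*?b/; // reluctant: consumes as little as possible

// A boolean test cannot tell the two apart.
const greedyMatches = greedy.test(input);
const reluctantMatches = reluctant.test(input);

// A replacement shows how much text each variant consumed.
const greedyReplaced = input.replace(greedy, "-");
const reluctantReplaced = input.replace(reluctant, "-");
```

Here both tests succeed, but the greedy pattern replaces the whole input while the reluctant one stops at the first b.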

Backreferences

  • Parsing backreferences: this already makes sense to do, even if we just throw a readable not-supported error when we see them. They should not be syntax errors.
  • See https://www.w3.org/TR/xpath-functions-31/#back-references for more info. We could implement backreferences by letting them match anything and then filtering the different paths through the whynot execution to see whether there was an actual match?
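
A rough sketch of the "readable not-supported error" idea follows. Both `UnsupportedFeatureError` and `rejectBackreferences` are hypothetical names, not existing API, and the scan is a simplification: it treats any backslash followed by a digit from 1 to 9 as a backreference and does not look at character classes.

```typescript
// Hypothetical error type; not part of any existing API.
class UnsupportedFeatureError extends Error {
  constructor(feature: string, offset: number) {
    super(feature + " at offset " + offset + " is not supported yet");
    this.name = "UnsupportedFeatureError";
  }
}

// Hypothetical check: throws on the first backreference (\1 to \9) it finds.
function rejectBackreferences(pattern: string): void {
  for (let i = 0; i < pattern.length; i++) {
    if (pattern.charAt(i) !== "\\") continue;
    const next = pattern.charAt(i + 1);
    if (next >= "1" && next <= "9") {
      throw new UnsupportedFeatureError("Backreference \\" + next, i);
    }
    i++; // skip the escaped character so an escaped backslash before a digit is not misread
  }
}

// Usage: "(a)\1" is recognised and reported, rather than failing as a syntax error.
let reportedName = "";
try {
  rejectBackreferences("(a)\\1");
} catch (e) {
  reportedName = (e as Error).name;
}
```

In the actual parser this would more likely be a dedicated grammar rule for backreferences whose handler throws, which keeps them distinct from genuine syntax errors.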

Unknown Unicode blocks

I think it would be cool to implement these features in this library, enabled, for example, when a language option is passed to the compile function.

@bwrrp
Owner

bwrrp commented Dec 20, 2019

Sounds great! Thanks for the overview and initial PR!

For backreferences and other difficult cases we can also consider adding an interpreter as an alternative to the whynot-based approach.

@bwrrp
Owner

bwrrp commented Mar 13, 2020

I just released 1.1.0 with your changes so far. That should be enough to implement fn:matches, except for the missing flags. For those, I think we can add a flags option for compile that sits next to the new language option and is simply a string containing the letters of the flags to apply. It may make sense to try supporting them for both languages.
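
One way the flags string could be interpreted is sketched below. The `ParsedFlags` shape, its field names, and `parseFlags` are all hypothetical and only illustrate the "string of letters" idea using the four flags listed in this issue; nothing here is the released API.

```typescript
// Hypothetical result shape; the field names are illustrative only.
interface ParsedFlags {
  dotAll: boolean; // s
  caseInsensitive: boolean; // i
  multiline: boolean; // m
  ignoreWhitespace: boolean; // x
}

// Hypothetical helper: turns a string such as "si" into individual switches.
function parseFlags(flags: string): ParsedFlags {
  const parsed: ParsedFlags = {
    dotAll: false,
    caseInsensitive: false,
    multiline: false,
    ignoreWhitespace: false,
  };
  for (let i = 0; i < flags.length; i++) {
    const flag = flags.charAt(i);
    switch (flag) {
      case "s":
        parsed.dotAll = true;
        break;
      case "i":
        parsed.caseInsensitive = true;
        break;
      case "m":
        parsed.multiline = true;
        break;
      case "x":
        parsed.ignoreWhitespace = true;
        break;
      default:
        // Unknown letters are reported rather than silently ignored.
        throw new Error('Unknown flag "' + flag + '"');
    }
  }
  return parsed;
}

// Usage: dot-all and case-insensitive enabled, the rest left off.
const parsedFlags = parseFlags("si");
```

Parsing the string once up front would let both languages share the same internal switches, whichever surface syntax ends up being exposed.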

2 participants