feat(parsers/regexp): add injection of global flag if it's not present #61

Merged: 3 commits, Jan 15, 2023
8 changes: 6 additions & 2 deletions docs/content/parsers/regexp.md
@@ -9,7 +9,7 @@ description: 'regexp parses a string that matches a provided regular expression.
 ## Signature

 ```ts
-function regexp(re: RegExp, expected: string): Parser<string>
+function regexp(rs: RegExp, expected: string): Parser<string>
 ```

 ## Description
@@ -18,9 +18,13 @@ function regexp(re: RegExp, expected: string): Parser<string>

 ## Implementation notes

+::: warning
+If the `g` flag is missing, it is injected automatically. It's still better to always provide it, to avoid a small performance penalty and to document the intention clearly.
+:::
+
 The regular expression must obey two simple rules:

-- It *does* use `g` flag. Flags like `u` and `i` are allowed and can be added if needed.
+- It *does* use g flag. Flags like u and i are allowed and can be added if needed.
 - It *doesn't* use `^` and `$` to match at the beginning or at the end of the text.

 ## Usage
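For reviewers who want to see the documented behaviour in isolation: a minimal sketch of the flag injection described in the warning, mirroring the one-line check this PR adds to `src/parsers/regexp.ts`. The helper name `ensureGlobal` and the sample expressions are illustrative assumptions, not part of the library.

```typescript
// Hypothetical helper mirroring the one-line check in this PR; the name
// `ensureGlobal` is illustrative and not exported by the library.
function ensureGlobal(rs: RegExp): RegExp {
  // Reuse the caller's object when it is already global; otherwise build a
  // copy with the same source and the same flags plus `g`.
  return rs.global ? rs : new RegExp(rs.source, rs.flags + 'g')
}

const alreadyGlobal = /\d+/g
const missingGlobal = /\d+/i

const reused = ensureGlobal(alreadyGlobal)
const injected = ensureGlobal(missingGlobal)

console.log(reused === alreadyGlobal) // true: no copy is made
console.log(injected.global)          // true: `g` was added
console.log(injected.ignoreCase)      // true: existing flags are kept
console.log(missingGlobal.global)     // false: the original is not mutated
```

The copy is where the small cost mentioned in the warning would come from: a regular expression passed without `g` leads to one extra `RegExp` construction when the parser is created.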
52 changes: 46 additions & 6 deletions src/__tests__/parsers/regexp.spec.ts
@@ -17,29 +17,69 @@ describe('regexp', () => {
     should.matchState(actualMatchGroups, expectedMatchGroups)
   })

+  it('should succeed if given matching input without Global flag', () => {
+    const actualDigit = run(regexp(/\d/, 'digit'), '0')
+    const expectedDigit = result(true, '0')
+
+    const actualDigits = run(regexp(/\d+/, 'digits'), '9000')
+    const expectedDigits = result(true, '9000')
+
+    const actualMatchGroups = run(regexp(/\((\s)+\)/, 'match-groups'), '( )')
+    const expectedMatchGroups = result(true, '( )')
+
+    should.matchState(actualDigit, expectedDigit)
+    should.matchState(actualDigits, expectedDigits)
+    should.matchState(actualMatchGroups, expectedMatchGroups)
+  })
+
+  it('should succeed if matches the beginning of input', () => {
+    const actualDigits = run(regexp(/\d{2,3}/g, 'first-digits'), '90000')
+    const expectedDigits = result(true, '900')
+
+    should.matchState(actualDigits, expectedDigits)
+  })
+
+  it('should succeed if matches the beginning of input without Global flag', () => {
+    const actualDigits = run(regexp(/\d{2,3}/, 'first-digits'), '90000')
+    const expectedDigits = result(true, '900')
+
+    should.matchState(actualDigits, expectedDigits)
+  })
+
   it('should succeed if given a RegExp with Unicode flag', () => {
     const actualReEmoji = run(regexp(/\w+\s+👌/gu, 'words, spaces, ok emoji'), 'Yes 👌')
     const expectedReEmoji = result(true, 'Yes 👌')

     should.matchState(actualReEmoji, expectedReEmoji)
   })

+  it('should succeed if given a RegExp with Unicode flag and without Global one', () => {
+    const actualReEmoji = run(regexp(/\w+\s+👌/u, 'words, spaces, ok emoji'), 'Yes 👌')
+    const expectedReEmoji = result(true, 'Yes 👌')
+
+    should.matchState(actualReEmoji, expectedReEmoji)
+  })
+
   it('should succeed if given a RegExp with Unicode property escapes', () => {
     const actualReEmoji = run(regexp(/\p{Emoji_Presentation}+/gu, 'emoji'), '👌👌👌')
     const expectedReEmoji = result(true, '👌👌👌')

     const actualReNonLatin = run(regexp(/\P{Script_Extensions=Latin}+/gu, 'non-latin'), '大阪')
-    const expectedReNonLation = result(true, '大阪')
+    const expectedReNonLatin = result(true, '大阪')

     should.matchState(actualReEmoji, expectedReEmoji)
-    should.matchState(actualReNonLatin, expectedReNonLation)
+    should.matchState(actualReNonLatin, expectedReNonLatin)
   })

-  it('should succeeed if matches the beginning of input', () => {
-    const actualDigits = run(regexp(/\d{2,3}/g, 'first-digits'), '90000')
-    const expectedDigits = result(true, '900')
+  it('should succeed if given a RegExp with Unicode property escapes without Global flag', () => {
+    const actualReEmoji = run(regexp(/\p{Emoji_Presentation}+/u, 'emoji'), '👌👌👌')
+    const expectedReEmoji = result(true, '👌👌👌')

-    should.matchState(actualDigits, expectedDigits)
+    const actualReNonLatin = run(regexp(/\P{Script_Extensions=Latin}+/u, 'non-latin'), '大阪')
+    const expectedReNonLatin = result(true, '大阪')
+
+    should.matchState(actualReEmoji, expectedReEmoji)
+    should.matchState(actualReNonLatin, expectedReNonLatin)
   })

   it('should fail if does not match input', () => {
11 changes: 8 additions & 3 deletions src/parsers/regexp.ts
@@ -6,15 +6,20 @@ import type { Parser } from '@types'
  *
  * The regular expression must obey two simple rules:
  *
- * - It *does* use `g` flag. Flags like `u` and `i` are allowed and can be added if needed.
+ * - It *does* use `g` flag. Flags like u and i are allowed and can be added if needed.
  * - It *doesn't* use `^` and `$` to match at the beginning or at the end of the text.
  *
- * @param re - Regular expression
+ * If the `g` flag is missing, it is injected automatically. It's still better to always provide it,
+ * to avoid a small performance penalty and to document the intention clearly.
+ *
+ * @param rs - Regular expression
  * @param expected - Error message if the regular expression does not match input
  *
  * @returns Matched string
  */
-export function regexp(re: RegExp, expected: string): Parser<string> {
+export function regexp(rs: RegExp, expected: string): Parser<string> {
+  const re = rs.global ? rs : new RegExp(rs.source, rs.flags + 'g')
+
   return {
     parse(input, pos) {
       // Reset RegExp index, because we abuse the 'g' flag.
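The hunk above is cut off right after the "Reset RegExp index" comment, so the rest of `parse` is not shown on this page. What follows is only a sketch of how position-based matching with a global `RegExp` can work in general. `matchAt` is a hypothetical helper and is not taken from the library. It also illustrates why the doc comment rules out `^`.

```typescript
// Hypothetical helper, not the library's code: try to match `re` in `input`
// starting exactly at `pos`, returning the matched text or null.
function matchAt(re: RegExp, input: string, pos: number): string | null {
  // Assumes `re` is global: `exec` only honours `lastIndex` when the `g`
  // (or `y`) flag is set, which is why a missing `g` has to be injected.
  re.lastIndex = pos
  const found = re.exec(input)
  // With `g`, `exec` may skip ahead to a later match, so accept the result
  // only when it starts exactly at the requested position.
  return found !== null && found.index === pos ? found[0] : null
}

const sample = 'id42'

console.log(matchAt(/\d+/g, sample, 2))  // 42
console.log(matchAt(/\d+/g, sample, 0))  // null: the first digit is at index 2
console.log(matchAt(/^\d+/g, sample, 2)) // null: `^` only matches at index 0
```

The last call is the reason for the second rule in the doc comment: without the `m` flag, `^` can only match at the very start of the input, so an anchored expression cannot succeed once the parser has advanced past index 0.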