
Matching @username formatted usernames encounters collisions with generated token strings #78

Open
mhuggins opened this issue Jul 10, 2024 · 3 comments

@mhuggins

Hello, thank you for sharing this package!

I am trying to match several patterns of strings within a body of text, namely:

  1. Hash tags (e.g.: #programming),
  2. User names (e.g.: @mhuggins), and
  3. URLs.

I've defined three custom matchers as part of a wrapper component:

import { router } from 'expo-router';
import { Linking, StyleProp, TextStyle } from 'react-native';
import Autolink, { CustomMatcher } from 'react-native-autolink';

const linkStyle: StyleProp<TextStyle> = {
  color: '#0a7ea4',
};

const HashTagMatcher: CustomMatcher = {
  pattern: /#([a-z0-9_\-]+)/g,
  style: linkStyle,
  getLinkText: (replacerArgs) => `#${replacerArgs[1]}`,
  onPress: (match) => {
    const tag = match.getReplacerArgs()[1];
    router.navigate(`/tags/${encodeURIComponent(tag)}`);
  },
};

const UserMatcher: CustomMatcher = {
  pattern: /@([a-z0-9_\.]+)/g,
  style: linkStyle,
  getLinkText: (replacerArgs) => `@${replacerArgs[1]}`,
  onPress: (match) => {
    const userId = match.getReplacerArgs()[1];
    router.navigate(`/users/${encodeURIComponent(userId)}`);
  },
};

const UrlMatcher: CustomMatcher = {
  pattern: /https?:\/\/([\w_-]+(?:(?:\.[\w_-]+)+))([\w.,@?^=%&:\/~+#-]*[\w@?^=%&\/~+#-])/g,
  style: linkStyle,
  getLinkText: (replacerArgs) => replacerArgs[0],
  onPress: (match) => {
    const url = match.getMatchedText();
    Linking.openURL(url);
  },
};

const matchers: CustomMatcher[] = [HashTagMatcher, UserMatcher, UrlMatcher];

export const PostBody = ({ text }: { text: string }) => (
  <Autolink text={text} matchers={matchers} />
);

I then try to use this on a block of text, e.g.:

I am linking to https://google.com because I'm a #corporateshill. All hail @google!

This ends up replacing some matched elements with tokens in the format @__ELEMENT-${uid}-\\d+__@, at which point my UserMatcher matches substrings of these tokens, such as @__ and @..
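A minimal sketch of the collision, outside the library: it runs the UserMatcher pattern from above against a string in which the URL has already been swapped for a token of the quoted format. The uid value and the intermediate string are assumptions made for illustration; the library's actual intermediate state may differ.

```typescript
// Token format as quoted in this issue: @__ELEMENT-${uid}-${counter}__@
// The uid below is made up for illustration.
const uid = 'abc123';
const token = `@__ELEMENT-${uid}-0__@`;

// Same pattern as UserMatcher above.
const userPattern = /@([a-z0-9_\.]+)/g;

// Assumed intermediate text after the URL was replaced by a token.
const intermediate = `I am linking to ${token} because I'm a #corporateshill. All hail @google!`;

// With the g flag, match() returns every full match.
const hits = intermediate.match(userPattern) || [];
console.log(hits); // [ '@__', '@google' ]
```

The first hit is the start of the token rather than a real mention: the pattern consumes the two underscores and stops at the uppercase E.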

[Screenshot 2024-07-10 at 8:52:32 AM]

Is there an existing way to avoid this, or is this a bug that needs to be reconciled?

@joshswan
Owner

joshswan commented Jul 10, 2024

TL;DR: If you turn off the built-in URL matching and put your UserMatcher first in the array of custom matchers, you can work around the issue for now: <Autolink url={false} matchers={[UserMatcher, HashTagMatcher, UrlMatcher]} text="I am linking to https://google.com because I'm a #corporateshill. All hail @google!" /> works. Alternatively, you can tweak the regex of the UserMatcher.

Interestingly, the pattern you're using for the user matcher is capturing the internal "replacement token" that's used to mark matches in the text before rendering. Either that token is going to need to be updated to something super obscure, or the logic needs to change internally. Currently it's @__ELEMENT-${uid}-${counter++}__@, which will cause issues with mention-related regexes for some.
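One possible shape for the regex tweak mentioned above, as a sketch only: a negative lookahead that refuses to start a mention at the token prefix. The name saferUserPattern and the sample string are hypothetical, the check runs on a plain string rather than inside the library, and it depends on the token format staying as quoted in this thread.

```typescript
// Hypothetical tweak: do not start a mention where the internal token prefix begins.
const saferUserPattern = /@(?!__ELEMENT-)([a-z0-9_.]+)/g;

// Made-up sample containing one token and one real mention.
const sample = 'Link: @__ELEMENT-abc123-0__@ and a mention of @google!';

const mentions = sample.match(saferUserPattern) || [];
console.log(mentions); // [ '@google' ]
```

This only sidesteps the current token format. If the token changes, the lookahead would need to change with it.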

Separately, url was meant to be disabled by default but it's not. Will require a major version bump now.

@joshswan joshswan self-assigned this Jul 10, 2024
@joshswan joshswan added the bug label Jul 10, 2024
@mhuggins
Author

I think it might be beneficial to not utilize tokens that get injected into the string, and to instead compose an array of parts that get joined together at the end. Tokenization will always result in the possibility of conflicts from user-provided regex patterns.
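A rough sketch of the array-of-parts idea, assuming nothing about the library's internals: each pattern is applied only to the parts that are still plain text, so no placeholder is ever written back into the string. The Part type, splitIntoParts, and the simplified URL pattern are hypothetical names and stand-ins, not the package's API.

```typescript
type Part =
  | { kind: 'text'; value: string }
  | { kind: 'match'; value: string; matcher: number };

// Split `text` into parts; later patterns only ever see unclaimed plain text.
function splitIntoParts(text: string, patterns: RegExp[]): Part[] {
  let parts: Part[] = [{ kind: 'text', value: text }];

  patterns.forEach((pattern, matcher) => {
    const next: Part[] = [];
    // Fresh global copy so lastIndex state never leaks between parts.
    const flags = pattern.flags.indexOf('g') === -1 ? pattern.flags + 'g' : pattern.flags;
    for (const part of parts) {
      if (part.kind === 'match') {
        next.push(part); // already claimed by an earlier pattern
        continue;
      }
      const re = new RegExp(pattern.source, flags);
      let cursor = 0;
      let m: RegExpExecArray | null;
      while ((m = re.exec(part.value)) !== null) {
        if (m[0].length === 0) { re.lastIndex++; continue; } // skip empty matches
        if (m.index > cursor) {
          next.push({ kind: 'text', value: part.value.slice(cursor, m.index) });
        }
        next.push({ kind: 'match', value: m[0], matcher });
        cursor = m.index + m[0].length;
      }
      if (cursor < part.value.length) {
        next.push({ kind: 'text', value: part.value.slice(cursor) });
      }
    }
    parts = next;
  });

  return parts;
}

const input = "I am linking to https://google.com because I'm a #corporateshill. All hail @google!";
// Simplified URL pattern as a stand-in; the other two are the patterns from this issue.
const parts = splitIntoParts(input, [/https?:\/\/[^\s]+/, /#([a-z0-9_\-]+)/, /@([a-z0-9_\.]+)/]);
console.log(parts.filter((p) => p.kind === 'match').map((p) => p.value));
// [ 'https://google.com', '#corporateshill', '@google' ]
```

Joining the value of every part reproduces the input exactly, and a renderer could map each match part to a link component using its matcher index.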

@joshswan
Owner

Agreed. It was an easy way to get the original library working, and didn't cause any issues since I knew the regexes for the built-in functionality. But these custom matchers pose a problem for any potential token pattern.

Unfortunately that means a bit of work to be done haha. Will add to my todo list.

2 participants