Links with "_" in the domain name are not regarded as links #95
Comments
Please provide an example of widely used domains with underscores in them. As far as I've been able to research, underscores in domain names are very rare because:
Linkify-it isn't meant to find every single link (which is impossible), so we have to restrict ourselves to the most common cases. I'm not sure domains with underscores are worth supporting, especially given the potential for false positives if they are allowed in fuzzy links.
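To make the false-positive concern concrete, here is a minimal sketch of one kind of input that might be affected. It does not use linkify-it's real patterns: the fuzzyMatcher and findFuzzyLinks helpers, the ASCII-only label pattern, and the short TLD sample are all assumptions made up for this illustration.

```javascript
// Hypothetical, ASCII-only approximation of a fuzzy (schemeless) link matcher.
// This is NOT linkify-it's real pattern; the TLD list is only a small sample.
const TLDS = 'com|org|net|sh|py';

function fuzzyMatcher(allowUnderscore) {
  // Characters allowed inside a label (never at its start or end).
  const inner = allowUnderscore ? '[a-z0-9_-]' : '[a-z0-9-]';
  const label = '[a-z0-9](?:' + inner + '{0,61}[a-z0-9])?';
  // A candidate must start at the beginning of the text or after whitespace,
  // and end before whitespace, common punctuation, or the end of the text.
  return new RegExp(
    '(?:^|\\s)((?:' + label + '\\.)+(?:' + TLDS + '))(?=$|[\\s.,;:!?])',
    'gi'
  );
}

function findFuzzyLinks(text, allowUnderscore) {
  return Array.from(text.matchAll(fuzzyMatcher(allowUnderscore)), (m) => m[1]);
}

const sample = 'run my_script.sh then open example.com';
console.log(findFuzzyLinks(sample, false));
console.log(findFuzzyLinks(sample, true));
```

In this sketch, a file name such as my_script.sh is only picked up once underscores are allowed inside labels, because its extension happens to coincide with a two-letter TLD. Whether the real library would behave the same way depends on its actual patterns and options.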
Is it possible to get this resolved already? It seems we are still discussing whether this is a valid case or not, but it's obvious that there are cases like this around the web. This library has 100% test coverage, so it's safe to add this change without worrying that it would break something. We have heard "false-positive potential" mentioned before, but what are the exact cases that could be false positives? There is also another option that gets suggested, to override onCompile:

LinkifyIt.prototype.onCompile = function onCompile() {
  const re = this.re;
  const text_separators = '[><\uff5c]';

  // Domain label: allow "-" or "_" between pseudo-letters.
  re.src_domain =
    '(?:' +
      re.src_xn +
      '|' +
      '(?:' + re.src_pseudo_letter + ')' +
      '|' +
      '(?:' + re.src_pseudo_letter + '(?:-|_|' + re.src_pseudo_letter + '){0,61}' + re.src_pseudo_letter + ')' +
    ')';

  // Everything below is rebuilt only because it embeds re.src_domain.
  re.src_host =
    '(?:' +
      '(?:(?:(?:' + re.src_domain + ')\\.)*' + re.src_domain/* _root */ + ')' +
    ')';
  re.tpl_host_fuzzy =
    '(?:' +
      re.src_ip4 +
      '|' +
      '(?:(?:(?:' + re.src_domain + ')\\.)+(?:%TLDS%))' +
    ')';
  re.src_host_strict =
    re.src_host + re.src_host_terminator;
  re.tpl_host_fuzzy_strict =
    re.tpl_host_fuzzy + re.src_host_terminator;
  re.src_host_port_strict =
    re.src_host + re.src_port + re.src_host_terminator;
  re.tpl_host_port_fuzzy_strict =
    re.tpl_host_fuzzy + re.src_port + re.src_host_terminator;
  re.tpl_email_fuzzy =
    '(^|' + text_separators + '|"|\\(|' + re.src_ZCc + ')' +
    '(' + re.src_email_name + '@' + re.tpl_host_fuzzy_strict + ')';
  re.tpl_link_fuzzy =
    '(^|(?![.:/\\-_@])(?:[$+<=>^`|\uff5c]|' + re.src_ZPCc + '))' +
    '((?![$+<=>^`|\uff5c])' + re.tpl_host_port_fuzzy_strict + re.src_path + ')';
  // Note: re.tpl_host_port_no_ip_fuzzy_strict is not rebuilt in this override,
  // so whatever value it already holds is reused here.
  re.tpl_link_no_ip_fuzzy =
    '(^|(?![.:/\\-_@])(?:[$+<=>^`|\uff5c]|' + re.src_ZPCc + '))' +
    '((?![$+<=>^`|\uff5c])' + re.tpl_host_port_no_ip_fuzzy_strict + re.src_path + ')';
};

I don't think that's maintainable in our codebase. I actually see a couple of options here:
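For readers skimming the override, the only behavioural change is the extra "_" alternative inside the domain label; the rest is re-concatenation. A rough, self-contained sketch of that difference follows. The letter pattern is a simplified ASCII-only stand-in for re.src_pseudo_letter, and buildHost and compare are hypothetical helpers, not part of linkify-it.

```javascript
// Simplified, ASCII-only stand-in for linkify-it's re.src_pseudo_letter (assumption).
const letter = '[a-z0-9]';

// Build an anchored hostname pattern; `inner` lists the extra characters
// allowed inside a label (between its first and last letter).
function buildHost(inner) {
  const label =
    '(?:' + letter + '(?:(?:' + inner + '|' + letter + '){0,61}' + letter + ')?)';
  return new RegExp('^(?:' + label + '\\.)*' + label + '$', 'i');
}

const strictHost = buildHost('-');    // hyphen only
const relaxedHost = buildHost('-|_'); // hyphen or underscore, as in the override

// Report how both variants treat the same hostname.
function compare(host) {
  return { strict: strictHost.test(host), relaxed: relaxedHost.test(host) };
}

console.log(compare('my_site.example.com'));
console.log(compare('example.com'));
console.log(compare('_leading.example.com'));
```

Under these assumptions, a host with an underscore in the middle of a label passes only the relaxed variant, ordinary hosts pass both, and a label that starts with an underscore is rejected by both, since labels must still begin and end with a letter.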
Please make some kind of decision; doing nothing and ignoring OS community issues for years is not a valid solution.
What is the issue?
Links with "_" in the domain name, e.g.:
are not regarded as links, which is not true, see:
https://stackoverflow.com/a/2183140/8113942
The same goes for fuzzy links, e.g.: