Skip to content

Latest commit

 

History

History
301 lines (230 loc) · 9.42 KB

AutoLinks.md

File metadata and controls

301 lines (230 loc) · 9.42 KB

Extensions

This section describes the different extensions supported:

AutoLinks

Autolinks will format as a HTML link any string that starts by:

  • http:// or https://
  • ftp://
  • mailto:
  • www.
This is a http://www.google.com URL and https://www.google.com
This is a ftp://test.com
And a mailto:[email protected]
And a plain www.google.com
.
<p>This is a <a href="http://www.google.com">http://www.google.com</a> URL and <a href="https://www.google.com">https://www.google.com</a>
This is a <a href="ftp://test.com">ftp://test.com</a>
And a <a href="mailto:[email protected]">[email protected]</a>
And a plain <a href="http://www.google.com">www.google.com</a></p>

But incomplete links will not be matched:

This is not a http:/www.google.com URL and https:/www.google.com
This is not a ftp:/test.com
And not a mailto:emailtoto.com
And not a plain www. or a www.x 
.
<p>This is not a http:/www.google.com URL and https:/www.google.com
This is not a ftp:/test.com
And not a mailto:emailtoto.com
And not a plain www. or a www.x</p>

Previous character must be a punctuation or a valid space (tab, space, new line):

This is not a nhttp://www.google.com URL but this is (https://www.google.com)
.
<p>This is not a nhttp://www.google.com URL but this is (<a href="https://www.google.com">https://www.google.com</a>)</p>

An autolink should not interfere with an <a> HTML inline:

This is an HTML <a href="http://www.google.com">http://www.google.com</a> link
.
<p>This is an HTML <a href="http://www.google.com">http://www.google.com</a> link</p>

or even within emphasis:

This is an HTML <a href="http://www.google.com"> **http://www.google.com** </a> link
.
<p>This is an HTML <a href="http://www.google.com"> <strong>http://www.google.com</strong> </a> link</p>

An autolink should not interfere with a markdown link:

This is an HTML [http://www.google.com](http://www.google.com) link
.
<p>This is an HTML <a href="http://www.google.com">http://www.google.com</a> link</p>

A link embraced by pending emphasis should let the emphasis takes precedence if characters are placed at the end of the matched link:

Check **http://www.a.com** or __http://www.b.com__
.
<p>Check <strong><a href="http://www.a.com">http://www.a.com</a></strong> or <strong><a href="http://www.b.com">http://www.b.com</a></strong></p>

It is not mentioned by the spec, but empty emails won't be matched (only a subset of RFC2368 is supported by auto links):

mailto:[email protected] is okay, but mailto:@test.com is not
.
<p><a href="mailto:[email protected]">[email protected]</a> is okay, but mailto:@test.com is not</p>

GFM Support

Extract from GFM Autolinks extensions specs

www.commonmark.org
.
<p><a href="http://www.commonmark.org">www.commonmark.org</a></p>
Visit www.commonmark.org/help for more information.
.
<p>Visit <a href="http://www.commonmark.org/help">www.commonmark.org/help</a> for more information.</p>
Visit www.commonmark.org.

Visit www.commonmark.org/a.b.
.
<p>Visit <a href="http://www.commonmark.org">www.commonmark.org</a>.</p>
<p>Visit <a href="http://www.commonmark.org/a.b">www.commonmark.org/a.b</a>.</p>
www.google.com/search?q=Markup+(business)

(www.google.com/search?q=Markup+(business))
.
<p><a href="http://www.google.com/search?q=Markup+(business)">www.google.com/search?q=Markup+(business)</a></p>
<p>(<a href="http://www.google.com/search?q=Markup+(business)">www.google.com/search?q=Markup+(business)</a>)</p>
www.google.com/search?q=commonmark&hl=en

www.google.com/search?q=commonmark&hl;
.
<p><a href="http://www.google.com/search?q=commonmark&amp;hl=en">www.google.com/search?q=commonmark&amp;hl=en</a></p>
<p><a href="http://www.google.com/search?q=commonmark">www.google.com/search?q=commonmark</a>&amp;hl;</p>
www.commonmark.org/he<lp
.
<p><a href="http://www.commonmark.org/he">www.commonmark.org/he</a>&lt;lp</p>
http://commonmark.org

(Visit https://encrypted.google.com/search?q=Markup+(business))

Anonymous FTP is available at ftp://foo.bar.baz.
.
<p><a href="http://commonmark.org">http://commonmark.org</a></p>
<p>(Visit <a href="https://encrypted.google.com/search?q=Markup+(business)">https://encrypted.google.com/search?q=Markup+(business)</a>)</p>
<p>Anonymous FTP is available at <a href="ftp://foo.bar.baz">ftp://foo.bar.baz</a>.</p>

Valid Domain Tests

Domain names that have empty segments won't be matched

www..
www..com
http://test.
http://.test
http://.
http://..
ftp://test.
ftp://.test
mailto:email@test.
mailto:[email protected]
.
<p>www..
www..com
http://test.
http://.test
http://.
http://..
ftp://test.
ftp://.test
mailto:email@test.
mailto:[email protected]</p>

Domain names with too few segments won't be matched

www
www.com
http://test
ftp://test
mailto:email@test
.
<p>www
www.com
http://test
ftp://test
mailto:email@test</p>

Domain names that contain an underscores in the last two segments won't be matched

www._test.foo.bar is okay, but www._test.foo is not

http://te_st.foo.bar is okay, as is http://test.foo_.bar.foo

But http://te_st.foo, http://test.foo_.bar and http://test._foo are not

ftp://test_.foo.bar is okay, but ftp://test.fo_o is not

mailto:email@_test.foo.bar is okay, but mailto:email@_test.foo is not
.
<p><a href="http://www._test.foo.bar">www._test.foo.bar</a> is okay, but www._test.foo is not</p>
<p><a href="http://te_st.foo.bar">http://te_st.foo.bar</a> is okay, as is <a href="http://test.foo_.bar.foo">http://test.foo_.bar.foo</a></p>
<p>But http://te_st.foo, http://test.foo_.bar and http://test._foo are not</p>
<p><a href="ftp://test_.foo.bar">ftp://test_.foo.bar</a> is okay, but ftp://test.fo_o is not</p>
<p><a href="mailto:email@_test.foo.bar">email@_test.foo.bar</a> is okay, but mailto:email@_test.foo is not</p>

Domain names that contain invalid characters (not AlphaNumberic, -, _ or .) won't be matched

https://[your-domain]/api
.
<p>https://[your-domain]/api</p>

Domain names followed by ?, : or # instead of / are matched

https://github.com?

https://github.com?a

https://github.com#a

https://github.com:

https://github.com:443
.
<p><a href="https://github.com">https://github.com</a>?</p>
<p><a href="https://github.com?a">https://github.com?a</a></p>
<p><a href="https://github.com#a">https://github.com#a</a></p>
<p><a href="https://github.com">https://github.com</a>:</p>
<p><a href="https://github.com:443">https://github.com:443</a></p>

Unicode support

Links with unicode characters in the path / query / fragment are matched and url encoded

http://abc.net/☃

http://abc.net?☃

http://abc.net#☃

http://abc.net/foo#☃
.
<p><a href="http://abc.net/%E2%98%83">http://abc.net/☃</a></p>
<p><a href="http://abc.net?%E2%98%83">http://abc.net?☃</a></p>
<p><a href="http://abc.net#%E2%98%83">http://abc.net#☃</a></p>
<p><a href="http://abc.net/foo#%E2%98%83">http://abc.net/foo#☃</a></p>

Unicode characters in the FQDN are matched and IDNA encoded

http://☃.net?☃
.
<p><a href="http://xn--n3h.net?%E2%98%83">http://☃.net?☃</a></p>

Same goes for regular autolinks

<http://abc.net/☃>

<http://abc.net?☃>

<http://abc.net#☃>

<http://abc.net/foo#☃>
.
<p><a href="http://abc.net/%E2%98%83">http://abc.net/☃</a></p>
<p><a href="http://abc.net?%E2%98%83">http://abc.net?☃</a></p>
<p><a href="http://abc.net#%E2%98%83">http://abc.net#☃</a></p>
<p><a href="http://abc.net/foo#%E2%98%83">http://abc.net/foo#☃</a></p>
<http://☃.net?☃>
.
<p><a href="http://xn--n3h.net?%E2%98%83">http://☃.net?☃</a></p>

It also complies with CommonMark's vision of priority. This will therefore be seen as an autolink and not as code inline.

<http://foö.bar.`baz>`
.
<p><a href="http://xn--fo-gka.bar.%60baz">http://foö.bar.`baz</a>`</p>