-
-
Notifications
You must be signed in to change notification settings - Fork 320
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consider preventing autolinks in links #719
Comments
Thanks for raising this. It would be good to get more feedback on the proposed solutions. |
wooorm
added a commit
to syntax-tree/mdast
that referenced
this issue
Feb 4, 2023
Previously, there was a difference between static and dynamic phrasing content. There was also transparent content, to “inherit” the (static or dynamic) content model from a parent. This was used to prohibit links in links. However, while it is unwise to put links in links, it is possible with markdown to do so, by embedding autolinks in links with resources or references. Take for example: ```markdown [alpha <https://bravo> charlie](delta) ``` Yields, per CommonMark: ```html <p><a href="delta">alpha <a href="https://bravo">https://bravo</a> charlie</a></p> ``` See also: commonmark/commonmark-spec#719. There is also the case where while it is unwise to author such markdown, tools transforming mdast might inject links. It’s probably good to handle such cases in `mdast-util-to-markdown` and such? Not allowing links in links is also something imposed by HTML (particularly the parsing side[^1]): perhaps mdast could be used in places where this restriction doesn’t need to be imposed. Finally, it is hard to type such inherited content models. As in, transparent content is not used in `@types/mdast` (I don’t think it can be typed?): https://github.com/DefinitelyTyped/DefinitelyTyped/blob/3b052a3e5b59b0efaa22669d9bd8f268bd689835/types/mdast/index.d.ts#L286 So, this PR is a proposal to remove the difference and simplify the AST. [^1]: You can produce nested `a`s with the DOM: ```js let a1 = document.createElement('a'); a1.textContent = 'alpha'; let a2 = document.createElement('a'); a2.textContent = 'bravo'; a1.appendChild(a2) document.body.appendChild(a1) ```
5 tasks
wooorm
added a commit
to syntax-tree/mdast
that referenced
this issue
Jun 15, 2023
Previously, there was a difference between static and dynamic phrasing content. There was also transparent content, to “inherit” the (static or dynamic) content model from a parent. This was used to prohibit links in links. However, while it is unwise to put links in links, it is possible with markdown to do so, by embedding autolinks in links with resources or references. Take for example: ```markdown [alpha <https://bravo> charlie](delta) ``` Yields, per CommonMark: ```html <p><a href="delta">alpha <a href="https://bravo">https://bravo</a> charlie</a></p> ``` See also: commonmark/commonmark-spec#719. There is also the case where while it is unwise to author such markdown, tools transforming mdast might inject links. It’s probably good to handle such cases in `mdast-util-to-markdown` and such? Not allowing links in links is also something imposed by HTML (particularly the parsing side[^1]): perhaps mdast could be used in places where this restriction doesn’t need to be imposed. Finally, it is hard to type such inherited content models. As in, transparent content is not used in `@types/mdast` (I don’t think it can be typed?): https://github.com/DefinitelyTyped/DefinitelyTyped/blob/3b052a3e5b59b0efaa22669d9bd8f268bd689835/types/mdast/index.d.ts#L286 So, this PR is a proposal to remove the difference and simplify the AST. Closes GH-42. [^1]: You can produce nested `a`s with the DOM: ```js let a1 = document.createElement('a'); a1.textContent = 'alpha'; let a2 = document.createElement('a'); a2.textContent = 'bravo'; a1.appendChild(a2) document.body.appendChild(a1) ```
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Background
The CommonMark spec prevents “actual” links (those with a reference (e.g.,
[x]
), or those with a resource (e.g.,(x "y")
), from occurring inside each other:The reason for that, is that the HTML spec does not allow it, and specifically the parsing algorithm of the HTML spec makes it impossible:
Yields:
(the first link is closed when a new link is opened, its link end is later ignored).
CommonMark does this by, when a link is matched (the “deep” one is matched first), marking earlier link starts as “inactive”.
It does not do that for links in images, or images in links, because it doesn‘t need to: images in links are fine, and links “in” images do not generate HTML (somewhat related: #716)
Problem
However, there is another way to create links in links with markdown: autolinks (
<https://example.com>
) inside links:Yields:
Solution
There are two solutions.
Which to choose, depends on which of the two links is more likely the one an author intended.
Solution A: mark link starts as inactive
The code for this would be very similar to the one already used for “normal” links: when a valid autolink is parsed, mark earlier link starts as inactive. The output then would be:
Solution B: outputting the source of an autolink when compiling, when in links
The code for this would be very similar to what is likely needed for images, when compiling.
When in an image alt, no tags are generated.
Something similar can be done when inside a link: output
<
(instead of<a href="https://example.com">
), output the text like normal, and>
(instead of</a>
).The output then would be:
Consideration
The current state is clearly broken: it’s not something someone wants or is expecting: HTML doesn’t like it and only halve of an authors outer link is “clickable”.
With solution A, both URLs are displayed verbatim, the inner one “clickable”, but both usable.
This is like how “normal” links work in CommonMark, so I think it is the best solution.
I think solution B is cleaner output, and chances are it’s more likely what an author intended, both URLs being the same, showing the URL that someone goes to:
Though, solution B gets confusing if the URLs are different, a reader sees one URL, but is taken to another:
I think both are acceptable and an improvement, and I’d prefer solution A a little bit more than solution B.
The text was updated successfully, but these errors were encountered: