Leading and trailing non-whitespace Unicode whitespace is stripped from paragraphs #132

Open
ScottAbbey opened this issue Dec 5, 2017 · 0 comments


The CommonMark dingus implementation currently strips Unicode whitespace characters that are not spec-defined whitespace characters from the start and end of paragraphs. The CommonMark specification and the cmark implementation seem to indicate that these characters should not be stripped.

Example link

Each "space" character in this example is U+1680, OGHAM SPACE MARK, which the spec counts as a Unicode whitespace character but which is not in the spec's list of whitespace characters.

Note that the leading and trailing U+1680 characters have been trimmed from the final result.

From CommonMark 0.28, section 4.8:

> The paragraph’s raw content is formed by concatenating the lines and removing initial and final [whitespace].

By comparison, in cmark 0.28.3:

Input:

 o o 
 o o 

Output:

<p> o o 
 o o </p>

Ref: commonmark/commonmark-spec#465
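
To make the distinction concrete, here is a minimal Python sketch of the two stripping behaviours. It is an illustration only, not the dingus or cmark code: the function names and the `SPEC_WHITESPACE` constant are hypothetical, and the character set is my reading of the "whitespace character" definition in CommonMark 0.28 (space, tab, newline, line tabulation, form feed, carriage return).

```python
# Hypothetical constant: the "whitespace character" set as I read it
# in CommonMark 0.28 (space, tab, LF, VT, FF, CR).
SPEC_WHITESPACE = " \t\n\x0b\x0c\r"

OGHAM_SPACE_MARK = "\u1680"  # Unicode whitespace (category Zs), not in the set above


def strip_spec_whitespace(raw: str) -> str:
    """Strip only the spec's whitespace characters (the behaviour cmark shows)."""
    return raw.strip(SPEC_WHITESPACE)


def strip_unicode_whitespace(raw: str) -> str:
    """Strip every Unicode whitespace character (the behaviour reported here).

    Python's argument-less str.strip() removes all characters for which
    str.isspace() is true, which includes U+1680.
    """
    return raw.strip()


paragraph = OGHAM_SPACE_MARK + "o" + OGHAM_SPACE_MARK + "o" + OGHAM_SPACE_MARK

# Spec-style stripping leaves the leading and trailing U+1680 in place.
kept = strip_spec_whitespace(paragraph)

# Unicode-aware stripping removes them, keeping only the interior one.
trimmed = strip_unicode_whitespace(paragraph)
```

Under these assumptions, `kept` is identical to the input while `trimmed` loses the outer U+1680 characters, which matches the difference between the cmark output above and the dingus result.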
