Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

markdown_to_html different/incorrect behavior with leading line-break #3685

Open
Sarke opened this issue Apr 14, 2022 · 4 comments
Open

markdown_to_html different/incorrect behavior with leading line-break #3685

Sarke opened this issue Apr 14, 2022 · 4 comments
Labels

Comments

@Sarke
Copy link

Sarke commented Apr 14, 2022

I thought I was going mad at one point. I've simplified the issue to the following code:

$loader = new \Twig\Loader\ArrayLoader([
    'index' => '{{ text|markdown_to_html }}',
]);
$twig = new \Twig\Environment($loader);
$twig->addExtension(new MarkdownExtension());

$twig->addRuntimeLoader(new class implements RuntimeLoaderInterface {
    public function load($class) {
        if (MarkdownRuntime::class === $class) {
            return new MarkdownRuntime(new DefaultMarkdown());
        }
    }
});

$markdown = <<<MD
Paragraph 1

Paragraph 2

1. First

2. Second
MD;

echo $twig->render('index', ['text' => $markdown]);

echo "\n------\n\n";

// only difference is the leading line-break
$markdown = "\n" . $markdown;

echo $twig->render('index', ['text' => $markdown]);

Output

<p>Paragraph 1</p>

<p>Paragraph 2</p>

<ol>
<li><p>First</p></li>
<li><p>Second</p></li>
</ol>

------

<p>Paragraph 1<br />
Paragraph 2<br />
1. First<br />
2. Second</p>

I originally noticed it when I was getting the second (incorrect) results with this, which seems harmless enough.

{% apply markdown_to_html %}

Title

Similar effects with other markdown libraries. Note that passing a leading line-break to the markdown library directly (without Twig) works as expected.

michelf/php-markdown               1.9.1
twig/markdown-extra                v3.3.8
twig/twig                          v3.3.9
@stof
Copy link
Member

stof commented Apr 14, 2022

The filter has a behavior to remove the indentation:

// remove indentation
if ($white = substr($body, 0, strspn($body, " \t\r\n\0\x0B"))) {
$body = preg_replace("{^$white}m", '', $body);
}

This seems to be what breaks when the markdown start with an empty line, because it will then remove the empty line between Paragraph 1 and Paragraph 2

@Sarke
Copy link
Author

Sarke commented Apr 15, 2022

Well that doesn't seem like a good way of doing it. Not only for my issue of no indentation, but markdown can legitimately start with an indented code block, followed by non-indented paragraphs, etc.

A workaround for now would be to do {{ text | trim | markdown_to_html }}, which solves my problem, but it wouldn't work with the mentioned indented code block.

@colinodell
Copy link
Contributor

Maybe Twig could borrow the approach I implemented in my colinodell/indentation library? It basically:

  1. Checks the indentation line-by-line to see if all lines have a common amount and type of indentation.
  2. We abort if any line is found without indentation, or if there's a mix of tabs for some lines and spaces for others
  3. Once we know the common indentation we strip that from all lines

Here are some test cases showing the results.

I don't know if this approach is the most efficient but it seems fairly reliable.

@Sarke
Copy link
Author

Sarke commented Apr 19, 2022

Maybe Twig could borrow the approach I implemented in my colinodell/indentation library?

Maybe. I'm not sure such an intricate solution is required though.

  1. Checks the indentation line-by-line to see if all lines have a common amount and type of indentation.

There's a couple of differences here. I ignore empty lines, since many IDEs tend to remove whitespace on empty lines. I also treat tabs as 4 spaces, since this is how Markdown sees them.

  1. We abort if any line is found without indentation, or if there's a mix of tabs for some lines and spaces for others

Thanks, I added the early return.

Here are some test cases showing the results.

Thanks. We could add some tests for the indentation detection, but I think the Markdown tests cover it. I added a comparison between the Twig output (after the indentation has been removed and the Markdown converted to HTML) and the direct Markdown output from the Markdown library in use.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

4 participants