Special characters in URLs are not XML-converted when generating the sitemaps #502

Open
nanawel opened this issue Sep 25, 2021 · 3 comments

nanawel commented Sep 25, 2021

If you have a post with a URL like, let's say, something-&-some-other-thing, the generated XML for the sitemaps is invalid because the & is not converted to an XML-safe equivalent: the percent-encoded %26 (or the entity &amp;, which should work here too).

It seems to be normal since the URL is directly injected in the XML without any conversion. See for example:

echo '<url><loc>' . $p->url . '</loc><priority>' . $priority . '</priority><lastmod>' . date('Y-m-d', $p->date) . '</lastmod></url>';

I suppose a quick fix might be to use htmlentities($p->url, ENT_XML1, 'UTF-8') instead.
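
As a rough illustration of that quick fix, here is a minimal sketch. It assumes the post object exposes `url` and `date` as in the line quoted above; `sitemap_escape()` and the stand-in `$p` object are hypothetical names used only for this example, not HTMLy code. It uses `htmlspecialchars()` rather than `htmlentities()`, since only the XML-reserved characters need escaping here.

```php
<?php
// Hypothetical helper, not part of HTMLy: escapes a value for use as XML text.
function sitemap_escape(string $value): string
{
    // ENT_XML1 limits the output to the predefined XML entities;
    // ENT_QUOTES additionally escapes both quote characters.
    return htmlspecialchars($value, ENT_XML1 | ENT_QUOTES, 'UTF-8');
}

// Stand-in for a post object; in HTMLy the real $p comes from the post list.
$p = (object) [
    'url'  => 'https://example.com/something-&-some-other-thing',
    'date' => 1632571200, // a timestamp on 2021-09-25 (UTC)
];
$priority = '0.5';

// Same string concatenation as the quoted line, with only the URL escaped.
$entry = '<url><loc>' . sitemap_escape($p->url) . '</loc><priority>' . $priority
    . '</priority><lastmod>' . date('Y-m-d', $p->date) . '</lastmod></url>';

echo $entry . "\n";
```

With this change the `<loc>` value carries `&amp;` instead of a bare `&`, so an XML parser should accept the entry.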

A more robust way would be to rewrite the sitemap generation to use an XML library instead of building it from text blocks, but that's a whole different scope.
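
To give an idea of what the library-based approach could look like, here is a sketch using PHP's built-in XMLWriter extension. The `build_sitemap()` function and the array shape of `$entries` are assumptions made for this example; HTMLy's actual data structures may differ.

```php
<?php
// Hypothetical helper, not HTMLy code: builds a sitemap with XMLWriter,
// which escapes element text on its own.
function build_sitemap(array $entries): string
{
    $xml = new XMLWriter();
    $xml->openMemory();
    $xml->startDocument('1.0', 'UTF-8');

    $xml->startElement('urlset');
    $xml->writeAttribute('xmlns', 'http://www.sitemaps.org/schemas/sitemap/0.9');

    foreach ($entries as $entry) {
        $xml->startElement('url');
        // writeElement() escapes &, < and > in the text content.
        $xml->writeElement('loc', $entry['url']);
        $xml->writeElement('priority', $entry['priority']);
        $xml->writeElement('lastmod', date('Y-m-d', $entry['date']));
        $xml->endElement(); // </url>
    }

    $xml->endElement(); // </urlset>
    $xml->endDocument();

    return $xml->outputMemory();
}

// Example input; the URL and values are made up for illustration.
$sitemap = build_sitemap([
    [
        'url'      => 'https://example.com/something-&-some-other-thing',
        'priority' => '0.5',
        'date'     => 1632571200,
    ],
]);

echo $sitemap;
```

The point of this design is that no caller has to remember to escape anything: every value passed to `writeElement()` goes through the writer.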

I don't have the time to propose a fully-tested PR for now, but I'll take a look if I can.

@a11y-bit

how did you generate the sitemap?

nanawel commented Jan 24, 2023

> how did you generate the sitemap?

Well, I never did. It's just the default generation provided by HTMLy when opening <base_url>/sitemap.xml.

@a11y-bit

thanks
