Add xml declaration to force utf-8 #71
Codecov Report
@@             Coverage Diff              @@
##             master      #71      +/-   ##
============================================
- Coverage     99.30%   98.65%   -0.66%
- Complexity      125      130       +5
============================================
  Files            15       15
  Lines           288      297       +9
============================================
+ Hits            286      293       +7
- Misses            2        4       +2
Continue to review full report at Codecov.
|
@theseer Does this make sense to you? |
First off: I'm not sure whether C14N serializes the DOM subtree at hand using UTF-8, because that is the internal encoding within libxml, or using the original encoding as specified for the containing document. If it does ignore the original encoding, the patch is technically correct but should be superfluous, as UTF-8 is the default. It's generally considered nice to specify it anyhow, though. If C14N* is not using UTF-8 but honors the original document encoding, `$node->ownerDocument->actualEncoding` should be used instead of a hardcoded UTF-8. There is, to my knowledge, no way to specify the wanted target encoding with C14N*. |
As I remember, the `XML Canonicalisation` spec has UTF-8 as a requirement.
… On 9 May 2019, at 05:02, Arne Blankerts ***@***.***> wrote:
First off: I'm not sure whether C14N serializes the DOM subtree at hand using UTF-8, because that is the internal encoding within libxml, or using the original encoding as specified for the containing document.
If it does ignore the original encoding, the patch is technically correct but should be superfluous, as UTF-8 is the default. It's generally considered nice to specify it anyhow, though.
If C14N* is not using UTF-8 but honors the original document encoding, `$node->ownerDocument->actualEncoding` should be used instead of a hardcoded UTF-8. There is, to my knowledge, no way to specify the wanted target encoding with C14N*.
|
OK, I changed it to [...]. Here is my original test example with [...]. @sebastianbergmann, we can decouple XML canonicalisation from the comparator to write a test for it. Should I do it? |
I had to add the Elvis operator (`?:`) to fix the tests. Apparently, |
@theseer according to https://www.w3.org/TR/xml-c14n11/ it says: "The document is encoded in UTF-8".
|
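The question of whether C14N output is UTF-8 regardless of the source document's encoding can be checked with a small sketch like the one below. It assumes PHP's DOM extension is loaded; the sample document and the variable names are made up for illustration and are not taken from the PR.

```php
<?php
// Hypothetical sample: a document declared and encoded as ISO-8859-1.
// "\xE9" is the single-byte ISO-8859-1 encoding of "é".
$xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n<root>caf\xE9</root>";

$document = new DOMDocument();
$document->loadXML($xml);

// The document remembers the encoding from its XML declaration ...
$declaredEncoding = $document->encoding;

// ... while the canonical form carries no declaration at all.
// If C14N emits UTF-8, "é" shows up as the two bytes "\xC3\xA9".
$canonical = $document->documentElement->C14N();
$isUtf8Output = ($canonical === "<root>caf\xC3\xA9</root>");
```

If `$isUtf8Output` is true, the canonical string is UTF-8 even though the owning document still reports ISO-8859-1, which matches the reading of the C14N 1.1 spec quoted above.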
That doesn't work. I tried

    /**
     * Tries to get the encoding from DOMNode or falls back to UTF-8
     */
    private function getNodeEncoding(DOMNode $node): string
    {
        $encoding = '';
        if ($node->ownerDocument) {
            $encoding = $node->ownerDocument->xmlEncoding ?: $node->ownerDocument->encoding;
        } elseif ($node instanceof DOMEntity) {
            $encoding = $node->encoding;
        }
        return $encoding ?: 'UTF-8';
    }

I even tried just setting it to [...]. @eclipxe13 you can play with it yourself: https://3v4l.org/FvCAs. I'll update the PR with the test shortly. |
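For anyone who wants to try the encoding lookup in isolation, here is a minimal standalone sketch. The function name `nodeEncodingOrUtf8` and the sample documents are hypothetical, and the `DOMEntity` branch is left out to keep it short, so this is not the code from the PR.

```php
<?php
/**
 * Sketch of the lookup: prefer the encoding declared on the owner
 * document, otherwise fall back to UTF-8.
 */
function nodeEncodingOrUtf8(DOMNode $node): string
{
    $owner = $node->ownerDocument;
    $encoding = $owner ? ($owner->xmlEncoding ?: $owner->encoding) : null;

    return $encoding ?: 'UTF-8';
}

// A document with an explicit ISO-8859-1 declaration.
$declared = new DOMDocument();
$declared->loadXML("<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n<root>caf\xE9</root>");

// A document without any XML declaration.
$undeclared = new DOMDocument();
$undeclared->loadXML('<root>cafe</root>');

$fromDeclared = nodeEncodingOrUtf8($declared->documentElement);
$fromUndeclared = nodeEncodingOrUtf8($undeclared->documentElement);
```

The first call should report the declared encoding and the second should take the UTF-8 fallback. Note that a `DOMDocument` passed in directly has no `ownerDocument`, so it would also end up on the fallback.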
Please avoid useless commit messages such as |
@sebastianbergmann I've force pushed the squashed branch. Also, there is an option on GitHub to squash commits before merging. |
5fb0da9 to f6f39fd
@b1rdex I see your point. Nothing changes until I add a `<?xml version="1.0" encoding="ENCODING" ?>` header. That is because `loadXml()` resets the document encoding, even when it was set on construction. This code also works:

    $document = new DOMDocument();
    @$document->loadXml($node->C14N());
    // at this point $document->xmlEncoding is null (it is read-only), so the document has the undesired encoding
    $document->encoding = $this->getNodeEncoding($node);
    // at this point the output of `saveXml()` is what you expect |
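The reset described above can be reproduced with a short script. This is only a sketch under stated assumptions: the sample document is invented, ISO-8859-1 stands in for whatever encoding the real node has, and the encoding is hardcoded instead of calling `getNodeEncoding()`.

```php
<?php
// Hypothetical source document, declared and encoded as ISO-8859-1.
$source = new DOMDocument();
$source->loadXML("<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n<root>caf\xE9</root>");

// The encoding passed to the constructor ...
$document = new DOMDocument('1.0', 'ISO-8859-1');
$encodingBeforeLoad = $document->encoding;

// ... does not survive loadXML(): the canonical string has no XML
// declaration, so the freshly parsed document has no declared encoding.
$document->loadXML($source->documentElement->C14N());
$encodingAfterLoad = $document->encoding;

// Assigning the writable `encoding` property afterwards makes saveXML()
// emit a declaration and serialize in that encoding again.
$document->encoding = 'ISO-8859-1';
$output = $document->saveXML();
```

Comparing `$encodingBeforeLoad` with `$encodingAfterLoad` shows the reset, and `$output` should again start with an XML declaration naming ISO-8859-1.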
Feel free to send the update. Let's check it against the test I provided.
… On 12 July 2020, at 09:47, Carlos C Soto ***@***.***> wrote:
@b1rdex I see your point. Nothing changes until I add a `<?xml version="1.0" encoding="ENCODING" ?>` header.
That is because `loadXml()` resets the document encoding, even when it was set on construction.
This code also works:
    $document = new DOMDocument();
    @$document->loadXml($node->C14N());
    // at this point $document->xmlEncoding is null (it is read-only), so the document has the undesired encoding
    $document->encoding = $this->getNodeEncoding($node);
    // at this point the output of `saveXml()` is what you expect
|
See #70