Skip to content
This repository has been archived by the owner on Sep 24, 2024. It is now read-only.

CF$UID cannot be squashed to integer #71

Open
kiler129 opened this issue Feb 15, 2022 · 2 comments
Open

CF$UID cannot be squashed to integer #71

kiler129 opened this issue Feb 15, 2022 · 2 comments

Comments

@kiler129
Copy link

Describe the bug
Parsing plists produced by NSKeyedArchiver produces a tree of dependencies where references are marked with CF$UID pointing to the main tree. The library squashes these into integers making it impossible to distinguish references from real values.

To Reproduce
Attempt to parse the following plist:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
    <dict>
        <key>test1</key>
        <dict>
            <key>CF$UID</key>
            <integer>123</integer>
        </dict>
        <key>test2</key>
        <integer>321</integer>
    </dict>
</plist>

As an array it will result in:
image

This doesn't seem like a problem until you encounter a structure like:

<dict>
    <key>$class</key>
    <dict>
        <key>CF$UID</key>
        <integer>89</integer>
    </dict>

    <key>accumulatedDataSize</key>
    <integer>7723067</integer>

    <key>allURLs</key>
    <dict>
        <key>CF$UID</key>
        <integer>8</integer>
    </dict>
</dict>

This will parse to:
image

As you can see it's impossible to distinguish what's is a reference and what is an integer. Obviously these are two different things.
I traced the origin of this to 2ea0483 and no further changes has been made.

Expected behavior
UID type should be decoded to a simple DTO which can have a toInteger() method.

@kiler129
Copy link
Author

kiler129 commented Mar 3, 2022

@btry: can you maybe shed some light on this one? In the current state the data is being unavoidably lost while a plist is parsed and without a custom fork some plists representations are unusable. I can prepare a PR but I'm not sure what's the solution you guys at TECLIB will see as acceptable.


As a stop-gap a dirty workaround can be implemented without editing the library, which essentially reverts CFUid to CFDictionary while the plist is parsed. It's a dirty hack but it works. It has the downside of adding an extra ghost key to every CFUid dictionary (since there's no sane way to prevent import() => case 'dict' behavior);

/**
 * Fixes CFUid squashing to int, which causes data loss
 *
 * This exists solely as a workaround for https://github.com/TECLIB/CFPropertyList/issues/71
 */
class StrictCFPropertyList extends CFPropertyList
{
    private const CFUID_REF_KEY = 'CF$UID';
    private const CFUID_BUG_KEY = 'TL$BUG#71';

    protected function import(\DOMNode $node, $parent)
    {
        parent::import($node, $parent);

        //Essentially prevents magical conversion of CFDictionary to CFUid (which silently casts to int) for plain XMLs
        if ($parent instanceof CFDictionary) {
            $pv = $parent->getValue();
            if (isset($pv[self::CFUID_REF_KEY]) && \count($pv) === 1) {
                $parent->add(self::CFUID_BUG_KEY, new CFBoolean(true));
            }
        }
    }

    public function readBinaryObject()
    {
        $ret = parent::readBinaryObject();

        //Generates the CFDictionary back after it was squashed to CFUid for binary property lists
        if ($ret instanceof CFUid) {
            $ret = new CFDictionary([
                self::CFUID_REF_KEY => new CFNumber($ret->getValue()),
                self::CFUID_BUG_KEY => new CFBoolean(true)
            ]);
        }

        return $ret;
    }
}

Alternatively, a dirty reflection hack can be also implemented before the library is fixed, which doesn't cause any new keys to be added and ensures proper order is kept:

use CFPropertyList\CFUid;

class StrictCFUid extends CFUid
{
    public function toArray()
    {
        return ['CF$UID' => $this->getValue()];
    }
}
class StrictCFPropertyList extends CFPropertyList
{
    public function parseBinary($content = null)
    {
        parent::parseBinary($content);
        $this->fixSquashedCFUid($this);
    }

    protected function import(\DOMNode $node, $parent)
    {
        parent::import($node, $parent);
        if ($parent === $this) {
            $this->fixSquashedCFUid($this);
        }
    }

    private function fixSquashedCFUid(iterable $iter): void
    {
        $propRefs = [];

        foreach ($iter as $key => $item) {
            if (is_iterable($item)) {
                $this->fixSquashedCFUid($item);
            } elseif ($item instanceof CFUid) {
                $newCFUid = new StrictCFUid($item->getValue());
                if ($iter === $this) {
                    $this->value[$key] = $newCFUid;
                } elseif ($iter instanceof CFDictionary || $item instanceof CFArray) {
                    //CFDictionary has del() and add() but the setValue() is noop-ed, so there's no other way to swap it
                    // without altering the order
                    //CFArray has absolutely no way to replace a key and there order is crucial
                    $propRefs[$iter::class] ??= (new \ReflectionClass($iter::class))->getProperty('value');
                    $propRefs[$iter::class]->setAccessible(true);
                    $val = $propRefs[$iter::class]->getValue($iter);
                    $val[$key] = $newCFUid;
                    $propRefs[$iter::class]->setValue($iter, $val);
                } else {
                    throw new \RuntimeException('Unknown type ' . $iter::class);
                }
            }
        }
    }
}

Obviously this is not a long-term solution, but it's better than a custom fork.

@btry
Copy link
Collaborator

btry commented Mar 7, 2022

Hi

I'm busy for a while; I'll try to find some time to investigate on it.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants