-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Question] How to map an operation to treespec
instead of only node
#173
Comments
Hi @lqhuang, the In [1]: import optree
In [2]: treespec = optree.tree_structure({'a': 1, 'b': 2, 'c': {'d': 3, 'e': 4}})
In [3]: treespec
Out[3]: PyTreeSpec({'a': *, 'b': *, 'c': {'d': *, 'e': *}})
In [4]: hash(treespec)
Out[4]: 2985958577142014292
In [5]: import pickle
In [6]: pickle.dumps(treespec)
Out[6]: b'\x80\x04\x95\xaa\x00\x00\x00\x00\x00\x00\x00\x8c\x06optree\x94\x8c\nPyTreeSpec\x94\x93\x94)\x81\x94((K\x01K\x00NNNK\x01K\x01Nt\x94(K\x01K\x00NNNK\x01K\x01Nt\x94(K\x01K\x00NNNK\x01K\x01Nt\x94(K\x01K\x00NNNK\x01K\x01Nt\x94(K\x05K\x02]\x94(\x8c\x01d\x94\x8c\x01e\x94eNNK\x02K\x03]\x94(h\th\net\x94(K\x05K\x03]\x94(\x8c\x01a\x94\x8c\x01b\x94\x8c\x01c\x94eNNK\x04K\x06]\x94(h\x0eh\x0fh\x10et\x94t\x94\x89\x8c\x00\x94\x87\x94b.'
In [7]: pickle.loads(pickle.dumps(treespec))
Out[7]: PyTreeSpec({'a': *, 'b': *, 'c': {'d': *, 'e': *}}) Also, have you ever tried the paths and accessors, they are also hashable and pickleable. In [8]: treespec.paths()
Out[8]: [('a',), ('b',), ('c', 'd'), ('c', 'e')]
In [9]: treespec.accessors()
Out[9]:
[
PyTreeAccessor(*['a'], (MappingEntry(key='a', type=<class 'dict'>),)),
PyTreeAccessor(*['b'], (MappingEntry(key='b', type=<class 'dict'>),)),
PyTreeAccessor(*['c']['d'], (MappingEntry(key='c', type=<class 'dict'>), MappingEntry(key='d', type=<class 'dict'>))),
PyTreeAccessor(*['c']['e'], (MappingEntry(key='c', type=<class 'dict'>), MappingEntry(key='e', type=<class 'dict'>)))
] |
@XuehaiPan Hey, Yeah, I don't want to just serialize a PyTree, yeah it's not an obstacle. I want to a I will try it later. Thanks for your hints! 🥹 Best wishes. Has a happy weekend! |
I have read the documents and example of https://optree.readthedocs.io/en/latest/ops.html#optree.tree_map_with_accessor. I think what I need is a little slightly different with Likewise, Here is an example for >>> tree_map_with_key(lambda k, x: k * 2, {'x': 7, 'y': (42, 64), 'z': None})
{'xx': 8, 'yy': (42, 64), 'zz': None} That means the function or the operator is applied on |
One thing I want to point out is that a pytree is not narrowed to the nested dictionary. It can be an arbitrary nested container or an opaque object. In your example, it is not well-defined to change the path if the entry is a sequence index. To change the dictionary keys, you can do: In [1]: import optree
In [2]: tree = {'a': 1, 'b': 2, 'c': {'d': 3, 'e': 4}}
In [3]: def double(x):
...: return x * 2
...:
In [4]: def is_dict(x):
...: return type(x) is dict
...:
In [5]: def double_dict_keys(x):
...: if is_dict(x):
...: return {double(k): double_dict_keys(v) for k, v in x.items()}
...: return x
...:
In [6]: optree.tree_map(double_dict_keys, tree, is_leaf=is_dict)
Out[6]: {'aa': 1, 'bb': 2, 'cc': {'dd': 3, 'ee': 4}} |
Yeah, it's ok in my cases. It would raise a
Could you give me a more concrete example about input data? I think I will do nothing when the entry is a sequence index if I find treespec point it's a sequence like data container.
Here is a question about And Thanks for your demo for changing the dictionary keys, good example! |
In [1]: import optree
In [2]: import collections
In [3]: MyTuple = collections.namedtuple('MyTuple', ['x', 'y', 'z'])
In [4]: tree = {'a': 1, 'b': (2, MyTuple(3, 4, 5), [6, 7]), 'c': {'d': (8,), 'e': [9, 0]}}
In [5]: optree.tree_structure(tree)
Out[5]: PyTreeSpec({'a': *, 'b': (*, MyTuple(x=*, y=*, z=*), [*, *]), 'c': {'d': (*,), 'e': [*, *]}})
In [6]: optree.tree_leaves(tree)
Out[6]: [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
In [7]: optree.tree_paths(tree)
Out[7]: [('a',), ('b', 0), ('b', 1, 0), ('b', 1, 1), ('b', 1, 2), ('b', 2, 0), ('b', 2, 1), ('c', 'd', 0), ('c', 'e', 0), ('c', 'e', 1)]
In [8]: optree.tree_accessors(tree)
Out[8]:
[
PyTreeAccessor(*['a'], (MappingEntry(key='a', type=<class 'dict'>),)),
PyTreeAccessor(*['b'][0], (MappingEntry(key='b', type=<class 'dict'>), SequenceEntry(index=0, type=<class 'tuple'>))),
PyTreeAccessor(*['b'][1].x, (MappingEntry(key='b', type=<class 'dict'>), SequenceEntry(index=1, type=<class 'tuple'>), NamedTupleEntry(field='x', type=<class '__main__.MyTuple'>))),
PyTreeAccessor(*['b'][1].y, (MappingEntry(key='b', type=<class 'dict'>), SequenceEntry(index=1, type=<class 'tuple'>), NamedTupleEntry(field='y', type=<class '__main__.MyTuple'>))),
PyTreeAccessor(*['b'][1].z, (MappingEntry(key='b', type=<class 'dict'>), SequenceEntry(index=1, type=<class 'tuple'>), NamedTupleEntry(field='z', type=<class '__main__.MyTuple'>))),
PyTreeAccessor(*['b'][2][0], (MappingEntry(key='b', type=<class 'dict'>), SequenceEntry(index=2, type=<class 'tuple'>), SequenceEntry(index=0, type=<class 'list'>))),
PyTreeAccessor(*['b'][2][1], (MappingEntry(key='b', type=<class 'dict'>), SequenceEntry(index=2, type=<class 'tuple'>), SequenceEntry(index=1, type=<class 'list'>))),
PyTreeAccessor(*['c']['d'][0], (MappingEntry(key='c', type=<class 'dict'>), MappingEntry(key='d', type=<class 'dict'>), SequenceEntry(index=0, type=<class 'tuple'>))),
PyTreeAccessor(*['c']['e'][0], (MappingEntry(key='c', type=<class 'dict'>), MappingEntry(key='e', type=<class 'dict'>), SequenceEntry(index=0, type=<class 'list'>))),
PyTreeAccessor(*['c']['e'][1], (MappingEntry(key='c', type=<class 'dict'>), MappingEntry(key='e', type=<class 'dict'>), SequenceEntry(index=1, type=<class 'list'>)))
]
Changing the key of the dictionary already changed the tree structure. This |
treespec
instead of only node
treespec
instead of only node
I'm changing this issue to Question instead of feature request because it is difficult to give an API definition to general pytree (beyond nested dictionary). |
Sorry, I will try to give more basic examples and
try to write first sensible function signature. Thanks for your kind help |
Required prerequisites
Motivation
I'm designing a lib for serializing and deserializing Python structured data class (like
dataclass
orattrs
). I realized the idea and operations defined by PyTree are especially suitable for this purpose.Basically, serializing a Python container into a specific protocol (like JSON or YAML
) is flattening a custom data container into primitive and then reconstruct back to serializable native container like
dict/
list/
tuple`.That's totally the same with what
tree_map
did (flatten → traverse → unflatten) but with customized flatten and unflatten functions.And in previous implementations (see xattrs _struct_funcs.py#L61 or attrs _funcs.py#L11), it strongly depends on explicit recursions, which are quite cumbersome and hard to debug. I think the
PyTree
way will make all these easier.I have researched all major
PyTree
implementations, even thinking about writing it by myself, but I foundoptree
now (@XuehaiPan excellent)! Unfortunately, in my practice, a real serde system is often require doing some transforms onkey
fields, but current operations don't support this feature yet.For example, in the above codes,
key_serializer
would be applied onkey
(treespec) to transform case (e.g.: fromfull_name
tofullName
), andvalue_serializer
would be applied onvalue
(all leaves or nodes) to decide how value to be serialized.Solution
Add some functions like
optree.tree_map_on_treespec
?I'm still thinking the concrete signature.
Alternatives
No response
Additional context
I know it's an unusual or even “unused” feature (for its original purpose). If you would like to accept the proposal, I'm glad to try to implement first MVP under your guidance. And I need to know where should I start to investigate first.
The text was updated successfully, but these errors were encountered: