Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix custom dict overriding in collections #722

Open
wants to merge 27 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 22 commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
e612058
clean up types
DanielYang59 Nov 13, 2024
03fc253
remove unused _CacheInfo
DanielYang59 Nov 13, 2024
9b11922
clean up comments and tests
DanielYang59 Nov 13, 2024
ecf19a5
add test for dict2namedtuple
DanielYang59 Nov 13, 2024
bad6938
add test_mongo_dict
DanielYang59 Nov 13, 2024
dbcc676
some minor comment clean up
DanielYang59 Nov 13, 2024
5f4bbea
remove unused test dir
DanielYang59 Nov 13, 2024
ecae809
keyerror to type error
DanielYang59 Nov 13, 2024
284ce88
add place holder
DanielYang59 Nov 13, 2024
affcfe9
docstring clean up
DanielYang59 Nov 27, 2024
0f979b0
clean up implementation and tests
DanielYang59 Nov 27, 2024
9e45e02
dont override |= to be consistent with frozenset
DanielYang59 Nov 27, 2024
74c4fd2
fix controlled dict init
DanielYang59 Nov 27, 2024
c2d3c20
more human readable error message
DanielYang59 Nov 27, 2024
7b96038
enhance test for AttrDict
DanielYang59 Nov 27, 2024
a29d69f
fix test_namespace_dict
DanielYang59 Nov 27, 2024
dc172c1
disable delete for namespace dict
DanielYang59 Nov 27, 2024
86a15c1
add method shadowing warning
DanielYang59 Nov 27, 2024
374d462
clean up FrozenAttrDict
DanielYang59 Nov 27, 2024
56c65ce
simplify FrozenAttrDict
DanielYang59 Nov 27, 2024
777da53
clarify class var name
DanielYang59 Nov 27, 2024
a356b12
fix mypy but not 100% sure
DanielYang59 Nov 27, 2024
5240544
remove todo tag
DanielYang59 Nov 27, 2024
04339d8
use object attribute to avoid recursion
DanielYang59 Nov 27, 2024
638d1fe
fix setdefault return type
DanielYang59 Nov 27, 2024
d575d59
fix typo in docstring
DanielYang59 Nov 27, 2024
a1d0a13
clean up test
DanielYang59 Nov 27, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
278 changes: 181 additions & 97 deletions src/monty/collections.py
Original file line number Diff line number Diff line change
@@ -1,17 +1,27 @@
"""
Useful collection classes, e.g., tree, frozendict, etc.
Useful collection classes:
- tree: A recursive `defaultdict` for creating nested dictionaries
with default values.
- ControlledDict: A base dict class with configurable mutability.
- frozendict: An immutable dictionary.
- Namespace: A dict doesn't allow changing values, but could
add new keys,
- AttrDict: A dict whose values could be access as `dct.key`.
- FrozenAttrDict: An immutable version of `AttrDict`.
- MongoDict: A dict-like object whose values are nested dicts
could be accessed as attributes.
"""

from __future__ import annotations

import collections
import warnings
from abc import ABC
from typing import TYPE_CHECKING

if TYPE_CHECKING:
from typing import Any, Iterable

from typing_extensions import Self


def tree() -> collections.defaultdict:
"""
Expand All @@ -20,126 +30,202 @@ def tree() -> collections.defaultdict:

Usage:
x = tree()
x['a']['b']['c'] = 1
x["a"]["b"]["c"] = 1

Returns:
A tree.
"""
return collections.defaultdict(tree)


class frozendict(dict):
class ControlledDict(collections.UserDict, ABC):
"""
A dictionary that does not permit changes. The naming
violates PEP8 to be consistent with standard Python's "frozenset" naming.
A base dictionary class with configurable mutability.

Attributes:
_allow_add (bool): Whether new keys can be added.
_allow_del (bool): Whether existing keys can be deleted.
_allow_update (bool): Whether existing keys can be updated.

Configurable Operations:
This class allows controlling the following dict operations (refer to
https://docs.python.org/3.13/library/stdtypes.html#mapping-types-dict for details):

- Adding or updating items:
- setter method: `__setitem__`
- `setdefault`
- `update`

- Deleting items:
- `del dict[key]`
- `pop(key)`
- `popitem`
- `clear()`
"""

def __init__(self, *args, **kwargs) -> None:
"""
Args:
args: Passthrough arguments for standard dict.
kwargs: Passthrough keyword arguments for standard dict.
"""
dict.__init__(self, *args, **kwargs)
_allow_add: bool = True
_allow_del: bool = True
_allow_update: bool = True

def __setitem__(self, key: Any, val: Any) -> None:
raise KeyError(f"Cannot overwrite existing key: {str(key)}")
def __init__(self, *args, **kwargs) -> None:
"""Temporarily allow add during initialization."""
original_allow_add = self._allow_add

def update(self, *args, **kwargs) -> None:
"""
Args:
args: Passthrough arguments for standard dict.
kwargs: Passthrough keyword arguments for standard dict.
"""
raise KeyError(f"Cannot update a {self.__class__.__name__}")
try:
self._allow_add = True
super().__init__(*args, **kwargs)
finally:
self._allow_add = original_allow_add

# Override add/update operations
def __setitem__(self, key, value) -> None:
"""Forbid adding or updating keys based on _allow_add and _allow_update."""
if key not in self.data and not self._allow_add:
raise TypeError(f"Cannot add new key {key!r}, because add is disabled.")
elif key in self.data and not self._allow_update:
raise TypeError(f"Cannot update key {key!r}, because update is disabled.")

class Namespace(dict):
Copy link
Contributor Author

@DanielYang59 DanielYang59 Nov 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, Namespace is a dict that only allows adding new items but rejects updating existing items, however I don't really understand why it's called Namespace?

class Namespace(dict):
"""A dictionary that does not permit to redefine its keys."""


The original implementation also allows "delete" item and current implementation would NOT allow such operation:

  • To agree with the docstring that "does not permit to redefine its keys" (apparently delete would alter existing keys)
  • Prevent "hacky" way to change values by "delete and add back" (though the code is designed to protect against such attempt to modify its value, as the mutability switch is not truly "private" in Python)

"""A dictionary that does not permit to redefine its keys."""
super().__setitem__(key, value)

def __init__(self, *args, **kwargs) -> None:
"""
Args:
args: Passthrough arguments for standard dict.
kwargs: Passthrough keyword arguments for standard dict.
def update(self, *args, **kwargs) -> None:
"""Forbid adding or updating keys based on _allow_add and _allow_update."""
for key in dict(*args, **kwargs):
if key not in self.data and not self._allow_add:
raise TypeError(
f"Cannot add new key {key!r} using update, because add is disabled."
)
elif key in self.data and not self._allow_update:
raise TypeError(
f"Cannot update key {key!r} using update, because update is disabled."
)

super().update(*args, **kwargs)

def setdefault(self, key, default=None) -> None:
"""Forbid adding or updating keys based on _allow_add and _allow_update.

Note: if not _allow_update, this method would NOT check whether the
new default value is the same as current value for efficiency.
"""
self.update(*args, **kwargs)
if key not in self.data:
if not self._allow_add:
raise TypeError(
f"Cannot add new key using setdefault: {key!r}, because add is disabled."
)
elif not self._allow_update:
raise TypeError(
f"Cannot update key using setdefault: {key!r}, because update is disabled."
)

return super().setdefault(key, default)
DanielYang59 marked this conversation as resolved.
Show resolved Hide resolved

# Override delete operations
def __delitem__(self, key) -> None:
"""Forbid deleting keys when self._allow_del is False."""
if not self._allow_del:
raise TypeError(f"Cannot delete key {key!r}, because delete is disabled.")
super().__delitem__(key)

def pop(self, key, *args):
"""Forbid popping keys when self._allow_del is False."""
if not self._allow_del:
raise TypeError(f"Cannot pop key {key!r}, because delete is disabled.")
return super().pop(key, *args)

def popitem(self):
"""Forbid popping the last item when self._allow_del is False."""
if not self._allow_del:
raise TypeError("Cannot pop item, because delete is disabled.")
return super().popitem()

def clear(self) -> None:
"""Forbid clearing the dictionary when self._allow_del is False."""
if not self._allow_del:
raise TypeError("Cannot clear dictionary, because delete is disabled.")
super().clear()


class frozendict(ControlledDict):
"""
A dictionary that does not permit changes. The naming
violates PEP 8 to be consistent with the built-in `frozenset` naming.
"""

def __setitem__(self, key: Any, val: Any) -> None:
if key in self:
raise KeyError(f"Cannot overwrite existent key: {key!s}")
_allow_add: bool = False
_allow_del: bool = False
_allow_update: bool = False

dict.__setitem__(self, key, val)

def update(self, *args, **kwargs) -> None:
"""
Args:
args: Passthrough arguments for standard dict.
kwargs: Passthrough keyword arguments for standard dict.
"""
for k, v in dict(*args, **kwargs).items():
self[k] = v
class Namespace(ControlledDict): # TODO: this name is a bit confusing, deprecate it?
"""A dictionary that does not permit update/delete its values (but allows add)."""

_allow_add: bool = True
_allow_del: bool = False
_allow_update: bool = False


class AttrDict(dict):
"""
Allows to access dict keys as obj.foo in addition
to the traditional way obj['foo']"
Allow accessing values as `dct.key` in addition to the traditional way `dct["key"]`.

Examples:
>>> d = AttrDict(foo=1, bar=2)
>>> assert d["foo"] == d.foo
>>> d.bar = "hello"
>>> assert d.bar == "hello"
>>> dct = AttrDict(foo=1, bar=2)
>>> assert dct["foo"] is dct.foo
>>> dct.bar = "hello"

Warnings:
When shadowing dict methods, e.g.:
>>> dct = AttrDict(update="value")
>>> dct.update() # TypeError (the `update` method is overwritten)

References:
https://stackoverflow.com/a/14620633/24021108
"""

def __init__(self, *args, **kwargs) -> None:
"""
Args:
args: Passthrough arguments for standard dict.
kwargs: Passthrough keyword arguments for standard dict.
"""
super().__init__(*args, **kwargs)
super(AttrDict, self).__init__(*args, **kwargs)
self.__dict__ = self

def copy(self) -> Self:
"""
Returns:
Copy of AttrDict
"""
newd = super().copy()
return self.__class__(**newd)
def __setitem__(self, key, value) -> None:
"""Check if the key shadows dict method."""
if key in dir(dict):
warnings.warn(
f"'{key=}' shadows dict method. This may lead to unexpected behavior.",
UserWarning,
stacklevel=2,
)
super().__setitem__(key, value)


class FrozenAttrDict(frozendict):
"""
A dictionary that:
* does not permit changes.
* Allows to access dict keys as obj.foo in addition
to the traditional way obj['foo']
- Does not permit changes (add/update/delete).
- Allows accessing values as `dct.key` in addition to
the traditional way `dct["key"]`.
"""

def __init__(self, *args, **kwargs) -> None:
"""
Args:
args: Passthrough arguments for standard dict.
kwargs: Passthrough keyword arguments for standard dict.
"""
"""Allow add during init, as __setattr__ is called unlike frozendict."""
self._allow_add = True
super().__init__(*args, **kwargs)
self._allow_add = False

def __getattribute__(self, name: str) -> Any:
try:
return super().__getattribute__(name)
except AttributeError:
try:
return self[name]
except KeyError as exc:
raise AttributeError(str(exc))
return self[name]

def __setattr__(self, name: str, value: Any) -> None:
DanielYang59 marked this conversation as resolved.
Show resolved Hide resolved
raise KeyError(
f"You cannot modify attribute {name} of {self.__class__.__name__}"
)
if not self._allow_add and name != "_allow_add":
raise TypeError(
f"{self.__class__.__name__} does not support item assignment."
)
super().__setattr__(name, value)

def __delattr__(self, name: str) -> None:
raise TypeError(f"{self.__class__.__name__} does not support item deletion.")


class MongoDict:
Expand All @@ -151,11 +237,11 @@ class MongoDict:
a nested dict interactively (e.g. documents extracted from a MongoDB
database).

>>> m = MongoDict({'a': {'b': 1}, 'x': 2})
>>> assert m.a.b == 1 and m.x == 2
>>> assert "a" in m and "b" in m.a
>>> m["a"]
{'b': 1}
>>> m_dct = MongoDict({"a": {"b": 1}, "x": 2})
>>> assert m_dct.a.b == 1 and m_dct.x == 2
>>> assert "a" in m_dct and "b" in m_dct.a
>>> m_dct["a"]
{"b": 1}

Notes:
Cannot inherit from ABC collections.Mapping because otherwise
Expand Down Expand Up @@ -186,7 +272,6 @@ def __getattribute__(self, name: str) -> Any:
try:
return super().__getattribute__(name)
except AttributeError:
# raise
try:
a = self._mongo_dict_[name]
if isinstance(a, collections.abc.Mapping):
Expand Down Expand Up @@ -214,26 +299,25 @@ def __dir__(self) -> list:

def dict2namedtuple(*args, **kwargs) -> tuple:
"""
Helper function to create a class `namedtuple` from a dictionary.
Helper function to create a `namedtuple` from a dictionary.

Examples:
>>> t = dict2namedtuple(foo=1, bar="hello")
>>> assert t.foo == 1 and t.bar == "hello"
>>> tpl = dict2namedtuple(foo=1, bar="hello")
>>> assert tpl.foo == 1 and tpl.bar == "hello"

>>> t = dict2namedtuple([("foo", 1), ("bar", "hello")])
>>> assert t[0] == t.foo and t[1] == t.bar
>>> tpl = dict2namedtuple([("foo", 1), ("bar", "hello")])
>>> assert tpl[0] is tpl.foo and t[1] is tpl.bar
Copy link
Contributor Author

@DanielYang59 DanielYang59 Nov 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we should check identity (is) instead of value equality (==), as both should point to the same object.

DanielYang59 marked this conversation as resolved.
Show resolved Hide resolved

Warnings:
- The order of the items in the namedtuple is not deterministic if
kwargs are used.
namedtuples, however, should always be accessed by attribute hence
this behaviour should not represent a serious problem.
`kwargs` are used. namedtuples, however, should always be accessed
by attribute hence this behaviour should not be a serious problem.

- Don't use this function in code in which memory and performance are
crucial since a dict is needed to instantiate the tuple!
- Don't use this function in code where memory and performance are
crucial, since a dict is needed to instantiate the tuple!
"""
d = collections.OrderedDict(*args)
d.update(**kwargs)
dct = collections.OrderedDict(*args)
dct.update(**kwargs)
return collections.namedtuple(
typename="dict2namedtuple", field_names=list(d.keys())
)(**d)
typename="dict2namedtuple", field_names=list(dct.keys())
)(**dct)
7 changes: 5 additions & 2 deletions src/monty/design_patterns.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,9 +7,12 @@
import inspect
import os
from functools import wraps
from typing import Any, Dict, Hashable, Tuple, TypeVar
from typing import TYPE_CHECKING, TypeVar
from weakref import WeakValueDictionary

if TYPE_CHECKING:
from typing import Any


def singleton(cls):
"""
Expand Down Expand Up @@ -95,7 +98,7 @@ def new_init(self: Any, *args: Any, **kwargs: Any) -> None:
orig_init(self, *args, **kwargs)
self._initialized = True

def reduce(self: Any) -> Tuple[type, Tuple, Dict[str, Any]]:
def reduce(self: Any) -> tuple[type, tuple, dict[str, Any]]:
for key, value in cache.items():
if value is self:
cls, args = key
Expand Down
Loading