-
-
Notifications
You must be signed in to change notification settings - Fork 630
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Propagate unknown
when an explicit flag is set, in the form unknown=...|PROPAGATE
#1634
Propagate unknown
when an explicit flag is set, in the form unknown=...|PROPAGATE
#1634
Conversation
@sloria, could I push for your (or another marshmallow maintainer's) feedback on this approach? The implementation aside, I'm not sure if this is the way to solve this problem. |
Apologies for the delay. Unfortunately I won't be able to give this the focused attention this needs in the next week or so--lots of work and non-work commitments coming up fast. I thought we'd decided against this behavior in past proposals:
And potential solutions requiring no changes to marshmallow:
|
This does work around some of the issues with past proposals, but it illustrates how much API surface area has to be added to get this behavior. It takes significantly less code to use a factory pattern, and it covers more complex use cases. def get_schema(unknown=None):
class D(Schema):
x = fields.Str()
class C(Schema):
x = fields.Str()
class B(Schema):
d = fields.Nested(D(unknown=unknown or EXCLUDE))
c = fields.Nested(C)
class A(Schema):
b1 = fields.Nested(B(unknown=unknown or INCLUDE))
b2 = fields.Nested(B())
return A
A = get_schema(unknown=RAISE)
A().load(...) |
The factory pattern or a custom base schema probably covers all usage. I think the 99% case for users wanting this is this simple: a user wants to set
Not having participated in the past threads, the fact that other PRs for this have been left open led me to believe the door was being intentionally left open to a better approach. Even that aside, users are somewhat consistently surprised by the fact that
I take your comment to also mean "changing the API this much makes things worse / more confusing". Which is something with which I agree. I can work to smooth out the public API by making the values for sample of an or-able UnknownParam classclass UnknownParam:
def __init__(self, value=None, propagate=False):
self.value = value
self.propagate = propagate
def __or__(self, other):
return UnknownParam(
self.value or other.value, self.propagate or other.propagate
)
def __str__(self):
if self.value:
return self.value
if self.propagate:
return "propagate"
return "null"
def __repr__(self):
return "UnknownParam(value={!r}, propagate={!r})".format(
self.value, self.propagate
)
def is_good(self):
return self.value is not None
EXCLUDE = UnknownParam("exclude")
INCLUDE = UnknownParam("include")
RAISE = UnknownParam("raise")
PROPAGATE = UnknownParam(propagate=True) I'm happy to rework this existing PR to use that, if it would stand a chance at being accepted. As far as I'm concerned, that's just sugar -- which I'm happy to work on, since the public API is important! -- but I'd just be doing the same thing internally. I don't want to do that work unless we agree that allowing users to opt-in to propagation like this is desirable in the first place. If no functional change is going to be made, I think something explicit needs to be added somewhere in the docs. I would be inclined to phrase it positively, in terms of how to apply suggested addition to quickstart docs
|
To clarify: I think the number of method arguments added is disproportionate to the benefit it provides. I generally promote providing all the data needed to satisfy special use cases rather than implementing them in the core. A leaner core is better for performance and reduces the complexity of maintaining and contributing to the library.
I think this can be done without modifying the core. If you are open to solving this use case with a utility and think a convenient interface for this functionality would be useful to other users in the community, would you consider publishing it as a package? I think that would be a good addition to the "Fields and Custom Schemas" section of the Ecosystem page of the wiki. |
It's possible to do this in a separate lib, but if it's not going into the core, then the solutions of using a custom base schema or writing a schema builder are much better. I think I misunderstood the situation -- with the existing PRs and issues still open, I thought this was something where I could close out a series of issues. It seems like none of these are actually wanted. Can they all be closed, along with this? If it's useful, I can open a new PR with docs as I suggested. That way, the |
105702c
to
a990005
Compare
unknown
when an explicit flag, propagate_unknown, is setunknown
when an explicit flag is set, in the form unknown=...|PROPAGATE
This adds a new option, `propagate_unknown` which defaults to False. It controls the behavior of `unknown` with respect to nested structures. Anywhere that `unknown` can be set, `propagate_unknown` can be set. That means it can be applied to a schema instance, a load call, schema.Meta, or to fields.Nested . When set, nested deserialize calls will get the same value for `unknown` which their parent call got and they will receive `propagate_unknown=True`. The new flag is completely opt-in and therefore backwards compatible with any current usages of marshmallow. Once you opt in to this behavior on a schema, you don't need to worry about making sure it's set by nested schemas that you use. In the name of simplicity, this sacrifices a bit of flexibility. A schema with `propagate_unknown=True, unknown=...` will override the `unknown` settings on any of its child schemas. Tests cover usage as a schema instantiation arg and as a load arg for some simple data structures.
propagate_unknown will still traverse any series of nested documents, meaning that once you set propagate_unknown=True, it is true for the whole schema structure. However, this introduces tracking for whether or not `unknown` was set explicitly. If `unknown=RAISE` is set because no value was specified, we will set a new flag on the schema, `auto_unknown=True`. propagate_unknown now has the following behavior: - if the nested schema has auto_unknown=False, use the current value for `unknown` in the nested `load` call - if a nested field has its `unknown` attribute set, use that in place of any value sent via `propagate_unknown` Effectively, this means that if you set `unknown` explicitly anywhere in a nested schema structure, it will propagate downwards from that point. Combined with the fact that propagate_unknown=True propagates downwards across all schema barriers, including if `propagate_unknown=False` is set explicitly somewhere, this could be confusing. However, because the idea is for `propagate_unknown=True` to eventually be the only supported behavior for marshmallow, this is acceptable as a limitation. auto_unknown is an attribute of schema opts and of schema instances, with the same kind of inheritance behavior as other fields.
The changelog entry, including credit to prior related work, covers the closed issues and describes how `propagate_unknown` is expected to behave. Inline documentation attempts to strike a balance between clarify and brevity. closes marshmallow-code#1428, marshmallow-code#1429, marshmallow-code#1490, marshmallow-code#1575
Rather than having a new parameter which allows `propagate_unknown` behavior, convert this into an option which can be set on the ``unknown`` parameter itself. Drawing from other interfaces which present flags as or-able values (traditionally bitmasks), express the setting of this option as `x | PROPAGATE` for any supported `x`. This makes it harder to propagate the value of `propagate` separately from the rest of the value of `unknown`. For simplicity, change the behavior here to suit this implementation.
In order to simplify, the value of `unknown` will not be respected in child schemas if `PROPAGATE` is used. This removes any need for the `auto_unknown` tracking, so there's less complexity added to schemas in order to implement this behavior. Adjust tests and changelog to match. Also make minor tweaks to ensure UnknownParam usage is consistent and keep the diff with the dev branch smaller.
a990005
to
499d608
Compare
I don't want to nag about this issue, but I also feel that the first draft of this put it on poor footing by adding a lot more API surface and including an inessential feature in the I've refactored it to add I'm still not strongly attached to making the change, but I think this is the simplest/smallest version of it I can imagine which won't break any normal usages today. |
This comment has been minimized.
This comment has been minimized.
Completely agreed, sounds good! +1 will greatly appreciate having Propagate as an option. |
For the people looking how to solve the issue, this helps me: "user": fields.Nested(
{"id": fields.Int(required=True),
"login": fields.Str(required=True),
"url": fields.Str()
},
required=True,
unknown=EXCLUDE), Adding unknown=EXCLUDE inside the fields.Nested to make sure that the Nested field ignore other data coming from the request. |
I didn't think this is going to be merged, and keeping it open is just noise. Not sure why I didn't close it before. I think it would be best if the other PRs for propagation of |
This is based upon the work in #1490 , but with some significant modifications.
In retrospect, it might have been simpler to base it on #1429 , or just do the work from scratch, but I'm not sure.
closes #1428, #1429, #1490, and #1575
The biggest change from #1490 or #1429 I've made is that the behavior is never implied by certain usages -- it's always explicit. There's no distinction between
unknown
being passed at load time, toNested
, at schema instantiation, or via schema options. Instead, all of these contexts will supportpropagate_unknown=True
, which turns on propagation.This approach might not be all that elegant -- it somewhat bloats the public interface -- but it is guaranteed to be backwards compatible. Because the behavior is opt-in, nobody should have behavior change on them unexpectedly with this update.
Quick note: I wasn't sure if it is desirable to allow
propagate_unknown
in all of these contexts, especiallyNested
. I included it for now, but would be open to removing it.propagate_unknown
as written does not respect settingpropagate_unknown=False
in a nested schema. It's not something I considered particularly important (since the whole point is to "turn the feature on") but I can look at tweaking that if it's likely to be confusing.One important bit of this is that I wanted
propagate_unknown
to understand the difference betweenunknown=RAISE
being set because it's the default, and someone writingfields.Nested(MySchema(unknown=RAISE))
, in which case they're asking for it explicitly. To handle this,__init__
checks before settingunknown
. If no value was passed, in addition to settingunknown=RAISE
, it will set a new value,auto_unknown=True
.During loading, whenever
propagate_unknown=True
andauto_unknown=False
, theunknown
value for the nested schema in question will be respected. Furthermore, that value will start to propagate. The same happens, though without use ofauto_unknown
, whereverfields.Nested.unknown
is set.Reading this comment on #1429, my understanding was that handling these kinds of scenarios was considered to be necessary. With the
auto_unknown
tracking and behavior, it's possible to support cases like this one:In the above, everything should "percolate down" in a relatively intuitive way. Importantly, that top-level value for
unknown=RAISE
doesn't override the use ofunknown=INCLUDE
andunknown=EXCLUDE
in descendant schemas.I've added a few simple tests, but I haven't aimed to be completely comprehensive because there are too many scenarios to think about. It's hard to tell what's important to test this way and what is not. If there's something I ought to be testing which is missing, just say that word and I'm happy to add more test cases.
This is my first time touching
marshmallow
, and I'm changing the interfaces forSchema
,fields.Nested
, and schema opts. So I would not be at all surprised if I've missed something significant -- please just let me know if so.