Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add __contains__? #108

Open
SimonHeybrock opened this issue Jan 23, 2024 · 12 comments
Open

Add __contains__? #108

SimonHeybrock opened this issue Jan 23, 2024 · 12 comments
Assignees
Labels
enhancement New feature or request

Comments

@SimonHeybrock
Copy link
Member

SimonHeybrock commented Jan 23, 2024

I could be useful to be able to check if a pipeline can compute a certain product, e.g.,

assert WavelengthBands not in pipeline

We have to be careful though to keep a clear class interface. Would this make sense conceptually, in the context of what __setitem__ and __getitem__ do (or would do)?

@jl-wynen
Copy link
Member

We could add a named method instead of overloading __contains__ to avoid conceptual issues.

@SimonHeybrock
Copy link
Member Author

I think we should add __contains__, returning whether a provider exists, ignoring whether its dependencies are fulfilled.

We can consider adding a named method for computability, but we should probably postpone that until we have a clearer picture about if and how we may use a graph structure inside Pipeline.

@jl-wynen
Copy link
Member

jl-wynen commented Feb 9, 2024

I don't understand. This sounds like you want to go for the option that may have conceptual issues in favour of the one where that is less of a problem with the justification that we're unsure about the interface. This seems backwards. Can yo uexplain?

@SimonHeybrock
Copy link
Member Author

No, I am suggesting to not implement the one that has conceptual issues. Instead implement a plain companion to __setitem__ (and insert) and __getitem__ that essentially just returns tp in self._providers (modulo complications for generic providers).

@jl-wynen
Copy link
Member

How does this differ from your original post? It is a check for whether the pipeline contains a provider that can produce the given type. This is how I understood your original idea.

@SimonHeybrock
Copy link
Member Author

The original suggested to return whether the pipeline can compute the result. This implies a transitive check whether all dependencies of the provider are fulfilled.

@jl-wynen
Copy link
Member

I see. I'm definitely against supporting this, I never even considered that you could mean this.

But the conceptual issue I was talking about is that, with your suggestion, Pipeline is treated either as a container of keys or as a mapping from keys to providers. This is not quite how the rest of it works. __setitem__ only works with parameters, i.e., a subset of the mapping. And the other methods, including insert, get, build, compute, the mapping is an implementation detail.
So I'm leaning towards not using a __contains__ method here.

@SimonHeybrock
Copy link
Member Author

So you are suggesting to also not implement #98 (__getitem__) as suggested?

Pipeline is treated either as a container of keys or as a mapping from keys to providers. This is not quite how the rest of it works. __setitem__ only works with parameters, i.e., a subset of the mapping.

I think we could easily extend __setitem__ to also work with other providers. insert exists mainly to avoid having to specify types twice, but otherwise it does the same.

@jl-wynen
Copy link
Member

I think we could easily extend setitem to also work with other providers. insert exists mainly to avoid having to specify types twice, but otherwise it does the same.

We could. But this is a bit tricky. How does __setitem__ know whether it is given a provider or a parameter? The user might want to insert a callable as a parameter (#122). You'd have to compare the given key with the return annotation of the callable. And even that could, in principle, be ambiguous.

So you are suggesting to also not implement #98 (getitem) as suggested?

Actually, yes. That is, unless we want Pipeline to be a mapping. My intuition is to say that it shouldn't be. But maybe I'm missing some arguments.

@SimonHeybrock
Copy link
Member Author

I have always thought of it like a mapping from named concept -> how is it defined/computed.

Parameter tables are the odd one out, but otherwise Pipeline is pretty much that, internally?

@jl-wynen
Copy link
Member

Yes, internally. My concern is with whether we want to enshrine that in the interface.

@SimonHeybrock
Copy link
Member Author

With #172 one can simply use something like MyType in pipeline.networkx_graph. Unless we want to enhance the __contains__ method with some extra features, such as working with mapped-nodes, there may be little left to do here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants