Design of super() encourages tight coupling between implementation code and class hierarchy #518

Open
mjskay opened this issue Dec 29, 2024 · 1 comment

Comments

@mjskay

mjskay commented Dec 29, 2024

I'm expanding this issue from this comment since I think it is a separate issue: #493 (comment)

Having used S7 to build out an API making extensive use of multiple dispatch, I'd like to offer some thoughts and a suggestion for a minor change in the API that I think would make it easier to use. I think the suggested change keeps the explicitness that motivated super() in the first place while improving the maintainability of code both within and (importantly) across packages.

The problem: super() has two meanings in use and one is tightly coupled to the class hierarchy

I think super() is misnamed, and its suggested method of use creates a tight coupling between implementation code and the class hierarchy.

Consider a class hierarchy with base class BaseClass, child Child, and a binary operator op:

BaseClass = new_class("BaseClass")
Child = new_class("Child", parent = BaseClass)

op = new_generic("op", c("x", "y"))

method(op, list(BaseClass, BaseClass)) = \(x, y) {
  # do stuff on BaseClass
}

Suppose I also implement op for a Child paired with any BaseClass. (Ignore commutativity here; in practice I would also have an implementation for (BaseClass, Child).) It might look something like:

method(op, list(Child, BaseClass)) = \(x, y) {
  # do stuff on Child, then...
  op(super(x, BaseClass), super(y, BaseClass))
}

Semantically, within the context of this method implementation, the line op(super(x, BaseClass), super(y, BaseClass)) has two different modes of dispatch:

  1. super(x, BaseClass) for x, a Child (or subclass of Child): this is used to mean "dispatch one class up in the hierarchy" or "dispatch to the parent of Child".
  2. super(y, BaseClass) for y, a BaseClass (or a subclass of BaseClass): this is used to mean "dispatch on the same class again" or "dispatch on BaseClass".

The second meaning works well, since as the implementor of this method I know that BaseClass is precisely the class I want to target for dispatch of y.
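For concreteness, here is a minimal runnable sketch of the example above. The string return values are placeholders I added to make the dispatch path visible; they are not part of the original example.

```r
library(S7)

BaseClass <- new_class("BaseClass")
Child <- new_class("Child", parent = BaseClass)

op <- new_generic("op", c("x", "y"))

# Placeholder body: returns a label so the dispatch path can be inspected.
method(op, list(BaseClass, BaseClass)) <- function(x, y) {
  "base-base"
}

method(op, list(Child, BaseClass)) <- function(x, y) {
  # super(x, BaseClass): meaning 1, "go up from Child to its parent".
  # super(y, BaseClass): meaning 2, "dispatch y as exactly BaseClass".
  paste("child-base ->", op(super(x, BaseClass), super(y, BaseClass)))
}

result <- op(Child(), BaseClass())
print(result)
```

Both calls to super() look identical here, which is the point: nothing in the code distinguishes "parent of Child" from "exactly BaseClass".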

However, the first meaning couples the class hierarchy as it currently stands with this implementation of the method op. If the hierarchy of Child changes at some point in the future, as an implementor I must also update every method of Child relying on this hierarchy to dispatch to a parent class. For example, say we later add a class in between BaseClass and Child, changing the parent of Child:

IntermediateClass = new_class("IntermediateClass", parent = BaseClass)
Child = new_class("Child", parent = IntermediateClass)

Now the definition of op should be:

method(op, list(Child, BaseClass)) = \(x, y) {
  # do stuff on Child, then...
  op(super(x, IntermediateClass), super(y, BaseClass))
}
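A sketch of what happens if the method is not updated, assuming IntermediateClass gets its own op method (that method and the string labels are assumptions added for illustration). The stale super(x, BaseClass) raises no error; it just skips the intermediate method.

```r
library(S7)

BaseClass <- new_class("BaseClass")
IntermediateClass <- new_class("IntermediateClass", parent = BaseClass)
Child <- new_class("Child", parent = IntermediateClass)

op <- new_generic("op", c("x", "y"))

method(op, list(BaseClass, BaseClass)) <- function(x, y) "base"

# Assumed for illustration: the new class has its own behaviour.
method(op, list(IntermediateClass, BaseClass)) <- function(x, y) {
  paste("intermediate ->", op(super(x, BaseClass), super(y, BaseClass)))
}

# Stale method: still names BaseClass as the parent of Child.
method(op, list(Child, BaseClass)) <- function(x, y) {
  paste("child ->", op(super(x, BaseClass), super(y, BaseClass)))
}
stale <- op(Child(), BaseClass())
print(stale)    # "child -> base": the IntermediateClass method is skipped

# Updated method: names the new parent explicitly.
# (S7 reports that the existing method is being overwritten.)
method(op, list(Child, BaseClass)) <- function(x, y) {
  paste("child ->", op(super(x, IntermediateClass), super(y, BaseClass)))
}
updated <- op(Child(), BaseClass())
print(updated)  # "child -> intermediate -> base"
```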

This creates maintenance issues within a single package. Worse, if BaseClass and Child are implemented in one package and op in another, the tight coupling now crosses package boundaries, where it is even harder to maintain.

Solution 1: encourage use of @parent with super() for use case 1

One solution would be to implement op as:

method(op, list(Child, BaseClass)) = \(x, y) {
  # do stuff on Child, then...
  op(super(x, Child@parent), super(y, BaseClass))
}

This works, though not on S3 classes since S7_S3_class doesn't have @parent. This solution could be adopted just by updating some documentation and recommended use, and probably also by adding a unified interface for getting a class's parent (which would be helpful to have anyway).
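A minimal sketch of this approach on a three-level hierarchy, assuming an op method exists for the intermediate class (that method and the string labels are illustrative additions). The Child method never names its parent, so it picks up whichever class Child@parent currently is.

```r
library(S7)

BaseClass <- new_class("BaseClass")
IntermediateClass <- new_class("IntermediateClass", parent = BaseClass)
Child <- new_class("Child", parent = IntermediateClass)

op <- new_generic("op", c("x", "y"))

method(op, list(BaseClass, BaseClass)) <- function(x, y) "base"

# Assumed for illustration: the intermediate class has its own method.
method(op, list(IntermediateClass, BaseClass)) <- function(x, y) {
  paste("intermediate ->", op(super(x, BaseClass), super(y, BaseClass)))
}

method(op, list(Child, BaseClass)) <- function(x, y) {
  # Child@parent is looked up when the method runs, so the parent
  # class is never hard-coded in the method body.
  paste("child ->", op(super(x, Child@parent), super(y, BaseClass)))
}

parent_name <- Child@parent@name
result <- op(Child(), BaseClass())
print(parent_name)
print(result)
```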

However, to me this reveals another issue: super() is not well-named. The first operation (super(x, Child@parent)) is much closer to what I would normally think of as a super() operation (dispatch on the superclass), and it feels redundant to write both super and @parent; on the other hand, the second operation (super(y, BaseClass)) is more like "dispatch on exactly this class", so the use of the word super feels incorrect. This motivates my second suggestion.

Solution 2: split super() into two operations

My suggestion is to split super() into two operations, each named for how it is used in context. Something like dispatch_up() or dispatch_parent() for the first operation, which would dispatch to the parent of the class given as its second argument, and something like dispatch() for what is currently called super().

FWIW, I have been using helpers defining dispatch_up() and dispatch() in this way (also addressing dispatch on class unions as described in #493 (comment)), and it has already saved me maintenance work when refactoring a class hierarchy. Had I not done so I would have had to refactor a few dozen methods for a single change in the class hierarchy, but using dispatch_up() I didn't have to touch those methods at all. The code also reads more clearly: I know precisely which of the two uses of super is intended by a call based on what function is used.
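A rough sketch of what such helpers could look like. These are hypothetical thin wrappers around super(), not part of S7 and not the exact helpers described above; in particular they do not handle class unions or S3 classes.

```r
library(S7)

# Hypothetical helper: dispatch on the parent of `cls`.
dispatch_up <- function(x, cls) super(x, cls@parent)

# Hypothetical helper: dispatch on exactly `cls` (what super() does today).
dispatch <- function(x, cls) super(x, cls)

BaseClass <- new_class("BaseClass")
IntermediateClass <- new_class("IntermediateClass", parent = BaseClass)
Child <- new_class("Child", parent = IntermediateClass)

op <- new_generic("op", c("x", "y"))

method(op, list(BaseClass, BaseClass)) <- function(x, y) "base"

method(op, list(IntermediateClass, BaseClass)) <- function(x, y) {
  paste("intermediate ->",
        op(dispatch_up(x, IntermediateClass), dispatch(y, BaseClass)))
}

method(op, list(Child, BaseClass)) <- function(x, y) {
  # Each method names only its own class and the class it targets for y.
  paste("child ->", op(dispatch_up(x, Child), dispatch(y, BaseClass)))
}

result <- op(Child(), BaseClass())
print(result)
```

With this shape, each method refers only to the class it is defined on, so inserting or removing a class in the hierarchy does not require editing the method bodies.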

@lawremi
Collaborator

lawremi commented Jan 5, 2025

Inheritance in general introduces strong coupling. The opposite argument could be made: if references to the parent class are dynamic, incompatible logic could be introduced that the child wants to override. C++ also uses explicit class references when calling inherited methods. While e.g. Java's super keyword is more flexible, any explicit delegation along the hierarchy strengthens coupling and arguably should be avoided. This becomes perhaps even more problematic in the functional paradigm, where multiple arguments are being considered, and any manipulation of dispatch could make the logic even harder to understand. For example, whether realistic or not, the example of calling super() on both arguments above is making a lot of assumptions.

I agree though that the different types of class objects should provide a consistent interface. Getting the name of a class is another example of a useful abstraction.
