Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

introduce Proxy functionality #52

Closed
wants to merge 1 commit into from
Closed

introduce Proxy functionality #52

wants to merge 1 commit into from

Conversation

ryanfkeepers
Copy link
Contributor

Ads a new feature to clues: proxies. A proxy is a way to ensure clues additions are not lost in cases where downstream or adhoc funcs are unable to retrieve those values or pass them back through a clues.Err. By adding a proxy, a higher level instance also receives downstream additions.

@ryanfkeepers ryanfkeepers added the enhancement New feature or request label Jul 16, 2024
@ryanfkeepers ryanfkeepers self-assigned this Jul 16, 2024
Base automatically changed from update-err-trace to main July 16, 2024 23:52
Ads a new feature to clues: proxies.  A proxy is a way to ensure clues
additions are not lost in cases where downstream or adhoc funcs are
unable to retrieve those values or pass them back through a clues.Err.
By adding a proxy, a higher level instance also receives downstream
additions.
Comment on lines +253 to +257
nc := nodeFromCtx(ctx, defaultNamespace)
nn := nc.addValues(nil)
nn = nn.addProxy(proxyID, isolateProxyValues)

return setDefaultNodeInCtx(ctx, nn)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: seems like it would be simpler to just call AddProxyTo with the default namespace

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps. It's just a standard pattern. And, to be honest, I'm not sure if there's any value on keeping the namespaces around. Afaik they're not used. And I'm now no longer sure what their utility would be.

id: makeNodeID(),
values: m,
},
data: &dataNode{values: m},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this and the call below now going to have an empty ID in the node?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes they will, which is expected.

// generate a proxy
proxy := &dataNode{}

// proxies maintain their own tree, so that we can walk the.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: minor changes for readability

Suggested change
// proxies maintain their own tree, so that we can walk the.
// Proxies maintain their own tree, so that we can walk the

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heh, bad habits.

// If any descendant of the primary dataNode attaches another proxy, that new
// proxy becomes a descenant of the currently existing proxy. This way, when we
// add a value to a proxy, we are able to walk its ascendancy tree to add the same
// valus to all ancestory proxies, thereby guaranteeing that every aded proxy gets
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// valus to all ancestory proxies, thereby guaranteeing that every aded proxy gets
// values to all ancestry proxies, thereby guaranteeing that every added proxy gets

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, three in one sentence? Even an extra vowel? That's a new record!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sure I can beat you on this in due time

@@ -224,3 +224,63 @@ func AddLabelCounterTo(

return setNodeInCtx(ctx, namespace, nn)
}

// ---------------------------------------------------------------------------
// comments
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could you please fix this copy/paste issue?

Suggested change
// comments
// proxies

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

d'oh, thanks.

"1": 1,
"2": 2,
"p1_2": 2,
"p1_3": 3,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could turn into a point of confusion that makes it difficult to track how proxied values will appear. For instance, this value was added via proxy p2, but shows up with the prefix p1. So we know longer know that it came from p2 nor is it apparent how it got to this context.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I think the test variables might be making this confusing. We're checking ctx2, which does not contain a reference to proxy2. So when we extract values from the context, it's all proxy1, because proxy2 doesn't exist in the extraction.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, but the call chain to actually get data into this proxy is
ctx3 -> pxy2 -> ctx2 -> pxy

The only reason these values exist in pxy and ctx2 is because they passed through pxy2. However, there's no information saying that they passed through pxy2 (i.e. in the form of a prefix representing a namespace) because all prefixes are local to the current proxy in the node. Therefore, there's no way to track which/how many proxies a value may have gone through, making it difficult to determine where the value may have been added (i.e. the data could have been added in any of the functions in the transitive closure of functions called from the function reading data out of the proxy and there's no additional data in the proxy to help narrow that down)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

making it difficult to determine where the value may have been added

I suppose that has never been a concern here. The pitch is that "the proxy gathers all data downstream of its addition". There is no condition preventing additions of data following the addition of another proxy (more downsides with that choice, relatively). Caring about where, downstream, the proxy gained a value would be a bit counter to its purpose.

Now, if your mind is around what happens within larger scale proxy trees, and the possibility of sibling value clobbering... yeah, I can see that as a problem. But it goes back to the (difficult to enforce) misuse of the design.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caring about where, downstream, the proxy gained a value would be a bit counter to its purpose.

mmm I'm not sure if it's caring as much as the desire to be able to use where downstream it was added as a way to filter the search for the code in question. Having proxied values can help with log messages that are output higher up the stack, but at some point we may need to examine the code that added the proxied values and if there's no way to tell them apart then the search space can be large

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair points

// unable to later retrieve the ctx or to bind the ctx values into an error.
// This case can manifest when passing ad-hoc functions or middleware to
// callers who limit options to control for error returns.
func AddProxyTo(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

placing comment here because I dislike GitHub's lack of threading for comments on the review itself...

I think in general the implementation looks good, though I'm on the fence about the feature. My hesitancy comes from two main things:

  1. there's no easy way to describe how proxied values will appear in the output (as in what prefixes they'll get, how overrides work with multiple children, etc)
  2. there's no way to remove proxied data from a node

I think that this could put us in a position where the logs we need to read through contain too much information, and thus become less useful. We've gotten good at adding a lot of data to the context, but that's going to come back to hurt us here because we have no way to control what data is proxied, and to where. In essence, this is hoping that called functions know exactly what the caller is doing and when it's ok to add extra info and when it's not. Without that info, it will become easy to just assume it's safe to proxy things and end up overwhelmed by the amount of data we need to dig through. We may also start to clobber data we actually want to keep because a called function proxied something that just happened to have the same key as something in the parent (e.x. id is pretty common to use for keys)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because I dislike GitHub's lack of threading for comments on the review itself

Yo, tell me about it. Why can't we just have threads on the main review? Makes no sense.

there's no way to remove proxied data from a node

permanently? Or as in, you'd like to be able to fetch the values without including the proxies.

In essence, this is hoping that called functions know exactly what the caller is doing and when it's ok to add extra info and when it's not.

Ah, the goal was to make that concern not appear. Is there some inherent fuzziness to the data addition that isn't present in other ctx additions? Yeah, unfortunately there is. But this is also designed for an extremely weird corner case situation. On which note...

it will become easy to just assume it's safe to proxy things and end up overwhelmed by the amount of data we need to dig through

I am, honesty, riding a bit on the expectation that Proxies are almost never useful, and when they are useful it's for a very small and encapsulated set of data. I do share the concern that this can blow out metadata in unwanted ways, especially if used haphazardly. But there's so little reason to use it, that it may balance out in the end? Fingers crossed?

We may also start to clobber data we actually want to keep because a called function proxied something that just happened to have the same key as something in the parent (e.x. id is pretty common to use for keys)

That was why proxies get namespaces by default. In fact, the option to un-namespace a proxy was an offbeat afterthought. Maybe that's unnecessary and a footgun more than it is helpful.

Your response is a bit balanced between two worries: too much data, and the accidental data clobbering. I think the latter is relatively a non-issue due to proxy value isolation (I think I'll do away with non-isolated variations). The former is, well, like stated, based on minimal usage as a strategy.

Here's a pitch, what if this was more of a producer:consumer pairing. I could introduce a func like clues.AddToProxy(...) that adds the value to both the ctx and (if one exists) the proxy handed to the context. It feels overbearing. But also feels like it could solve your concerns. Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's no way to remove proxied data from a node

permanently? Or as in, you'd like to be able to fetch the values without including the proxies.

yea, permanently. For example, if you add a proxy to something so web requests can return enough info in logs but the context the proxy was added to is also used for a whole bunch of other things after making the web requests there's no way to remove the proxied info for the "whole bunch of other things" after the part where you actually wanted the proxy. I suppose you could stop using the context that has the proxy at some point, but then the question becomes "why is there one context for half of this function and a different context for the other half?"

it will become easy to just assume it's safe to proxy things and end up overwhelmed by the amount of data we need to dig through

I am, honesty, riding a bit on the expectation that Proxies are almost never useful, and when they are useful it's for a very small and encapsulated set of data. I do share the concern that this can blow out metadata in unwanted ways, especially if used haphazardly. But there's so little reason to use it, that it may balance out in the end? Fingers crossed?

While I can understand the hope, it may not play out that way. This is something that won't have a correctness or (hopefully) perf impact, so it's easy to misuse. The fact that it's difficult to explain the full behavior of, not intended to be used much, and is difficult to track through the codebase (i.e. called functions must know their parents added a proxy and must know to be careful about how many clues they add) makes it likely it will be misused

That was why proxies get namespaces by default. In fact, the option to un-namespace a proxy was an offbeat afterthought. Maybe that's unnecessary and a footgun more than it is helpful.

It's still possible for two proxied values to clobber each other, even with the namespacing

I could introduce a func like clues.AddToProxy(...) that adds the value to both the ctx and (if one exists) the proxy handed to the context

This could help alleviate some of the concerns, but there's still the chance of clobbering proxied values if the same key is used. For example, a call chain of 3 functions, all of which add proxied values with the same key

Copy link
Contributor Author

@ryanfkeepers ryanfkeepers Jul 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could help alleviate some of the concerns, but there's still the chance of clobbering proxied values if the same key is used. For example, a call chain of 3 functions, all of which add proxied values with the same key

I think I'm going to refactor with this design in mind. If we come back with needing a broad scale proxy bucket, we can come back to this design again.

As for clobbering: last-value-wins is part and parcel of the package. Not much way around it, without having a system of user-facing checks and balances, which is definitely too heavy. In the case of proxies, I suppose I could ask callers to key their proxy on both the initialization and the data addition. That would provide strong guarantees about isolation. Jives well with producer-consumer relationships, too.

yea, permanently

I think this will be a non-issue in the v2 of the design.

While I can understand the hope, it may not play out that way

c'est la vie. better to protect users agains themselves, in these cases.

@ryanfkeepers
Copy link
Contributor Author

Closing due to concerns about scope and collision complexities. To be replaced with a more directed version with similar support.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants