Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't let deletions leak from OverrideEnvironment #4574

Merged
merged 3 commits into from
Sep 16, 2024

Conversation

mwichmann
Copy link
Collaborator

@mwichmann mwichmann commented Jul 15, 2024

EDITED

This PR changes two things about OverrideEnvirionment:

  1. The existing __del__ method removed its argument from the override dict, and from the construction variable dict in the underlying environment ("subject"). There's "method to the madness": if the key is subsequently looked up, it would otherwise be supplied from the subject, leading to the weird outcome that you've deleted a variable, yet that variable is still in the environment you're looking at (which you don't know is a "special" OverrideEnvironment). Deleting from the subject makes that not happen, but has the side effect the variable is really deleted from the subject, which could lead to surprise elsewhere. The change is to track deletes from an OE, and use that information to make sure you don't supply a value from the subject in the OE context, but without actually modifying the subject.

  2. Many environment methods directly access/modify self._dict, the dictionary of construction variables. Environments have special methods that make them "work like a dict", by accessing _dict, but it's slightly faster to access __dict directly. In the existing code, if using an OverrideEnvironment instance, such a method will be supplied from the subject environment via proxying (__getattr) - as intended. But now self is an OE, which doesn't have a _dict attribute, so that's revectored through __getattr__ again, which supplies the attribute from the subject environment. Thus, these methods will end up taking the value from, or modifying the value in, the subject environment's _dict, even if there was already an entry in the override dict for it which should be used. The change makes OverrideEnvironment pretend it has a _dict, such that accesses go through OE's existing __getitem__ and __setitem__ methods and thus consider the override dict first.

Also in the rpm packaging tool: call Override factory method rather than directly instantiating OverrideEnvironment ("use best practices")

Fixes #4563

Contributor Checklist:

  • I have created a new test or updated the unit tests to cover the new/changed functionality.
  • I have updated CHANGES.txt (and read the README.rst)
  • I have updated the appropriate documentation

@bdbaddog
Copy link
Contributor

Would it be possible to prevent this by only making changes in OverrideEnvironment()?
That way there'd be zero performance hit for non OverrideEnvironment() usage, and potentially none for it's parent Environment Objects envvar access?

@mwichmann
Copy link
Collaborator Author

Not sure I'm seeing an issue? The track-deletes behavior is only in OverrideEnvironment.

@bdbaddog
Copy link
Contributor

Not sure I'm seeing an issue? The track-deletes behavior is only in OverrideEnvironment.

Anywhere you've replaced self._dict with self['xyz]. Is a change to regular environment and potentially a performance impact for non OverrideEnvironment()

@mwichmann
Copy link
Collaborator Author

mwichmann commented Jul 18, 2024

I did most of those as a separate commit so it's easy to roll back. But... for self['xyz'], if it's a regular env, then it uses SubstitutionEnvironment's __getitem__ and __setitem__, which are simple, like:

    def __getitem__(self, key): 
        return self._dict[key]

and if it's an OverrideEnvironment it uses the one defined there that handles the multiple sources (and now deletions). So it doesn't seem to be adding anything beyond a very simple function call.

@bdbaddog
Copy link
Contributor

I did most of those as a separate commit so it's easy to roll back. But... for self['xyz'], if it's a regular env, then it uses SubstitutionEnvironment's __getitem__ and __setitem__, which are simple, like:

    def __getitem__(self, key): 
        return self._dict[key]

and if it's an OverrideEnvironment it uses the one defined there that handles the multiple sources (and now deletions). So it doesn't seem to be adding anything beyond a very simple function call.

I'm thinking that any functions which were falling through to the Environment() class which should be checking the overrides should be implemented in Override() and not use the parent's version so it doesn't impact that class for the benefit of it's child class?

@mwichmann
Copy link
Collaborator Author

there's potentially a lot (consider all the appends and path fiddlers and so on), it pretty much defeats the proxy idea. But... once it turned out I had to "fake" OverrideEnvironment having a _dict anyway, the old way will work, though it violates the principle that a method should work whether it runs in the context of the class it's defined in, or in the context of some derived class (or proxy) - and if it doesn't, the derived/proxy needs to implement its own.

@mwichmann
Copy link
Collaborator Author

Updated - please glance at the edited summary at the top, maybe more clear now. Had to force-push due to branch merging done since the original push getting things out of skew.

@bdbaddog
Copy link
Contributor

Updated - please glance at the edited summary at the top, maybe more clear now. Had to force-push due to branch merging done since the original push getting things out of skew.

Reviewed. This approach sounds great.
I'll take a pass through the code.

I'm try to be extra cautious with this area of code as changes can have significant and (sometimes) unexpected impact.

Previously, deleting from an OverrideEnvironment removed not just the
override item but also the same item in the subject (base) environment.
This prevents supplying the value from the subject, as that would be
contrary to the expectation after "I deleted this variable".  However,
this lets the override modify its subject, a form of leakage we don't
want. Now a deleted item has its key recorded to prevent refilling
later. Direct assignment will still set the item back in the override.

rpm packaging tool: call Override factory method rather than directly
instantiating OverrideEnvironment ("best practices")

Signed-off-by: Mats Wichmann <[email protected]>
@mwichmann
Copy link
Collaborator Author

Squahsed and rebased on 4.8.1 as a base.

def __getattr__(self, name):
# Proxied environment methods don't know they could be called with
# us as 'self' and may access the _data consvar dict directly.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about instead of us as 'self', an OverrideEnvironment as 'self' ? Or am I misunderstanding this comment?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It means - in a function that happens to be written as a method of SubstitutionEnvironment or Base, you can't make assumptions about what kind of object is passed as self, it could easily be an OverrideEnvironment, which isn't even in the class hierarchy. Yes, could spell it out. I'll also have to eventually revise the CHANGES entry, since some of what it says got undone, I just noticed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated both.

@bdbaddog bdbaddog merged commit 5f0e934 into SCons:master Sep 16, 2024
7 of 8 checks passed
@mwichmann mwichmann added this to the NextRelease milestone Sep 16, 2024
@mwichmann mwichmann deleted the env/override-del branch September 16, 2024 14:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Complete
Development

Successfully merging this pull request may close these issues.

OverrideEnvironment problems with env._dict
2 participants